From: William Cohen
Date: Mon, 27 Feb 2006 21:06:00 -0000
To: SystemTAP
Subject: Systemtap benchmarking doc draft

Here are some thoughts on benchmarking SystemTap. As Brad Chen would
say, "got any tomatoes?" There are certainly things that need to be
refined in this document; it is definitely a work in progress. Probably
the most useful feedback would be on the metrics. Do they contain
useful information? Should some be dropped? Should others be added?

-Will

[attached file: stap_benchmarking]

Systemtap Benchmarking
Feb 27, 2006

1. OVERVIEW

Instrumentation that collects measurements perturbs the system being
measured, and the question is often how much perturbation is
introduced. Kernel developers in particular want to know the cost of
the instrumentation, so they can judge whether it will add unacceptable
overhead to the system or perturb it so much that the measurements
become inaccurate. This document outlines a suggested set of metrics
for measuring SystemTap's performance, a framework for collecting the
data, and a way of presenting the results.
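As a rough illustration of the kind of perturbation in question (this
is only a sketch, not one of the proposed benchmarks), a user could
gauge probe overhead by timing a repeatable workload with and without a
nearly empty probe active. The probed function (vfs_read), the
workload, and the script name below are arbitrary choices made for the
example.

    # count_reads.stp: a nearly empty probe; its only work is a counter
    # increment, so the slowdown it causes approximates raw probe overhead
    global n
    probe kernel.function("vfs_read") { n++ }
    probe end { printf("vfs_read fired %d times\n", n) }

    $ time make -j2                       # baseline run of the workload
    $ time stap -c "make -j2" count_reads.stp   # workload with the probe
                                                # active; includes stap's
                                                # own startup cost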
2. METRICS

The selected metrics should give insight into the cost and overhead of
common operations in SystemTap and its supporting software. SystemTap
users should be able to gauge how much the instrumentation will affect
the system, and the metrics also allow the SystemTap developers to
monitor SystemTap for performance regressions.

Kernel micro-metrics (kprobe, kretprobe, and djprobes costs):
    insertion
    removal
    minimal probe firing
    colocated probes
    maximum number of probes active
    space requirement per probe

SystemTap language micro-metrics (costs of various operations):
    cost of associative array operations

Kernel instrumentation limits:
    number of probes that can be registered at a time
    number of kretprobes active

SystemTap limits:
    maximum elements in an associative array
    number of actions/steps in a probe

Instrumentation process costs:
    latency from the time stap is started to the time the
      instrumentation is running
    latency to shut down the instrumentation
    profile of where time is spent during the instrumentation process
    size of the instrumentation kernel modules

3. BENCHMARK DATA COLLECTION

For some of the benchmarks, data could be collected from the existing
testsuite scripts. However, some additional benchmarks will be required
to exercise particular aspects of the system. A variety of mechanisms
will be used to collect the benchmark information. In some cases
additional options will need to be passed to the systemtap translator
to report the latency of its different phases.

3.1 USING EXISTING TESTSUITE INFRASTRUCTURE

There are existing testsuites for systemtap and its kernel support.
Currently these are functional tests that determine whether some aspect
of the system is working correctly. However, many of the systemtap
scripts in them are representative of what a user might write. The
testsuite scripts could provide code to pass through the translator to
measure things such as the amount of time required to compile and
install a script. Additional performance tests will need to be written;
it would be useful to fold these into the existing testsuites to expand
the set of tests run and to provide stress testing.

3.2 LATENCY MEASUREMENTS

Latency is one of the most visible metrics for the user of SystemTap:
how long does it take for the various phases of SystemTap to complete
their work and get the instrumentation collecting data? Recently the
SystemTap translator was modified to produce timing information with
the "-v" option, as in the example below:

$ stap -v -p4 testsuite/buildok/thirteen.stp
Pass 1: parsed user script and 10 library script(s) in 180usr/10sys/289real ms.
Pass 2: analyzed script: 1 probe(s), 3 function(s), 0 global(s) in 570usr/30sys/631real ms.
Pass 3: translated to C into "/tmp/stapZrONwM/stap_2173.c" in 190usr/90sys/302real ms.
Pass 4: compiled C into "stap_2173.ko" in 5520usr/840sys/6125real ms.

It will be a simple task to parse this output to obtain the cost of the
various translation phases.

3.3 PROCESSOR UTILIZATION

The latency measurements provide a coarse-grained view of how long each
phase takes. Profiling with OProfile will provide some insight into
whether there are any hot spots in SystemTap or the associated code on
the system. On x86 and x86_64, OProfile can sample most code, including
the kernel trap handlers.

3.4 MICROBENCHMARKS

The microbenchmarks are listed in section 2. These small-scale tests
will need to be added to the testsuite to measure the cost of specific
operations; currently, the testsuite does not include tests that
measure these. A sketch of one such test follows.
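Below is a minimal sketch of what one of these microbenchmarks might
look like, here for the associative-array cost listed in section 2. The
iteration counts, the use of gettimeofday_us() for timing, and the
MAXACTION override are assumptions made for the example rather than a
settled design.

    # array_cost.stp: time a fixed number of associative-array writes.
    # The loop exceeds the default per-probe action limit, so it would
    # be run with something like: stap -DMAXACTION=100000 array_cost.stp
    global a
    probe begin {
        t0 = gettimeofday_us()
        for (i = 0; i < 10000; i++)
            a[i % 100] = i              # repeated associative-array writes
        t1 = gettimeofday_us()
        printf("10000 array writes took %d us\n", t1 - t0)
        exit()
    }

Dividing the elapsed time by the iteration count gives a rough
per-operation cost, and the same pattern could be repeated for reads,
deletes, and the other operations named in section 2.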
4. ARCHIVING THE MEASUREMENTS

There are always going to be cases of "your mileage may vary," where
the measurements do not apply exactly to a particular situation.
However, it would be useful to have this information publicly available
for reference, conceivably on a SystemTap webpage on sourceware.org, so
people can get a feel for the cost on various systems. Something like
the search for SPEC CPU2000 Results [SPEC2006] would be nice, but it
probably would be more work than we can justify at this time. Tracking
performance to identify regressions, as is done for GCC performance
[Novillo2006] or code size (CSiBE) [CSiBE2006], might be more
practical.

REFERENCES

[CSiBE2006] GCC Code-Size Benchmark Environment (CSiBE), Feb 2006.
    http://www.inf.u-szeged.hu/csibe

[Novillo2006] Novillo, Diego, Performance Tracking for GCC, Feb 2006.
    http://people.redhat.com/dnovillo/spec2000/

[SPEC2006] SPEC CPU2000 Results, Feb 2006.
    http://www.spec.org/cpu2000/results/