From: William Cohen <wcohen@redhat.com>
To: "Frank Ch. Eigler" <fche@redhat.com>
Cc: systemtap@sources.redhat.com
Subject: Re: Proposed systemtap access to perfmon hardware
Date: Fri, 17 Mar 2006 20:26:00 -0000 [thread overview]
Message-ID: <441B1B5E.8090401@redhat.com> (raw)
In-Reply-To: <y0mslph7z6w.fsf@ton.toronto.redhat.com>
Frank Ch. Eigler wrote:
> wcohen wrote:
>
>
>>To try to get a feel on how the performance monitoring hardware
>>support would work in SystemTap I wrote some simple examples.
>
>
> Nice work. To flesh out the operational model (and please correct me
> if I'm wrong): the way this stuff would all work is:
>
> - The systemtap translator would be linked with libpfm from perfmon2.
> (libpfm license is friendly.)
The libpfm library is under an MIT license, so it should be
compatible with the systemtap licensing.
> - This library would be used at translation time to map perfmon.* probe
> point specifications to PMC register descriptions (pfmlib_output_param_t).
> (This will require telling the system the exact target cpu type for
> cross-instrumentation.)
Yes, this complicates cross-instrumentation (building the
instrumentation on one system and running it on another), since the
two systems may have different processor architectures. Some
performance monitoring packages such as PAPI provide mappings for a
set of generic event names, which might help in some cases. However,
there are differences in computer architecture that simply do not
translate to the generic models.
> - These descriptions would be emitted into the C code, for actual
> installation during module initialization. For our first cut, since
> there appears to exist no kernel-side management API at the moment,
> the C code would directly manipulate the PMC registers. (This means
> no coexistence for oprofile or other concurrent perfctr probing.
> C'est la vie.)
I would prefer to reuse other software to access the performance
monitoring hardware rather than generate yet another piece of software
that manipulates it. We want 64-bit values, but a number of the
counters are much smaller than that (e.g. 32-bit). On the Pentium 4,
access to the performance counters is complicated, and I would prefer
not to reinvent that code. Directly manipulating the PMC registers
will only work with a global setup; configurations like per-thread
sampling would be unsupported. We also need to translate between the
event name and the event number; the tables in OProfile and perfmon
are getting pretty large to keep all that information and to catch any
inability to map events to a register.
One advantage of generating the C code directly is that it would work
with the existing RHEL4 kernel.
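The 32-bit-counter problem mentioned above is usually solved with a 64-bit software accumulator. This is a sketch of that standard technique (not code from any of the packages discussed): keep a 64-bit total and fold in the unsigned delta since the last hardware read, which handles wraparound for free.

```c
/* Sketch: present a 64-bit value on top of a 32-bit hardware
 * counter.  Unsigned subtraction yields the correct delta even when
 * the raw counter has wrapped past zero, provided it wraps at most
 * once between reads. */
#include <stdint.h>

struct virt_counter {
    uint64_t total;      /* 64-bit value presented to the script */
    uint32_t last_raw;   /* hardware counter value at the last read */
};

/* Fold the current raw 32-bit reading into the 64-bit total and
 * return the updated total. */
uint64_t virt_counter_read(struct virt_counter *vc, uint32_t raw)
{
    vc->total += (uint32_t)(raw - vc->last_raw);
    vc->last_raw = raw;
    return vc->total;
}
```

The constraint in the comment is why the counters must be read (or an overflow interrupt taken) often enough that the hardware value cannot wrap twice between samples.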
> - The "sample" type perfmon probes would map to the same kind of
> dispatch/callback as the current "timer.profile": the probe handler
> should have valid pt_regs available.
Yes, the pt_regs will be available to the sample type probe.
> - The free-running type perfmon probes, probably named
> "perfctr.SPEC.setup" or ".start" or ".begin" would map to a one-time
> initialization that passes a token (PMC counter number?) to the
> handler. Other probe handlers can then query/manipulate the
> free-running counter using that number via the start/stop/query
> functions.
>
> Is that sufficiently detailed to begin an implementation?
Pretty close. The one thing that isn't answered is the division of
labor for the sampling probes between the one-time setup and the
sample handler. I want to have some handle set in a global variable
for the probe, but I do not want to execute that setup every time a
sample is collected. For the free-running probes it is clear how to
handle the samples.
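The setup/handler split could look something like the following. This is a hypothetical structure, not systemtap's actual generated code; allocate_pmc() and its return value stand in for whatever really allocates and programs a counter.

```c
/* Illustration of splitting probe work: a one-time setup stores a
 * counter handle in a global, and the per-sample handler only uses
 * that handle, never repeating the setup.  Names are hypothetical. */
#include <assert.h>

static int pmc_handle = -1;   /* set once at probe registration */
static long sample_count;

/* Stand-in for whatever allocates/programs a hardware counter. */
static int allocate_pmc(void)
{
    return 2;   /* fictitious PMC number */
}

/* Run once, e.g. at module initialization. */
void probe_setup(void)
{
    pmc_handle = allocate_pmc();
}

/* Run on every sample: cheap, and must NOT repeat the setup work. */
void probe_sample(void)
{
    assert(pmc_handle >= 0);   /* setup must already have run */
    sample_count++;
}
```

The open question in the mail is exactly where the line between probe_setup() and probe_sample() should fall, and how the handle is surfaced to the script.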
>>[...] print ("ipc is %d.%d \n", ipc/factor, ipc % factor);
>
>
> (An aside: we should have a more compact notation for this. We won't
> support floating point numbers, but integers can be commonly scaled
> like this. Maybe printf("%.Nf", value), where N implies a
> power-of-ten scaling factor, and printf("%*f", value, scale) for
> general factors.)
Yes, a scaling mechanism would be nice in some cases. The chances of
the IPC being around a value of one were pretty high, so I put in the
scaling to give a better picture of what is going on.
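The scaled-integer printing being discussed can be done without floating point, in the spirit of the proposed printf("%.Nf", ...) notation. A minimal sketch, assuming the value is carried as value * 10^n; zero-padding the fractional part matters so that, say, 105 with n=2 prints as "1.05" rather than "1.5". Negative values are left out of this sketch.

```c
/* Sketch: format a value held as (value * 10^n) as a decimal string
 * using only integer arithmetic. */
#include <stdio.h>

void format_scaled(char *buf, size_t len, long value, int n)
{
    long factor = 1;
    int i;
    for (i = 0; i < n; i++)
        factor *= 10;
    /* %0*ld zero-pads the remainder to exactly n digits */
    snprintf(buf, len, "%ld.%0*ld", value / factor, n, value % factor);
}
```

This is essentially the ipc/factor, ipc % factor idiom from the example script, with the zero-padding bug fixed.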
-Will