From: William Cohen
Date: Wed, 22 Mar 2006 18:02:00 -0000
To: Maynard Johnson
CC: SystemTAP
Subject: Re: Proposed systemtap access to perfmon hardware
Message-ID: <442190D3.7060204@redhat.com>
References: <44183FCF.6010809@redhat.com> <4420C5A2.9060702@us.ibm.com>
In-Reply-To: <4420C5A2.9060702@us.ibm.com>

Maynard Johnson wrote:
> William Cohen wrote:
>
>> [snip]
>>
>> perfmon_create_context:long ()
>>
>> The perfmon_create_context command sets up the performance monitoring
>> hardware for the allocated contexts and starts the counters running.
>> If successful, the function will return zero. If the operation is
>> unsuccessful, an error code will be returned. This function
>> should only be used in probe begin. (FIXME list error code returned.)
>>
>>
> I'm confused about the relationship between this function and
> perfmon_start_counter, since starting the counters is mentioned in
> both. Could you explain at what point this function is invoked and what
> the purpose of the context is? I'm not real familiar with the perfmon2
> interface, but just on the face of it, your context doesn't seem like a
> one-to-one fit with the way contexts are used in perfmon2. In perfmon2,
> a context is created first, which is then passed in to the calls for
> setting up events, thereby associating those events with the context.
> Then 'start' uses the context to set up the PMU for all requested events
> and begin the counting.

Yes, perfmon2 has a context that sets all the performance monitoring
hardware registers. The perfmon2 start and stop control the entire
context. Based on the feedback from the earlier proposal email, I have
revised this to use something like:

probe perfmon.event("blah") ...

All the probes using the perfmon hardware would be collected together
for the perfmon_create_context. The individual start and stop operations
would still be allowed. It is an open question what the counters' default
state is: do they start running by default, or do they have to be
explicitly started? If they are started by default, where exactly do they
start running? At the beginning of the begin probe? At the end of the
begin probe?

>>
>> [snip]
>>
>> perfmon_start_counter:long (event_handle:long)
>>
>> The event_handle passed in indicates which counter to start. The
>> current counter value is returned as a 64-bit long. The return
>> value is undefined for an invalid event_handle.
>>
>>
> I think individually starting counters is problematic at a couple
> different levels.
> On some architectures (like PowerPC64), you don't
> have fine-grained control over each counter. Also, one usually wants
> all counters to begin counting at the same time. Maybe I'm
> misinterpreting what the intention of this function is.

I was thinking there are cases where one would want to start and stop
individual sampling and interval counting. Yes, starting and stopping
counters on some architectures can be a problem. I was thinking of
cheating: not actually starting and stopping the counters, but rather
turning on and off the bits that enable counting in user and kernel
space. This would be done by finding which bits to twiddle in the
control register. However, maybe this won't work for ppc64. I will have
to review the ppc64 hardware manual to see whether this scheme would
work.

>> [snip]
>>
>
>> EVENT SPECIFICATION
>>
>> The performance monitoring events are specified in strings. The
>> information at the very least includes the event name being monitored
>>
>>
> Will, you allude to this in a later posting, but I'll reiterate here.
> Should the event name be the native event name for the arch? Or some
> generic name that is mapped to a native name by some mechanism? Or
> either (as in PAPI)?

libpfm has some generic names for cycle counts. I expect that event
names will be both generic and architecture-specific. This will be a
lookup in libpfm.

>> by the counter. Additional information would include an event mask to
>> specify subevents, whether to count in kernel or user space, whether
>> to keep track of counts on a per thread or per CPU basis, and the
>> interval for the sampling.
>>
>> (FIXME more detail on the string compositions)
>>
>>
>> SYSTEMTAP PERFORMANCE HARDWARE ACCESS IMPLEMENTATION
>>
>> The SystemTap access to performance monitoring hardware is planned to be
>> built on the perfmon2 kernel support. perfmon2 provides reservation
>> and access to the performance monitoring hardware on ia64, i386, and
>> PowerPC processors.
>> The perfmon2 support is not yet in the upstream
>> kernels, but patches are available.
>>
>>
> As a proof of concept, I agree that this is the best route. Reinventing
> the wheel would be useless. Maybe building this prototype might help
> with refining the perfmon2 interface.

I have been working on patching oprofile so that it uses the perfmon2
interface. The work is being done on an amd64 machine. This should allow
some examination of the mechanisms for setting up the events and
sampling. It should be portable to perfmon2 for i386, ppc64, and ia64. I
will make the patches available for comment. The next step would be to
prototype a similar operation for systemtap.

I am trying to avoid reinventing the wheel. I am also very concerned
that raw access to the performance monitoring hardware will further
increase the chances of multiple device drivers stepping on each other
without knowing about it.

-Will
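
A rough sketch of the bit-twiddling idea mentioned above (pausing a
counter by clearing its user/kernel enable bits instead of stopping it).
It is only a model on plain integers, not kernel code. It assumes the
x86/amd64 event-select register layout, where bit 16 enables counting in
user mode, bit 17 enables counting in kernel mode, and bit 22 is the
overall enable bit. The function names and the example event code are
made up for illustration, and none of this is claimed to carry over to
ppc64.

```python
# Assumed x86/amd64 event-select bit positions (illustration only).
USR_BIT = 1 << 16   # count while the CPU is in user mode
OS_BIT = 1 << 17    # count while the CPU is in kernel mode
EN_BIT = 1 << 22    # counter enable bit; left untouched below

MODE_MASK = USR_BIT | OS_BIT


def pause_counter(ctl):
    """Clear the user/kernel bits so the counter stops advancing.

    Returns the new control value and the saved mode bits, so the
    caller can restore exactly what was configured before.
    """
    saved_mode = ctl & MODE_MASK
    return ctl & ~MODE_MASK, saved_mode


def resume_counter(ctl, saved_mode):
    """Restore the previously saved user/kernel bits."""
    return (ctl & ~MODE_MASK) | (saved_mode & MODE_MASK)


# Hypothetical control value: event code 0x76, counting in both user
# and kernel mode, counter enabled.
original_ctl = 0x76 | USR_BIT | OS_BIT | EN_BIT

paused_ctl, saved = pause_counter(original_ctl)
resumed_ctl = resume_counter(paused_ctl, saved)
```

The enable bit and the event selection stay as they were, so the counter
remains programmed and only the privilege-level filter changes. In a
real implementation the read-modify-write would go through whatever
register access perfmon2 provides.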