Date: Fri, 10 Mar 2006 12:50:00 -0000
From: "Frank Ch. Eigler"
To: Martin Hunt
Cc: systemtap@sources.redhat.com
Subject: Re: tutorial draft checked in
Message-ID: <20060310125046.GA6930@redhat.com>
In-Reply-To: <1141983106.3380.46.camel@dragon>

Hi -

hunt wrote:
> [...]
> We have argued this again and again. I see no reason why you want the
> translator to be more complicated and slower. [...]

You misjudge my intention.

> For the specific case of pmaps I am sure I spent more time arguing about
> it than writing it. The disadvantages of what you want to do are
>
> 1. Reader locks are slow. They don't scale as well as per-cpu spinlocks.

At least this is a quantifiable concern.

> 2. The translator holds the lock during the whole probe vs the runtime
> which holds the lock as short a time as possible.

Among other things, holding the lock for the whole probe guarantees
ACID-style properties for probe handlers, and prevents various race
conditions.

> 3. Having the translator handle low-level locking eliminates the
> possibility of switching the runtime to a more efficient lockless
> solution later.

By removing from the runtime those locks that the translator makes
redundant, we still have a "lockless" runtime. If locks can be done away
with entirely, the translator can be taught not to emit them; that is
probably a one-line change.

> > Anyway, if the advantage of having unshared per-cpu locks for the <<<
> > case was large, the translator could adopt the technique just as
> > easily.
>
> Obviously not true.

WHAT can you possibly mean by that? The translator could emit per-cpu
spinlocks for pmaps.
Its programmer would not even break a sweat.

> It is already done and works in the runtime pmap implementation.

Yes, but the question is where the locking is better placed.

> I ran a few benchmarks to demonstrate pmaps scalability and measure the
> additional overhead from the translator reader-writer locks. [...]

Good.

> I ran threads that were making syscalls as fast as possible.
> Results are Kprobes/sec
>
>           1 thread   4 threads
> Regular      340        500
> Pmaps        340        940
> Pmaps*       380       1040
>
> Pmaps* is pmaps with the redundant reader-writer locks removed.

How about a result with the redundant spinlocks removed?

> Measured overhead of those locks is approximately 10% of the cpu
> time for this test case.

That sounds a bit high, considering all the other overhead involved. An
oprofile count of SMP-type events would be interesting.

- FChE