Subject: Re: RFC: major runtime map changes
From: Martin Hunt
To: "Frank Ch. Eigler"
Cc: "systemtap@sources.redhat.com"
Organization: Red Hat Inc
Date: Wed, 26 Oct 2005 08:40:00 -0000
Message-Id: <1130300641.4110.13.camel@monkey>
In-Reply-To: <20051026010811.GA27015@redhat.com>

On Tue, 2005-10-25 at 21:08 -0400, Frank Ch. Eigler wrote:
> [...]
> > Sorts become a huge mess; it is not as easy as you make it sound.
>
> (I'm somewhat curious what extra complications you foresee.  Remember,
> sorting is interesting only for purposes of consequent iteration.)

If you are sorting based on keys, then your proposal works fine. But the
more useful case will be sorting based on aggregated values, in which
case your proposal of sorting each per-cpu map is just a waste of time.

> > And we also lose the ability to just call a simple aggregation
> > function and then treat the output exactly as a normal map, reusing
> > all the current map functions.
>
> If it turns out that the pmap api is not too far from the map api, the
> translator will have no trouble with either.

It will be almost identical.

> > While I'm thinking about it, do we want to preserve the per-cpu
> > data? [...]
> > If we do not want per-cpu data kept, then we have an
> > option of having the local maps cleared after each aggregation
> > operation [...]
>
> That could work well, if it turned out that aggregates were not
> aggregated frequently. :-)  What I mean is that if the access
> is like
>
>     var[idx] <<< exp
>
> and [idx] reoccurs a lot (which it should - that's the whole point),
> then there is a cost to clearing the per-cpu table: the reallocation
> of the same index tuple at the next "<<<".

Reallocation consists of removing a node from the head of one list and
inserting it at the end of two others, which is just some pointer
manipulation. The bigger cost is copying the keys into the node, which
might be strings.

> Anyway, Graydon and I are talking over the whole statistics issue.  It
> may be that all we will end up needing from the runtime are ordinary
> maps that store binary blob values of some initialization-time-fixed
> width.  It would leave all the mathematics etc. to be open-coded by
> the translator, or else added to new runtime sources.

I don't understand why you wouldn't want common code like that in the
runtime.

Martin
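
The reallocation step described above (a node leaves the head of one
list, joins the tail of two others, and then has its key copied in) can
be sketched in plain userspace C. This is only an illustration under
stated assumptions, not the SystemTap runtime code: the names
struct map, map_add, map_find and map_clear, the fixed pool size, and
the guess that the three lists are a free pool, an in-use list and a
hash chain are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Minimal circular doubly linked list, in the style of the kernel's list_head. */
struct list { struct list *prev, *next; };

static void list_init(struct list *h) { h->prev = h->next = h; }
static int list_empty(const struct list *h) { return h->next == h; }

static void list_del(struct list *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

static void list_add_tail(struct list *n, struct list *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

#define KEY_LEN   32   /* hypothetical limits for the sketch */
#define POOL_SIZE 4
#define NBUCKETS  8

struct node {
    struct list lnode;   /* on the free pool, or on the in-use list once allocated */
    struct list hnode;   /* on a hash chain while allocated */
    char key[KEY_LEN];
    long value;
};

struct map {
    struct list pool;               /* preallocated, unused nodes */
    struct list used;               /* allocated nodes, in insertion order */
    struct list buckets[NBUCKETS];  /* hash chains */
    struct node nodes[POOL_SIZE];
};

static unsigned hash_key(const char *key)
{
    unsigned h = 0;
    while (*key)
        h = h * 31 + (unsigned char)*key++;
    return h % NBUCKETS;
}

void map_init(struct map *m)
{
    int i;
    list_init(&m->pool);
    list_init(&m->used);
    for (i = 0; i < NBUCKETS; i++)
        list_init(&m->buckets[i]);
    for (i = 0; i < POOL_SIZE; i++)
        list_add_tail(&m->nodes[i].lnode, &m->pool);
}

struct node *map_find(struct map *m, const char *key)
{
    struct list *head = &m->buckets[hash_key(key)], *p;
    for (p = head->next; p != head; p = p->next) {
        struct node *n = (struct node *)((char *)p - offsetof(struct node, hnode));
        if (strcmp(n->key, key) == 0)
            return n;
    }
    return NULL;
}

/* Rough analogue of "var[key] <<< v". Returns 0, or -1 if the pool is empty.
 * Keys longer than KEY_LEN - 1 are truncated (a limitation of this sketch). */
int map_add(struct map *m, const char *key, long v)
{
    struct node *n = map_find(m, key);
    if (!n) {
        if (list_empty(&m->pool))
            return -1;
        n = (struct node *)m->pool.next;     /* lnode is the first member */
        list_del(&n->lnode);                 /* off the head of the pool */
        list_add_tail(&n->lnode, &m->used);  /* tail of the in-use list */
        list_add_tail(&n->hnode, &m->buckets[hash_key(key)]); /* tail of the chain */
        strncpy(n->key, key, KEY_LEN - 1);   /* the key copy: the costlier part */
        n->key[KEY_LEN - 1] = '\0';
        n->value = 0;
    }
    n->value += v;
    return 0;
}

/* Clearing returns every in-use node to the pool, so the next map_add
 * on the same key has to reallocate and copy the key again. */
void map_clear(struct map *m)
{
    while (!list_empty(&m->used)) {
        struct node *n = (struct node *)m->used.next;
        list_del(&n->lnode);
        list_del(&n->hnode);
        list_add_tail(&n->lnode, &m->pool);
    }
    assert(list_empty(&m->used));
}
```

In this sketch, calling map_add twice with the same key touches the
lists only once; after map_clear, the next map_add with that key pays
for three list operations plus one key copy, which is the cost being
weighed in the exchange above.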