Date: Wed, 26 Oct 2005 01:08:00 -0000
From: "Frank Ch. Eigler"
To: Martin Hunt
Cc: systemtap@sources.redhat.com
Subject: Re: RFC: major runtime map changes
Message-ID: <20051026010811.GA27015@redhat.com>
In-Reply-To: <1129832480.8161.22.camel@monkey>

Hi -

hunt wrote:

> [...]
> > OK.  Maybe it will be useful to avoid such copying, and do it
> > virtually, something like this:
> > - [3 algorithms]
>
> That is much more complex than what I proposed and all it does is save
> some storage space.

In addition, it saves copying and metadata churn (rehashing,
reallocation).

> Sorts become a huge mess; it is not as easy as you make it sound.

(I'm somewhat curious what extra complications you foresee.  Remember,
sorting is interesting only for purposes of consequent iteration.)

> And we also lose the ability to just call a simple aggregation
> function and then treat the output exactly as a normal map, reusing
> all the current map functions.

If it turns out that the pmap api is not too far from the map api, the
translator will have no trouble with either.

> While I'm thinking about it, do we want to preserve the per-cpu
> data? [...]  If we do not want per-cpu data kept, then we have an
> option of having the local maps cleared after each aggregation
> operation [...]

That could work well, if it turned out that aggregates were not
aggregated frequently. :-)  What I mean is that if the access is like

    var[idx] <<< exp

and [idx] reoccurs a lot (which it should - that's the whole point),
then there is a cost to clearing the per-cpu table: the reallocation
of the same index tuple at the next "<<<".

Anyway, Graydon and I are talking over the whole statistics issue.  It
may be that all we will end up needing from the runtime are ordinary
maps that store binary blob values of some initialization-time-fixed
width.  It would leave all the mathematics etc. to be open-coded by
the translator, or else added to new runtime sources.

- FChE
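
The clearing cost mentioned above can be illustrated with a minimal C
sketch.  This is not the SystemTap runtime's map code: struct pcpu_map,
find_entry, stat_add, aggregate_sum, demo_allocs, NCPU and NSLOT are all
hypothetical names, and the index is assumed to be a single integer
rather than a tuple.  It only counts how often an index entry has to be
set up again when the per-cpu tables are cleared after each aggregation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NCPU  4     /* hypothetical CPU count */
#define NSLOT 64    /* hypothetical per-CPU table capacity */

/* One statistic entry, keyed by a single integer index (assumption). */
struct stat_entry {
    int     used;
    int64_t key;
    int64_t count, sum;
};

struct pcpu_map {
    struct stat_entry slot[NCPU][NSLOT];
    long allocs;    /* number of times a new index entry was set up */
};

/* Linear-probe lookup on one CPU's table; optionally create the entry. */
static struct stat_entry *find_entry(struct pcpu_map *m, int cpu,
                                     int64_t key, int create)
{
    size_t h = (size_t)((uint64_t)key % NSLOT);
    for (size_t i = 0; i < NSLOT; i++) {
        struct stat_entry *e = &m->slot[cpu][(h + i) % NSLOT];
        if (e->used && e->key == key)
            return e;
        if (!e->used) {
            if (!create)
                return NULL;
            e->used = 1; e->key = key; e->count = 0; e->sum = 0;
            m->allocs++;            /* the "reallocation" being counted */
            return e;
        }
    }
    return NULL;                    /* table full */
}

/* Rough analogue of "var[idx] <<< exp" running on a given CPU. */
static void stat_add(struct pcpu_map *m, int cpu, int64_t key, int64_t val)
{
    struct stat_entry *e = find_entry(m, cpu, key, 1);
    assert(e != NULL);
    e->count++;
    e->sum += val;
}

/* Sum one index across all CPUs; optionally clear the per-CPU tables. */
static int64_t aggregate_sum(struct pcpu_map *m, int64_t key, int clear)
{
    int64_t total = 0;
    for (int cpu = 0; cpu < NCPU; cpu++) {
        struct stat_entry *e = find_entry(m, cpu, key, 0);
        if (e)
            total += e->sum;
    }
    if (clear)
        memset(m->slot, 0, sizeof m->slot);  /* keeps the allocs counter */
    return total;
}

/* Ten rounds of "<<<" on every CPU with the same index, aggregating
 * after each round; returns how many entry allocations occurred. */
static long demo_allocs(int clear)
{
    static struct pcpu_map m;
    memset(&m, 0, sizeof m);
    for (int round = 0; round < 10; round++) {
        for (int cpu = 0; cpu < NCPU; cpu++)
            stat_add(&m, cpu, 42, round);
        (void)aggregate_sum(&m, 42, clear);
    }
    return m.allocs;
}
```

In this sketch, demo_allocs(0) sets up the entry once per CPU, while
demo_allocs(1) sets it up again on every CPU after every aggregation.
How much that matters in practice depends on how often aggregation
happens relative to "<<<".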