From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <systemtap-return-1529-listarch-systemtap=sources.redhat.com@sources.redhat.com>
Received: (qmail 28518 invoked by alias); 26 Oct 2005 01:08:17 -0000
Mailing-List: contact systemtap-help@sources.redhat.com; run by ezmlm
Precedence: bulk
List-Subscribe: <mailto:systemtap-subscribe@sources.redhat.com>
List-Post: <mailto:systemtap@sources.redhat.com>
List-Help: <mailto:systemtap-help@sources.redhat.com>, <http://sources.redhat.com/lists.html#faqs>
Sender: systemtap-owner@sources.redhat.com
Received: (qmail 28467 invoked by uid 22791); 26 Oct 2005 01:08:14 -0000
Date: Wed, 26 Oct 2005 01:08:00 -0000
From: "Frank Ch. Eigler" <fche@redhat.com>
To: Martin Hunt <hunt@redhat.com>
Cc: systemtap@sources.redhat.com
Subject: Re: RFC: major runtime map changes
Message-ID: <20051026010811.GA27015@redhat.com>
References: <1129761252.4284.30.camel@monkey> <y0m7jc82npw.fsf@toenail.toronto.redhat.com> <1129828225.4047.18.camel@monkey> <20051020175506.GB2761@redhat.com> <1129832480.8161.22.camel@monkey>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1129832480.8161.22.camel@monkey>
User-Agent: Mutt/1.4.1i
X-SW-Source: 2005-q4/txt/msg00089.txt.bz2

Hi -

hunt wrote:

> [...]
> > OK.  Maybe it will be useful to avoid such copying, and do it
> > virtually, something like this:
> > - [3 algorithms]
> 
> That is much more complex than what I proposed and all it does is save
> some storage space. 

In addition, it saves copying and metadata churn (rehashing, reallocation).

> Sorts become a huge mess; it is not as easy as you make it sound.

(I'm somewhat curious what extra complications you foresee.  Remember,
sorting is interesting only for purposes of consequent iteration.)

> And we also lose the ability to just call a simple aggregation
> function and then treat the output exactly as a normal map, reusing
> all the current map functions.

If it turns out that the pmap api is not too far from the map api, the
translator will have no trouble with either.

> While I'm thinking about it, do we want to preserve the per-cpu
> data?  [...]  If we do not want per-cpu data kept, then we have an
> option of having the local maps cleared after each aggregation
> operation [...]

That could work well, if it turned out that aggregates were not
aggregated frequently.  :-)  What I mean is that if the access is
like

   var[idx] <<< exp

and [idx] reoccurs a lot (which it should - that's the whole point),
then there is a cost to clearing the per-cpu table: the reallocation
of the same index tuple at the next "<<<".
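
A toy model of that cost, as a sketch only: the C below counts how
often an index tuple has to be set up again when the same keys recur
across rounds, with and without clearing the table after each round.
The names (struct cpu_map, cpu_map_clear, run_rounds) and the table
size are made up for illustration and do not come from the runtime.

```c
#include <assert.h>
#include <stddef.h>

#define NSLOT 8   /* hypothetical fixed table size */

struct entry { int used; long key; long count; long sum; };

struct cpu_map {
    struct entry slot[NSLOT];
    unsigned long allocs;   /* times an index tuple had to be set up */
};

/* Find the entry for key, setting it up (and counting that) if absent. */
static struct entry *cpu_map_get(struct cpu_map *m, long key)
{
    assert(m != NULL);
    size_t h = (size_t)((unsigned long)key % NSLOT);
    for (size_t i = 0; i < NSLOT; i++) {
        struct entry *e = &m->slot[(h + i) % NSLOT];
        if (e->used && e->key == key)
            return e;
        if (!e->used) {
            e->used = 1;
            e->key = key;
            e->count = 0;
            e->sum = 0;
            m->allocs++;
            return e;
        }
    }
    return NULL;   /* table full */
}

/* Roughly "var[key] <<< x". */
static int cpu_map_add(struct cpu_map *m, long key, long x)
{
    struct entry *e = cpu_map_get(m, key);
    if (!e)
        return -1;
    e->count++;
    e->sum += x;
    return 0;
}

/* Clearing drops every index tuple, not just the accumulated values. */
static void cpu_map_clear(struct cpu_map *m)
{
    for (size_t i = 0; i < NSLOT; i++)
        m->slot[i].used = 0;
}

/* Feed the same three keys each round; optionally clear after each
 * round, as an aggregation step that empties the local map would.
 * Returns the number of index-tuple setups that were needed. */
static unsigned long run_rounds(int rounds, int clear_each_round)
{
    struct cpu_map m = { .allocs = 0 };
    for (int r = 0; r < rounds; r++) {
        for (long key = 0; key < 3; key++)
            cpu_map_add(&m, key, 1);
        if (clear_each_round)
            cpu_map_clear(&m);
    }
    return m.allocs;
}
```

In this model the setup count grows with the number of rounds when the
table is cleared each time, and stays at the number of distinct keys
when it is not.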


Anyway, Graydon and I are talking over the whole statistics issue.  It
may be that all we will end up needing from the runtime are ordinary
maps that store binary blob values of some initialization-time-fixed
width.  It would leave all the mathematics etc. to be open-coded by
the translator, or else added to new runtime sources.
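
For illustration, such a split might look like the following C sketch:
a map that only knows the value width it was given at initialization,
plus separate code, of the kind the translator could emit, that
interprets the blob as a statistics record.  All names here (struct
blob_map, blob_map_get, struct stat_blob, stat_add) and the
count/sum/min/max layout are assumptions made for the example, not an
existing or agreed interface.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* The "runtime" side: keys map to opaque values of a fixed width. */
struct blob_map {
    size_t width;          /* value width in bytes, fixed at init */
    size_t nslot;
    int64_t *keys;
    unsigned char *used;
    unsigned char *vals;   /* nslot * width bytes */
};

static int blob_map_init(struct blob_map *m, size_t nslot, size_t width)
{
    assert(nslot > 0 && width > 0);
    m->width = width;
    m->nslot = nslot;
    m->keys = calloc(nslot, sizeof *m->keys);
    m->used = calloc(nslot, 1);
    m->vals = calloc(nslot, width);
    if (!m->keys || !m->used || !m->vals) {
        free(m->keys);
        free(m->used);
        free(m->vals);
        return -1;
    }
    return 0;
}

static void blob_map_free(struct blob_map *m)
{
    free(m->keys);
    free(m->used);
    free(m->vals);
}

/* Return the blob for key, creating a zero-filled one if absent.
 * The map never looks inside the blob. */
static void *blob_map_get(struct blob_map *m, int64_t key)
{
    size_t h = (size_t)((uint64_t)key % m->nslot);
    for (size_t i = 0; i < m->nslot; i++) {
        size_t j = (h + i) % m->nslot;
        if (m->used[j] && m->keys[j] == key)
            return m->vals + j * m->width;
        if (!m->used[j]) {
            m->used[j] = 1;
            m->keys[j] = key;
            memset(m->vals + j * m->width, 0, m->width);
            return m->vals + j * m->width;
        }
    }
    return NULL;   /* table full */
}

/* The "translator" side: the mathematics is open-coded against a
 * layout that only the generated code knows about. */
struct stat_blob { int64_t count, sum, min, max; };

static int stat_add(struct blob_map *m, int64_t key, int64_t x)
{
    assert(m->width == sizeof(struct stat_blob));
    struct stat_blob *s = blob_map_get(m, key);
    if (!s)
        return -1;
    if (s->count == 0 || x < s->min)
        s->min = x;
    if (s->count == 0 || x > s->max)
        s->max = x;
    s->count++;
    s->sum += x;
    return 0;
}
```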


- FChE