From: Patrick Alken
Date: Thu, 16 Apr 2015 22:44:00 -0000
To: "gsl-discuss@sourceware.org"
Subject: New module for running statistics calculations

Hi all,

I've just added a new module to GSL for running (or online) statistics, i.e. computing the mean, variance, standard deviation, skewness, kurtosis, median, and arbitrary percentiles on the fly with a single-pass algorithm, without needing to store the whole dataset in memory at once.

The mean, variance, standard deviation, skewness and kurtosis are exact computations. The median and p-quantiles are approximations to the actual quantiles, computed using the algorithm of:

  R. Jain and I. Chlamtac, The P^2 algorithm for dynamic calculation of quantiles and histograms without storing observations, Communications of the ACM, Volume 28 (October), Number 10, 1985, p. 1076-1085.

It is now in the 'master' branch of the git repository and documented. I'll add an example program a little later this week.

Any feedback is welcome.

Patrick