From: Dirk Eddelbuettel
Date: Sun, 12 May 2013 18:13:00 -0000
To: Peter Teuben
Cc: gsl-discuss@sourceware.org, Patrick Alken
Subject: Re: Robust linear least squares
Message-ID: <20879.56271.237184.255948@max.nulle.part>
In-Reply-To: <518FD7B4.8070100@astro.umd.edu>
References: <518D6E3B.8080503@colorado.edu> <518FD7B4.8070100@astro.umd.edu>

On 12 May 2013 at 13:56, Peter Teuben wrote:
| Patrick
| I agree, this is a useful option!
|
| can you say a little more here how you define robustness.
| The one I know takes the quartiles Q1 and Q3 (where Q2 would be the
| median), then defines D=Q3-Q1 and only uses points between Q1-1.5*D and
| Q3+1.5*D to define things like a robust mean and variance. Why 1.5 I
| don't know, I guess you could keep that a variable and tinker with it.
| For OLS you can imagine applying this in an iterative way to the Y
| values, since formally the errors in X are negligible compared to those
| in Y. I'm saying iterative, since in theory the 2nd iteration could have
| rejected points that should have been part of the "core points". For
| non-linear fitting this could be a lot more tricky.

There is an entire "task view" (i.e. an edited survey of available packages)
for R concerning robust methods (for model fitting and more):

   http://cran.r-project.org/web/views/Robust.html

So there is not just one generally accepted best option. That said, having
something is clearly better than nothing. But let's properly define the
method and delineate its scope.

Dirk

-- 
Dirk Eddelbuettel | edd@debian.org | http://dirk.eddelbuettel.com
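
For reference, the quartile rule quoted above (keep only points between
Q1-k*D and Q3+k*D with D=Q3-Q1, applied repeatedly) can be sketched roughly
as below. This is only an illustration of that one rule, not the method
proposed for GSL. The function names `iqr_filter` and `robust_mean_var`, the
`max_iter` cap, and the choice of linearly interpolated ("inclusive")
quartiles are assumptions made here; other quartile conventions give
slightly different cutoffs.

```python
import statistics


def iqr_filter(values, k=1.5):
    """Keep the points inside [Q1 - k*D, Q3 + k*D], where D = Q3 - Q1.

    Assumes at least two values; quartiles use linear interpolation.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    d = q3 - q1
    low, high = q1 - k * d, q3 + k * d
    return [v for v in values if low <= v <= high]


def robust_mean_var(values, k=1.5, max_iter=10):
    """Apply the quartile filter repeatedly until no point is rejected.

    Returns (mean, sample variance, kept points). max_iter is a
    hypothetical safety cap, not part of the rule described above.
    """
    kept = list(values)
    for _ in range(max_iter):
        trimmed = iqr_filter(kept, k)
        if len(trimmed) == len(kept):
            break  # nothing rejected in this pass, so stop
        kept = trimmed
    return statistics.mean(kept), statistics.variance(kept), kept


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
    mean, var, kept = robust_mean_var(data)
    print(mean, var, kept)
```

Keeping `k` as a parameter follows the suggestion in the quoted text to
treat the 1.5 factor as a variable. Note that each pass recomputes the
quartiles from the surviving points, which is exactly why repeated passes
can reject points that a single pass would have kept.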