* computing R-squared with gsl_multifit
From: Patrick Alken @ 2007-09-25 19:46 UTC
To: gsl-discuss
Hi all,
This is a continuation of the discussion from help-gsl on
computing the R^2 "coefficient of determination" for multi-parameter
fits. I think it would be a useful thing to have and I propose
that the best way to add it in would be to compute it in
gsl_multifit_linear and gsl_multifit_wlinear and store it in
the gsl_multifit workspace. Then provide a function like
gsl_multifit_linear_Rsq to return the value to the user. This
way we wouldn't need to change the API, and the multifit_linear
routines would remain backwards compatible.
Computing the value would require adding about 4-5 lines of
code and it can be computed at the same time as chisq so there
doesn't have to be an extra loop added in.
Patrick Alken
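A minimal sketch of the single-loop idea described above, for the unweighted case only. The function name rsq_one_pass and its signature are hypothetical and not part of GSL; the sketch assumes the fitted values are already available in an array. It accumulates chisq and the running sums needed for the total sum of squares in the same loop, so no second pass over the data is required.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of GSL: returns R^2 for an unweighted
   fit and stores chisq (the residual sum of squares) in *chisq.
   The result is undefined when all y_i are equal (total sum of
   squares is zero). */
double
rsq_one_pass (const double *y, const double *yfit, size_t n, double *chisq)
{
  double rss = 0.0;  /* sum of (y_i - yfit_i)^2, i.e. chisq */
  double sy = 0.0;   /* sum of (y_i - shift) */
  double syy = 0.0;  /* sum of (y_i - shift)^2 */
  double shift, tss;
  size_t i;

  assert (n > 1);
  shift = y[0];      /* shifting by the first sample reduces cancellation */

  for (i = 0; i < n; i++)
    {
      const double r = y[i] - yfit[i];
      const double d = y[i] - shift;
      rss += r * r;
      sy += d;
      syy += d * d;
    }

  /* total sum of squares about the mean, recovered from the running sums */
  tss = syy - sy * sy / (double) n;

  *chisq = rss;
  return 1.0 - rss / tss;
}
```

The running-sum form trades a second loop for some loss of precision when the spread of y is small relative to its magnitude; a two-pass computation about the mean is the more robust choice when that matters.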
* Re: computing R-squared with gsl_multifit
From: Brian Gough @ 2007-09-26 15:52 UTC
To: gsl-discuss
At Tue, 25 Sep 2007 13:46:28 -0600,
Patrick Alken wrote:
> Computing the value would require adding about 4-5 lines of
> code and it can be computed at the same time as chisq so there
> doesn't have to be an extra loop added in.
Hello,
I think I'd prefer not to have results passed back through the
workspace, it's not really part of the design -- a separate function
is better, as for the correlation coefficient.
A question: is the formula the same for both weighted and unweighted
fits?
--
Brian Gough
* Re: computing R-squared with gsl_multifit
From: Dirk Eddelbuettel @ 2007-09-26 16:18 UTC
To: gsl-discuss
On Wed, Sep 26, 2007 at 04:51:19PM +0100, Brian Gough wrote:
> At Tue, 25 Sep 2007 13:46:28 -0600,
> Patrick Alken wrote:
> > Computing the value would require adding about 4-5 lines of
> > code and it can be computed at the same time as chisq so there
> > doesn't have to be an extra loop added in.
>
> Hello,
>
> I think I'd prefer not to have results passed back through the
> workspace, it's not really part of the design -- a separate function
> is better, as for the correlation coefficient.
>
> A question: is the formula the same for both weighted and unweighted
> fits?
Yes, as it relates 'explained sum of squares' to 'total sum of
squares', so it just uses residuals, irrespective of whether these
were computed with a unit vector of weights, or with actual weights.
Wikipedia is not a bad start:
http://en.wikipedia.org/wiki/R-squared
Hth, Dirk
--
Three out of two people have difficulties with fractions.
* Re: computing R-squared with gsl_multifit
From: Patrick Alken @ 2007-09-26 16:20 UTC
To: gsl-discuss
> A question: is the formula the same for both weighted and unweighted
> fits?
No, unfortunately.
R^2 = 1 - chisq / Sum [ w_i * (y_i - mean(y))^2 ]
In the case of weighted data, mean(y) is also a weighted mean:
mean(y) = Sum [ w_i * y_i ] / Sum [ w_i ]
I looked at the GNU R source to verify all this.
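The formula above can be sketched directly in C. This is only an illustration under stated assumptions: the names weighted_mean and weighted_rsq are hypothetical and not GSL functions, the fitted values are assumed to be available in yfit, and the weights are assumed to be positive. With all weights equal to 1 it reduces to the unweighted R^2.

```c
#include <assert.h>
#include <stddef.h>

/* Weighted mean: Sum [ w_i * y_i ] / Sum [ w_i ].
   Hypothetical helper, not part of GSL. */
double
weighted_mean (const double *w, const double *y, size_t n)
{
  double sw = 0.0, swy = 0.0;
  size_t i;

  assert (n > 0);

  for (i = 0; i < n; i++)
    {
      sw += w[i];
      swy += w[i] * y[i];
    }

  return swy / sw;
}

/* Weighted R^2 = 1 - chisq / Sum [ w_i * (y_i - mean(y))^2 ],
   where chisq = Sum [ w_i * (y_i - yfit_i)^2 ] and mean(y) is the
   weighted mean. Hypothetical helper, not part of GSL. */
double
weighted_rsq (const double *w, const double *y, const double *yfit, size_t n)
{
  const double mean = weighted_mean (w, y, n);
  double chisq = 0.0;  /* weighted residual sum of squares */
  double wtss = 0.0;   /* weighted total sum of squares */
  size_t i;

  for (i = 0; i < n; i++)
    {
      const double r = y[i] - yfit[i];
      const double d = y[i] - mean;
      chisq += w[i] * r * r;
      wtss += w[i] * d * d;
    }

  return 1.0 - chisq / wtss;
}
```

For example, with w = {1, 1, 2}, y = {0, 2, 4} and fitted values {0, 3, 4}, the weighted mean is 2.5, the weighted total sum of squares is 11, chisq is 1, and R^2 is 1 - 1/11.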
* Re: computing R-squared with gsl_multifit
From: Patrick Alken @ 2007-09-26 16:26 UTC
To: gsl-discuss
> Yes, as it relates 'explained sum of squares' to 'total sum of
> squares', so it just uses residuals, irrespective of whether these
> were computed with a unit vector of weights, or with actual weights.
>
> Wikipedia is not a bad start:
> http://en.wikipedia.org/wiki/R-squared
Well, that wiki page says:
"If fitting is by weighted least squares or generalized least squares,
alternative versions of R^2 can be calculated appropriate to those
statistical frameworks,"
If you use the weights in calculating the residuals then it seems
you'd have to use them in the total sum of squares too. In the GNU
R source, the file src/library/stats/R/lm.R has the code which
calculates R^2 and it definitely takes weights into account for
both the residuals and the total sum of squares.
* Re: computing R-squared with gsl_multifit
From: Dirk Eddelbuettel @ 2007-09-26 16:43 UTC
To: Patrick Alken; +Cc: gsl-discuss
On Wed, Sep 26, 2007 at 10:26:29AM -0600, Patrick Alken wrote:
> > Yes, as it relates 'explained sum of squares' to 'total sum of
> > squares', so it just uses residuals, irrespective of whether these
> > were computed with a unit vector of weights, or with actual weights.
> >
> > Wikipedia is not a bad start:
> > http://en.wikipedia.org/wiki/R-squared
>
> Well, that wiki page says:
>
> "If fitting is by weighted least squares or generalized least squares,
> alternative versions of R^2 can be calculated appropriate to those
> statistical frameworks,"
>
> If you use the weights in calculating the residuals then it seems
> you'd have to use them in the total sum of squares too. In the GNU
> R source, the file src/library/stats/R/lm.R has the code which
> calculates R^2 and it definitely takes weights into account for
> both the residuals and the total sum of squares.
Aiee. Thanks for checking, and correcting my off-the-cuff hint.
Dirk
--
Three out of two people have difficulties with fractions.
* Re: computing R-squared with gsl_multifit
From: Brian Gough @ 2007-10-02 8:35 UTC
To: gsl-discuss
At Wed, 26 Sep 2007 10:20:05 -0600,
Patrick Alken wrote:
>
> > A question: is the formula the same for both weighted and unweighted
> > fits?
>
> No, unfortunately.
>
> R^2 = 1 - chisq / Sum [ w_i * (y_i - mean(y))^2 ]
>
> In the case of weighted data, mean(y) is also a weighted mean:
>
> mean(y) = Sum [ w_i * y_i ] / Sum [ w_i ]
>
> I looked at the GNU R source to verify all this.
As a starting point I've added gsl_stats_ss and gsl_stats_wss for
computing the unweighted/weighted sum of squares.
Given chisq and the data this is sufficient for computing R^2 with the
formula above.
--
Brian Gough
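A rough sketch of the separate-function approach described above, where chisq comes from the fit and the total sum of squares is computed independently. The helpers total_ss, total_wss and rsq_from_chisq are hypothetical stand-alone stand-ins written for illustration; they are not the GSL functions themselves. (To my knowledge, released GSL versions expose the corresponding routines as gsl_stats_tss and gsl_stats_wtss, but check the statistics chapter of the manual for your version.)

```c
#include <assert.h>
#include <stddef.h>

/* Unweighted total sum of squares about the mean: Sum [ (y_i - mean)^2 ].
   Hypothetical stand-in, not a GSL function. */
double
total_ss (const double *y, size_t n)
{
  double mean = 0.0, tss = 0.0;
  size_t i;

  assert (n > 0);

  for (i = 0; i < n; i++)
    mean += y[i];
  mean /= (double) n;

  for (i = 0; i < n; i++)
    {
      const double d = y[i] - mean;
      tss += d * d;
    }

  return tss;
}

/* Weighted total sum of squares about the weighted mean:
   Sum [ w_i * (y_i - wmean)^2 ]. Hypothetical stand-in, not a GSL function. */
double
total_wss (const double *w, const double *y, size_t n)
{
  double sw = 0.0, swy = 0.0, wmean, wtss = 0.0;
  size_t i;

  assert (n > 0);

  for (i = 0; i < n; i++)
    {
      sw += w[i];
      swy += w[i] * y[i];
    }
  wmean = swy / sw;

  for (i = 0; i < n; i++)
    {
      const double d = y[i] - wmean;
      wtss += w[i] * d * d;
    }

  return wtss;
}

/* R^2 from the chisq reported by the fit and a total sum of squares. */
double
rsq_from_chisq (double chisq, double tss)
{
  assert (tss > 0.0);
  return 1.0 - chisq / tss;
}
```

In use, chisq would be the value returned by the linear or weighted linear fit, paired with total_ss for unweighted data or total_wss for weighted data. Keeping the result out of the workspace means the caller decides whether R^2 is computed at all.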