public inbox for gsl-discuss@sourceware.org
From: "Timothée Flutre" <timflutre@gmail.com>
To: Patrick Alken <patrick.alken@colorado.edu>
Cc: "gsl-discuss@sourceware.org" <gsl-discuss@sourceware.org>
Subject: Re: spearman coefficient
Date: Wed, 29 May 2013 14:44:00 -0000
Message-ID: <CAGJVmu+u=B+_gQyJu2xpFFa-=dih2YDRY0tmKX_Rr0YsVwffvA@mail.gmail.com>
In-Reply-To: <51A53336.30801@colorado.edu>

Looks perfect, thanks a lot!

No problem. In fact, I don't use it much myself because I prefer
parametric modeling, but I did use it to reproduce other people's
results.

Timothée Flutre


On Tue, May 28, 2013 at 5:44 PM, Patrick Alken
<patrick.alken@colorado.edu> wrote:
> I've added gsl_stats_spearman to the repository and have tested it on a few
> sample datasets. I essentially rewrote the routine using octave and
> numerical recipes as examples, though I rewrote everything from scratch so
> there are no copyright issues.
>
> I added the function gsl_sort_vector2, similar to the numerical recipes
> sort2() function, which eliminates the need to allocate a permutation and
> sort vector. The workspace for the rank vectors is passed directly to the
> function so there is no need to allocate a separate workspace now.
>
> It is possible to write the function to calculate the rank vectors in-place
> in the data vectors, but I opted to keep those inputs untouched to stay
> consistent with the rest of the statistics routines. The user must pass in a
> workspace of size 2*n.
>
> I put the function in statistics/covariance_source.c so it will be defined
> with all the different types (float, double, int, short, etc.) and it is
> documented in the manual.
>
> I'm sorry I wasn't able to directly use a lot of your code, but I do think
> this implementation is much more consistent with the rest of the library
> design. If you are using this function regularly in your work I would
> appreciate any feedback you can give (i.e., testing it with a wide range of
> inputs).
>
> Patrick
>
>
> On 05/25/2013 03:25 PM, Timothée Flutre wrote:
>>
>> Hi Patrick,
>>
>> thanks for your detailed reply. (I don't know why I didn't receive
>> your email; I had to check the GSL mailing list archive to see it,
>> which is why I'm replying directly to you this time.)
>>
>> About introducing a new workspace, I did it based on your advice from last
>> year:
>> http://sourceware.org/ml/gsl-discuss/2012-q1/msg00011.html
>>
>> I don't have a strong opinion on which is best, but someone else
>> commented on my code and also thought that it would be better to have
>> a workspace:
>> https://gist.github.com/timflutre/1784199#comment-82458
>>
>> Maybe the code could offer two functions, with or without the
>> workspace? In this case, are there any guidelines for naming the
>> functions?
>>
>> I had a look at the implementation in R. The description of the
>> interface is here:
>> http://stat.ethz.ch/R-manual/R-patched/library/stats/html/cor.html
>>
>> Even though it indicates that the argument "method" can take the value
>> "spearman", I don't see it anymore in the R code and thus I am a bit
>> confused by their implementation:
>> https://github.com/wch/r-source/blob/trunk/src/library/stats/R/cor.R#L21
>>
>> Moreover, the R code calls C code:
>>
>> https://github.com/wch/r-source/blob/trunk/src/library/stats/src/cov.c#L623
>>
>> The file with the C code has several macros and functions to compute
>> covariance or correlation, to handle missing data in different ways,
>> to deal with Pearson, Spearman and Kendall coefficients, etc. All this
>> makes it really hard for me to understand it...
>>
>> Finally, I looked at the algorithm in Numerical Recipes in C; the PDF
>> of the book is available here:
>> www2.units.it/ipl/students_area/imm2/files/Numerical_Recipes.pdf
>>
>> However, the GSL web site says that we can't use algorithms from this
>> book because of the non-free license.
>>
>> Also, it seems to me that spear() from Numerical Recipes (PDF page 641)
>> uses the function sort2() (Quicksort with 2 arrays, page 334), which
>> seems to require allocating another array, "istack". Therefore, in
>> the end, it doesn't seem to me that it's much better than my d and
>> perm vectors, which have the advantage of using other functions of the
>> GSL (gsl_sort_vector and gsl_sort_vector_index).
>>
>> But again, I'm really not an expert programmer, in C or any other
>> language. So I tried to see how I could change my code based on what
>> you said, but I don't see any obvious way to do it (except copying the
>> code from Numerical Recipes).
>>
>> If you don't want to include the code as it is in the next release
>> of the GSL, I'm fine with that. Of course, if you have a better
>> understanding of all this and you can explain to me what to do, I can
>> try to help.
>>
>> Best,
>>
>> Timothée Flutre
>
>
