public inbox for gsl-discuss@sourceware.org
 help / color / mirror / Atom feed
* Re: multivariate gaussian distribution (Code)
@ 2003-12-29 10:41 Przemyslaw Sliwa
  0 siblings, 0 replies; 10+ messages in thread
From: Przemyslaw Sliwa @ 2003-12-29 10:41 UTC (permalink / raw)
  To: gsl-discuss

Dear All,

There are several thing I cannot understand in the code. I suppose it has
been designed to handle real random vectors for multivariate normal
distribution. Therefore at least two points:

1. Why did you (Emmanuel) use the Eigenvalue decomposition of the
covariance matrix. This method is extremely inefficient with the gsl
eigenvalues code. One shall (I do) use the Cholesky decomposition of the
covariance matrix in order to compute the lower triangular matrix L of the
form Cov = LL' This procedure is described in several books e.g.
Harville's "Matrix Algebra from a Statistician's perspective".

2. Why do you use this Box Mueller Algorithm? My point is: if you have a
vector X of independent, normally distributed variables (Covariance matrix
equals identity matrix) the product P = LX is always normally distributed
with covariance matrix Cov, since (assuming E(X) = 0) E(PP') = E(LX(LX)')
= E(LXX'L) = LIL' = Cov. Similar to this method one can simulate the whole
family of elliptically countered distributions (like Bessel, generalized
Lapalce, t-distributions).

Did I miss something?

Thanks for explanations,

Kind regards,

Przem



^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: multivariate gaussian distribution (Code)
@ 2003-12-30  9:47 Emmanuel Benazera
  2003-12-30 10:55 ` Przemyslaw Sliwa
  2003-12-30 11:31 ` Przemyslaw Sliwa
  0 siblings, 2 replies; 10+ messages in thread
From: Emmanuel Benazera @ 2003-12-30  9:47 UTC (permalink / raw)
  To: sliwa; +Cc: gsl-discuss

It is easy to implement the Cholesky method. I'll do it if
people are interested.

Once again though, the R statistical package 
(that is a well-known library) uses the eigenvalues decomposition:
http://rweb.stat.umn.edu/R/library/MASS/html/mvrnorm.html

Please take time to read the messages !

Emmanuel



^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: multivariate gaussian distribution (Code)
@ 2003-12-29 23:02 Przemyslaw Sliwa
  0 siblings, 0 replies; 10+ messages in thread
From: Przemyslaw Sliwa @ 2003-12-29 23:02 UTC (permalink / raw)
  To: gsl-discuss

Hi Emmanuel,

Ok. First thing - i hardly believe that it was written somewhere that
the eigendecomposition is more stable than the Cholesky one. The
Cholesky decomposition i so stable that it can be used to determine if
the matrix is positive definite or not (see "Numerical Reciepes in
..."). The only problem might arise when the eigenvalues of the matrix
are very close to zero. The NAG uses some special matrix balancing
procedure. This is what is missing in GSL.

I personally use the multivariate normal distribution for generating a
multivariate GARCH model. In my case the multivariate time series Xt has
(conditionally) normal distribution. This means that the covariance matrix
chabges all the time. Generating a time series of length 60 I have to
compute 60 various covariance matices. Since I use the monte carlo I
repeat this experiment many times (several milions). In one run I have to
compute about 1bln of random vectors which are normally distributed. Of
course additionally I have to decompose 1bln of covariance matrices. Now,
imagine I have to use the inefficient EIGNECODE from GSL. He, he, he...
Hundreds of years. And with the quick Cholesky code just couple of hours.

Now "my" algorithm:

1. suppose that one uses the nacr() function in order to generate the
(independent) random vector. Due to the property of the multivariate
normal distribution (if all marginal distributions are independent normal
then the join distribution is multivariate normal too) one just generates
the vector V[k] as

  for(i=0; i<DIMENSION; i++) V[i]=nacr();
Now the vector has the multivariate distribution with the cvariance matrix
given by the isentity matrix.
2. Decompose the desired covariance matrix Cov using the Cholesky
decomposition obtainig the lower triangular matrix L, so that Cov=LL'
3. Now the new vector P = LV is normally distributed with mean 0 and
covariance matrix Cov, formally P~N(0,Cov).

This algorithm can be applied only to the normal distribution since it
uses the above property. It holds only for the multivariate normal
distribution and for other distributions another algorithm must be used.
But I do not know if you are interested in it. Below one piece of code
which could help you to understand the three steps.

Cheeres,

Przem

      int rl;

      do
	{
	  rl++;
	  for (i = 0; i < PERMUTATION; i++)
	    gsl_vector_set (LAST_NU, i,
			    gsl_vector_get (LAST_E,
					    a[i]) * gsl_vector_get (LAST_E,
								    b[i]));

	  for (size_t ii = 0; ii < PERMUTATION; ii++)
	    {
	      gsl_vector_set (vector_temp_1, ii,
			      gsl_matrix_get (A, ii,
					      ii) * gsl_vector_get (LAST_NU,
								    ii));
	      gsl_vector_set (vector_temp_2, ii,
			      gsl_matrix_get (B, ii,
					      ii) * gsl_vector_get (LAST_h,
								    ii));
	      gsl_vector_set (h, ii,
			      gsl_vector_get (omega,
					      ii) +
			      gsl_vector_get (vector_temp_1,
					      ii) +
			      gsl_vector_get (vector_temp_2, ii));
	    }

	  index = 1;

	  for (long unsigned ii = 1; ii <= size; ii++)
	    for (long unsigned jj = ii; jj <= size; jj++, index++)
	      gsl_matrix_set (H, jj - 1, ii - 1,
			      gsl_vector_get (h, index - 1));

	  for (long unsigned ii = 0; ii < size; ii++)
	    for (long unsigned jj = ii; jj < size; jj++)
	      gsl_matrix_set (H, ii, jj, gsl_matrix_get (H, jj, ii));

	  ifail = gsl_linalg_cholesky_decomp (H);
	  assert (ifail == 0);

	  for (i = 0; i < DIMENSION; i++)
	    gsl_vector_set (XI, i, nacr ());

	  gsl_vector_memcpy (E, XI);

	  gsl_blas_dtrmv (CblasLower, CblasNoTrans, CblasNonUnit, H, E);
	  gsl_vector_memcpy (OBSERVED_E, E);

	  for (i = 0; i < PERMUTATION; i++)
	    gsl_vector_set (z, i,
			    (1.0 - lambda) * gsl_vector_get (z,
							     i) +
			    lambda * (gsl_vector_get (OBSERVED_E, a[i]) *
						  gsl_vector_get (OBSERVED_E, b[i])));

	  for (i = 0; i < PERMUTATION; i++)
	    gsl_vector_set (CENTRED, i,
			    gsl_vector_get (z, i) - gsl_vector_get (MU_TAU,
								    i));

	  if (rl <= MAXLENGTH)
	    gsl_blas_dgemv (CblasNoTrans, 1.0, SIGMA_INVERSE[rl - 1],
			    CENTRED, 0.0, vector_temp_1);
	  else
	    gsl_blas_dgemv (CblasNoTrans, 1.0, SIGMA_INVERSE[MAXLENGTH - 1],
			    CENTRED, 0.0, vector_temp_1);

	  gsl_blas_ddot (CENTRED, vector_temp_1, &mahalanobis);

	  if (mahalanobis < 0)
	    printf ("%f\n", mahalanobis);

	  /******ALL SHIFTS************/

	  gsl_vector_memcpy (LAST_h, h);
	  gsl_vector_memcpy (LAST_E, E);

      /***************************/

	}
      while (rl <= 999999);

PS. It simulates one milion of time series with condtional normally
distributed vectors Et.

Sorry for the mess in the code, but I have just used a part of one function.


^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: multivariate gaussian distribution (Code)
@ 2003-12-29 17:54 Emmanuel Benazera
  0 siblings, 0 replies; 10+ messages in thread
From: Emmanuel Benazera @ 2003-12-29 17:54 UTC (permalink / raw)
  To: sliwa; +Cc: gsl-discuss

Hi Przem,

> 1. Why did you (Emmanuel) use the Eigenvalue decomposition of the
> covariance matrix. This method is extremely inefficient with the gsl
> eigenvalues code. One shall (I do) use the Cholesky decomposition of the
> covariance matrix in order to compute the lower triangular matrix L of the
> form Cov = LL' This procedure is described in several books e.g.
> Harville's "Matrix Algebra from a Statistician's perspective".

As stated before, I read that eigenvalue decomposition was 'stablier' (...)
than Cholesky. However, I'll be interested implementing a faster 
algorithm. I don't have this book at hand. Could you describe this
procedure that uses the Cholesky decomposition ?

> 2. Why do you use this Box Mueller Algorithm? My point is: if you have a
> vector X of independent, normally distributed variables (Covariance matrix
> equals identity matrix) the product P = LX is always normally distributed
> with covariance matrix Cov, since (assuming E(X) = 0) E(PP') = E(LX(LX)')
> = E(LXX'L) = LIL' = Cov. Similar to this method one can simulate the whole
> family of elliptically countered distributions (like Bessel, generalized
> Lapalce, t-distributions).

Przem, I'm not sure I understand your point. The vector of independent variables 
needs to be generated at some point. Therefore the BM algorithm is used,
or the ratio method. Again, I'm not sure I got your point. I think the method
may be used for sampling from several other multi-dimensional distributions. Please
let me know your sources, I'll be happy to implement these algorithms.

Cheers,

Emmanuel

P.S.: Przem, please answer to me as well as to the list. Thanks.


^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: multivariate gaussian distribution (Code)
@ 2003-12-29 11:52 Przemyslaw Sliwa
  0 siblings, 0 replies; 10+ messages in thread
From: Przemyslaw Sliwa @ 2003-12-29 11:52 UTC (permalink / raw)
  To: gsl-discuss

Sure,

The Cholesky decomposition will fail for matrices which are not positive
definite (imdefinite). LU decomposition is applicable to all matrices.
So the general requirement (symmetric, positive definite) is not ensured
when using LU decomposition.


Cheers

Przem


^ permalink raw reply	[flat|nested] 10+ messages in thread
* multivariate gaussian distribution
@ 2003-12-20 11:10 Emmanuel Benazera
  2003-12-22 14:21 ` Brian Gough
  0 siblings, 1 reply; 10+ messages in thread
From: Emmanuel Benazera @ 2003-12-20 11:10 UTC (permalink / raw)
  To: gsl-discuss

Hi,

I've written a few lines of code that implement a multivariate
gaussian distribution. The function takes the eigenvalues/vectors
decomposition of the covariance matrix as input, and outputs a
random vector. In dimension n, it proceeds to n gsl_ran_gaussian
calls, plus a blas level 2 product. It fits in the GSL framework.

I guess it may be useful to other GSL users. Should it be added
to the library ? I can provide the piece of code + doc + theoretical 
reference.

Regards,

Emmanuel

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2003-12-30 11:31 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-12-29 10:41 multivariate gaussian distribution (Code) Przemyslaw Sliwa
  -- strict thread matches above, loose matches on Subject: below --
2003-12-30  9:47 Emmanuel Benazera
2003-12-30 10:55 ` Przemyslaw Sliwa
2003-12-30 11:31 ` Przemyslaw Sliwa
2003-12-29 23:02 Przemyslaw Sliwa
2003-12-29 17:54 Emmanuel Benazera
2003-12-29 11:52 Przemyslaw Sliwa
2003-12-20 11:10 multivariate gaussian distribution Emmanuel Benazera
2003-12-22 14:21 ` Brian Gough
2003-12-28 10:29   ` multivariate gaussian distribution (Code) Emmanuel Benazera
2003-12-29 11:41     ` Brian Gough
2003-12-29 16:47       ` Emmanuel Benazera

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).