* Help: working with large gsl_vector
@ 2004-12-11 7:44 Toan T Nguyen
2004-12-13 18:41 ` Linas Vepstas
2004-12-13 18:51 ` Fabrice Rossi
0 siblings, 2 replies; 7+ messages in thread
From: Toan T Nguyen @ 2004-12-11 7:44 UTC (permalink / raw)
To: gsl-discuss
Hi,
I'm using the multidimensional minimization procedure in GSL and have a problem
with large vectors. The dimensionality of my vectors is 300000, which means my
index variable is of long integer type. gsl_vector, however, works only with
the size_t type. I could not even call gsl_vector_alloc(300000). It says:
gsl: init_source.c:29: ERROR: vector length n must be positive integer
Default GSL error handler invoked.
Can somebody here give me some advice? I'm quite new to GSL, and I searched the
mailing list archives but could not find an answer.
For clarity, my system is an Athlon64 running SuSE x86_64, gcc 3.3.3.
Thanks a lot in advance,
Toan
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Help: working with large gsl_vector
2004-12-11 7:44 Help: working with large gsl_vector Toan T Nguyen
@ 2004-12-13 18:41 ` Linas Vepstas
2004-12-13 18:51 ` Fabrice Rossi
1 sibling, 0 replies; 7+ messages in thread
From: Linas Vepstas @ 2004-12-13 18:41 UTC (permalink / raw)
To: Toan T Nguyen; +Cc: gsl-discuss
On Fri, Dec 10, 2004 at 11:51:19PM -0800, Toan T Nguyen was heard to remark:
>
> Hi,
>
> I'm using the multidimensional minimization procedure in GSL and have a problem
> with large vectors. The dimensionality of my vectors is 300000, which means my
> index variable is of long integer type. gsl_vector, however, works only with
> the size_t type. I could not even call gsl_vector_alloc(300000). It says:
double-check. size_t is normally a 32-bit int when compiling for a
32-bit libc, and 64 bits when compiling for a 64-bit libc. You can check
this with a
printf ("size is %zu\n", sizeof (size_t));
which will print the number of bytes (note %zu rather than %d: passing a
size_t to %d is itself undefined behaviour).
> gsl: init_source.c:29: ERROR: vector length n must be positive integer
> Default GSL error handler invoked.
which is not to say there isn't a gsl bug, but you probably should do
more due diligence before offering the diagnosis.
--linas
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Help: working with large gsl_vector
2004-12-11 7:44 Help: working with large gsl_vector Toan T Nguyen
2004-12-13 18:41 ` Linas Vepstas
@ 2004-12-13 18:51 ` Fabrice Rossi
2004-12-14 23:03 ` Toan T Nguyen
1 sibling, 1 reply; 7+ messages in thread
From: Fabrice Rossi @ 2004-12-13 18:51 UTC (permalink / raw)
To: gsl-discuss; +Cc: Toan T Nguyen
Toan T Nguyen wrote:
> I'm using the multidimensional minimization procedure in GSL and have a problem
> with large vectors. The dimensionality of my vectors is 300000, which means my
> index variable is of long integer type.
Hum. I authored the initial version of the multidimensional minimization
functions, which have since been improved by others, but I don't think they
have been improved in such a way that optimization in dimension 300000
is feasible. Even for a quadratic function and using a conjugate
gradient algorithm, it will take 3e5 iterations for the algorithm to
complete. As each iteration implies a line search for the minimum that
will use at least around 10 evaluations of your function, you are looking
at 3e6 function evaluations. Unless you have a very special problem, I
would guess that a function evaluation takes at least 3e5 operations to
complete, which implies 9e11 operations (between 2^39 and 2^40). That's
a lot. And we are talking about a quadratic form, remember?
Perhaps you could tell us a little bit more about your application ?
Fabrice
PS: Support Vector Machine algorithms have shown that it is indeed
possible to do quadratic optimization under constraints in high
dimensional spaces (like 10000 dimensions), but they use _very_ complex
methods.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Help: working with large gsl_vector
2004-12-13 18:51 ` Fabrice Rossi
@ 2004-12-14 23:03 ` Toan T Nguyen
2004-12-15 11:12 ` Fabrice Rossi
0 siblings, 1 reply; 7+ messages in thread
From: Toan T Nguyen @ 2004-12-14 23:03 UTC (permalink / raw)
To: Fabrice Rossi; +Cc: gsl-discuss, help-gsl
Hi, thank you for the analysis. Apparently my analytical calculation is able
to guess the minimum reasonably well, so I need only 20000 iterations to find
the minimum.
However, the strange thing is that if I take the final minimized vector and run
it through the conjugate gradient routine again, the energy (my function to
minimize) keeps going down for a few thousand more iterations (although it
changes only the 4th significant digit). Do you know what's wrong?
I followed the example in the GSL documentation and set my convergence criterion as
status = gsl_multimin_test_gradient (s->gradient, 1e-20);
The initial setup of the vector is
T = gsl_multimin_fdfminimizer_conjugate_pr;
s = gsl_multimin_fdfminimizer_alloc (T, 3*nparts);
gsl_multimin_fdfminimizer_set (s, &energy, x, 0.001, 1e-12);
For clarity, I used dimensionless units, so ALL the nearest-neighbor distances
and the coupling constants are of the order of unity. nparts is the number of
particles, which is about 100000.
Could you give me some advice? Thanks a lot in advance.
Toan
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Help: working with large gsl_vector
2004-12-14 23:03 ` Toan T Nguyen
@ 2004-12-15 11:12 ` Fabrice Rossi
2004-12-15 12:11 ` Robert G. Brown
0 siblings, 1 reply; 7+ messages in thread
From: Fabrice Rossi @ 2004-12-15 11:12 UTC (permalink / raw)
To: gsl-discuss
Toan T Nguyen wrote:
> Hi, thank you for the analysis. Apparently my analytical calculation is able
> to guess the minimum reasonably well, so I need only 20000 iterations to find
> the minimum.
Nice!
> However, the strange thing is that if I take the final minimized vector and
> run it through the conjugate gradient routine again, the energy (my function
> to minimize) keeps going down for a few thousand more iterations (although it
> changes only the 4th significant digit). Do you know what's wrong?
It might be related to the restarting of the algorithm. A conjugate
gradient method computes its descent direction from the gradient at the
current point and from the previously used direction. The first descent
direction is (minus) the gradient alone. When you restart the algorithm,
it will use the bare gradient rather than the conjugate direction it
should have used.
> I followed the example in the GSL documentation and set my convergence criterion as
>
> status = gsl_multimin_test_gradient (s->gradient, 1e-20);
1e-20 is very, very small, maybe too close to machine precision to give
reliable results. But I'm far from being an expert in this domain. What
looks strange is that if the algorithm stops because of this criterion, I
don't know why it can continue again afterwards. It would mean that the
norm of the gradient goes up again after having reached 1e-20 once. This
might be related to numerical errors.
> The initial setup of the vector is
>
> T = gsl_multimin_fdfminimizer_conjugate_pr;
> s = gsl_multimin_fdfminimizer_alloc (T, 3*nparts);
> gsl_multimin_fdfminimizer_set (s, &energy, x, 0.001, 1e-12);
1e-12 is again quite small.
> For clarity, I used dimensionless units, so ALL the nearest-neighbor distances
> and the coupling constants are of the order of unity. nparts is the number of
> particles, which is about 100000.
>
> Could you give me some advice? Thanks a lot in advance.
>
> Toan
I think that you are basically putting the code under some stress,
because of the size of your problem and of the very tight tolerances you
are using. To my knowledge it has not been tested under such constraints,
and it's therefore difficult to give sensible advice.
Fabrice
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Help: working with large gsl_vector
2004-12-15 11:12 ` Fabrice Rossi
@ 2004-12-15 12:11 ` Robert G. Brown
2004-12-15 17:31 ` Linas Vepstas
0 siblings, 1 reply; 7+ messages in thread
From: Robert G. Brown @ 2004-12-15 12:11 UTC (permalink / raw)
To: Fabrice Rossi; +Cc: gsl-discuss
On Wed, 15 Dec 2004, Fabrice Rossi wrote:
> Toan T Nguyen wrote:
> > Hi, thank you for the analysis. Apparently my analytical calculation is able
> > to guess the minimum reasonably well, so I need only 20000 iterations to find
> > the minimum.
>
> Nice!
>
> > However, the strange thing is that if I take the final minimized vector and
> > run it through the conjugate gradient routine again, the energy (my function
> > to minimize) keeps going down for a few thousand more iterations (although
> > it changes only the 4th significant digit). Do you know what's wrong?
>
> It might be related to the restarting of the algorithm. A conjugate
> gradient method computes its descent direction from the gradient at the
> current point and from the previously used direction. The first descent
> direction is (minus) the gradient alone. When you restart the algorithm,
> it will use the bare gradient rather than the conjugate direction it
> should have used.
A second possibility is that if you save the final minimized vector at
some finite precision and reload it, you are beginning a gradient descent
from a new point in the neighborhood of the old point. This is, in
fact, how deterministic chaos was discovered, in a manner of speaking.
If the problem is one with high complexity, this may be enough to push
it to a different local minimum in the particular well where the first
local minimum was found.
Note that I say local minimum. Unless you have a really sound
theoretical basis for assuming a single, smooth global minimum in any
problem with high dimensionality, you should presume as a default that
the topology of the minimization surface is neither monotonic nor
smooth, and that multiple local minima exist. In fact, local minima may
even be "dense" (or nearly so) on the underlying space so that nearly
ANY minimum you find by merely going downhill is a local minimum, and
may not even be a "good" minimum.
Think rugged landscape. Gradient descent methods basically go downhill
from wherever they are started. From many/most starting points, this
will take you down to the nearest footprint in the sand, to a nearby
hole in the ground, to the bottom of a pond. It is relatively unlikely
to take you to the bottom of the deepest ocean trench. And if you move
the starting point over just a little bit, you're as likely to go down
into a DIFFERENT footprint in the sand as back into the same one -- it's
still a local minimum, just a different one.
There are various easy ways to test for this sort of behavior. The
easiest one is to just run the code lots of times with completely random
starting data and see if you go down to the same minimum every time.
Also look at the distribution of minima that you get -- are they all
about the same quality, or is there one that is occasionally much better
than the others?
If the latter is the case, you'll have to resort to much more powerful
and complex methods to get a "good" minimum on your space. Most likely
genetic algorithms and/or simulated annealing.
rgb
--
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Help: working with large gsl_vector
2004-12-15 12:11 ` Robert G. Brown
@ 2004-12-15 17:31 ` Linas Vepstas
0 siblings, 0 replies; 7+ messages in thread
From: Linas Vepstas @ 2004-12-15 17:31 UTC (permalink / raw)
To: Robert G. Brown; +Cc: Fabrice Rossi, gsl-discuss
On Wed, Dec 15, 2004 at 08:07:38AM -0500, Robert G. Brown was heard to remark:
>
> Note that I say local minimum. Unless you have a really sound
> theoretical basis for assuming a single, smooth global minimum in any
> problem with high dimensionality, you should presume as a default that
> the topology of the minimization surface is neither monotonic nor
> smooth, and that multiple local minima exist. In fact, local minima may
> even be "dense" (or nearly so) on the underlying space so that nearly
> ANY minimum you find by merely going downhill is a local minimum, and
> may not even be a "good" minimum.
i.e. the bottom may be fractal. With that many free variables,
it wouldn't be a surprise.
> There are various easy ways to test for this sort of behavior. The
> easiest one is to just run the code lots of times with completely random
> starting data and see if you go down to the same minimum every time.
> Also look at the distribution of minima that you get -- are they all
> about the same quality, or is there one that is occasionally much better
> than the others.
>
> If the latter is the case, you'll have to resort to much more powerful
> and complex methods to get a "good" minimum on your space. Most likely
> genetic algorithms and/or simulated annealing.
And if the bottom of the thing is fractal, asking for the minimum might
be "the wrong question".
Just because the problem seems to involve analytic relations doesn't
mean it's not fractal. Weierstrass in 1872 considered the sum
Sum_{n=0}^\infty a^n sin(pi b^n x)
which is insanely fractal. Running the sum from n=0 to "merely" n=2000
and looking for the minimum would lead to numerical insanity.
--linas
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2004-12-15 17:31 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-12-11 7:44 Help: working with large gsl_vector Toan T Nguyen
2004-12-13 18:41 ` Linas Vepstas
2004-12-13 18:51 ` Fabrice Rossi
2004-12-14 23:03 ` Toan T Nguyen
2004-12-15 11:12 ` Fabrice Rossi
2004-12-15 12:11 ` Robert G. Brown
2004-12-15 17:31 ` Linas Vepstas