public inbox for gsl-discuss@sourceware.org
* switched sparse format to binary trees
@ 2014-04-28  1:32 Patrick Alken
  2014-04-28 22:35 ` Gerard Jungman
  0 siblings, 1 reply; 16+ messages in thread
From: Patrick Alken @ 2014-04-28  1:32 UTC (permalink / raw)
  To: gsl-discuss

I received a bug report from someone who found that the
gsl_spmatrix_set() function did not properly detect duplicate elements
(i.e. it could assign more than one value to the same (i,j) matrix element).

This was related to the linear array scheme I had originally
implemented, which made it very inefficient to find duplicates (i.e. a
linear search was required each time).

So I've changed the storage scheme for the triplet representation to a
binary tree (specifically an AVL balanced binary tree). I took the
tree-manipulation code from the GNU libavl library.

This actually makes a huge speed improvement in duplicate detection and
element retrieval for large sparse matrices. There may however be a
performance hit since a new tree node must be allocated each time an
element is added to the matrix. But I believe the tradeoff is worth it.

Anyone using gsl_spmatrix in their code will need to recompile since the
binary format of that struct has changed.

Patrick

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: switched sparse format to binary trees
  2014-04-28  1:32 switched sparse format to binary trees Patrick Alken
@ 2014-04-28 22:35 ` Gerard Jungman
  2014-04-29  1:13   ` Patrick Alken
  0 siblings, 1 reply; 16+ messages in thread
From: Gerard Jungman @ 2014-04-28 22:35 UTC (permalink / raw)
  To: gsl-discuss


On 04/27/2014 07:32 PM, Patrick Alken wrote:

> This actually makes a huge speed improvement in duplicate detection and
> element retrieval for large sparse matrices. There may however be a
> performance hit since a new tree node must be allocated each time an
> element is added to the matrix. But I believe the tradeoff is worth it.


For what it's worth, the right thing to do is probably
to expose the avl allocation strategy to the GSL client
in some way. It could be as simple as an extra argument
to gsl_spmatrix_alloc(), which could be a pointer to a
generic allocator strategy struct (similar or identical
to 'struct libavl_allocator'). This is the most general
solution, which requires that clients who care be able
to write their own allocator.

As an aid, GSL could also supply a replacement allocator
strategy object, maybe based on a pool allocator with
a variable amortization size. This would be in addition
to the default (and now implicit by design) malloc/free
strategy.


This opens a generic issue regarding GSL containers,
which is one of the many reasons I avoid them.
That is, the allocation strategy is not exposed. You can
trace the allocations down to the gsl_block level, where
there is a hard-coded malloc(). Very sad.

This could all be fixed by adding gsl container allocation
functions (parallel to the current xxx_alloc functions)
which take an extra argument (something like pointer to
a struct of function pointers) for the allocation
strategy. The extra argument could be passed down
to the gsl_block level where it is invoked.

On the other hand, all this needs to be studied a bit.
My understanding is that modern linux implementations
will use mmap() for large enough allocations. This
will change the tradeoffs. I have no idea what
other platforms do. It's not just your old
malloc anymore...

--
G. Jungman


* Re: switched sparse format to binary trees
  2014-04-28 22:35 ` Gerard Jungman
@ 2014-04-29  1:13   ` Patrick Alken
  2014-04-29  8:40     ` Frank Reininghaus
  0 siblings, 1 reply; 16+ messages in thread
From: Patrick Alken @ 2014-04-29  1:13 UTC (permalink / raw)
  To: gsl-discuss

On 04/28/2014 04:35 PM, Gerard Jungman wrote:
> 
> On 04/27/2014 07:32 PM, Patrick Alken wrote:
> 
>> This actually makes a huge speed improvement in duplicate detection and
>> element retrieval for large sparse matrices. There may however be a
>> performance hit since a new tree node must be allocated each time an
>> element is added to the matrix. But I believe the tradeoff is worth it.
> 
> 
> For what it's worth, the right thing to do is probably
> to expose the avl allocation strategy to the GSL client
> in some way. It could be as simple as an extra argument
> to gsl_spmatrix_alloc(), which could be a pointer to a
> generic allocator strategy struct (similar or identical
> to 'struct libavl_allocator'). This is the most general
> solution, which requires that clients who care be able
> to write their own allocator.

Interesting idea...I'll think a bit more on it. Unfortunately I don't
think there is any easy way to preallocate a binary tree. Even if the
user knows ahead of time how many elements they want to add to the
sparse matrix, it's not so easy to preallocate the whole tree, since
after each insertion the tree rebalances itself to minimize its height.

I did do some tests on my own machine, inserting 2.5M elements into a
sparse matrix and profiling the code. I found that the AVL allocation of
a new node was insignificant compared to searching the tree each time
to check for duplicate elements.

I also found a study online where people benchmarked various sparse
matrix storage strategies. The other main contender is to use a hash
table, which is what MATLAB does for its sparse matrices. For that
approach you need a strategy for dealing with hash collisions,
which can happen frequently for huge matrices.

Anyway, the study found that AVL trees were much superior to hash tables
in terms of speed. The study is here:

http://link.springer.com/chapter/10.1007%2F978-3-540-25944-2_135

In any case, it's a good idea to give the user more control over the
allocation, so I will think some more about how to do this elegantly.

Patrick


* Re: switched sparse format to binary trees
  2014-04-29  1:13   ` Patrick Alken
@ 2014-04-29  8:40     ` Frank Reininghaus
  2014-04-29 19:15       ` Gerard Jungman
  0 siblings, 1 reply; 16+ messages in thread
From: Frank Reininghaus @ 2014-04-29  8:40 UTC (permalink / raw)
  To: gsl-discuss

Hi,

2014-04-29 3:13 GMT+02:00 Patrick Alken:
> On 04/28/2014 04:35 PM, Gerard Jungman wrote:
>>
>> On 04/27/2014 07:32 PM, Patrick Alken wrote:
>>
>>> This actually makes a huge speed improvement in duplicate detection and
>>> element retrieval for large sparse matrices. There may however be a
>>> performance hit since a new tree node must be allocated each time an
>>> element is added to the matrix. But I believe the tradeoff is worth it.
>>
>>
>> For what it's worth, the right thing to do is probably
>> to expose the avl allocation strategy to the GSL client
>> in some way. It could be as simple as an extra argument
>> to gsl_spmatrix_alloc(), which could be a pointer to a
>> generic allocator strategy struct (similar or identical
>> to 'struct libavl_allocator'). This is the most general
>> solution, which requires that clients who care be able
>> to write their own allocator.
>
> Interesting idea...I'll think a bit more on it. Unfortunately I don't
> think there is any easy way to preallocate a binary tree. Even if the
> user knows ahead of time how many elements they want to add to the
> sparse matrix, its not so easy to preallocate the whole tree, since
> after each insertion, the tree rebalances itself to minimize its height.

you can create an array of many tree nodes in a single allocation
though. The allocator struct could then store a pointer to this array,
the total size of the array, and the index of the next node that is
not in use yet.

When a new node is inserted into the tree, one would just take a
pointer to the next unused node instead of calling malloc, increase
the index in the allocator struct, and perform a new allocation only
if all nodes are in use. The advantage is that the amortized cost of
this approach is much lower than doing a malloc for each node, and it
will also save memory because malloc adds a certain amount of overhead
to each individual allocation for its internal bookkeeping. And saving
memory for each individual node also means that the entire tree will
become more cache-friendly and thus possibly more performant.

If individual nodes could be deleted at some point, then the allocator
would need some bookkeeping of its own to track reusable nodes. But if
the only node deletion that can take place is that all nodes are
deleted together, then this would probably be quite straightforward to
implement.

Best regards,
Frank


* Re: switched sparse format to binary trees
  2014-04-29  8:40     ` Frank Reininghaus
@ 2014-04-29 19:15       ` Gerard Jungman
  2014-04-29 19:52         ` Patrick Alken
  0 siblings, 1 reply; 16+ messages in thread
From: Gerard Jungman @ 2014-04-29 19:15 UTC (permalink / raw)
  To: gsl-discuss

On 04/29/2014 02:39 AM, Frank Reininghaus wrote:

 > you can create an array of many tree nodes in a single allocation
 > though. The allocator struct could then store a pointer to this array,
 > the total size of the array, and the index of the next node that is
 > not in use yet.

Right. This is what I meant by a pool allocator. Just to clarify, if
GSL provides an allocator mechanism, it should be more general than
this, but it is a good idea to provide an implementation of something
like this that a client can just invoke if they want. It also gives
clients an example implementation that they can use to create their
own variations when necessary.

This is why the avl library provides such a mechanism themselves.
They know that some clients really need it.

 > And saving memory for each individual node also means
 > that the entire tree will become more cache-friendly
 > and thus possibly more performant.

Even more important than the memory savings is the control that the
client gains over cache coherency and other memory layout issues.
The main lossage with malloc is that there is only one. So allocations
of different kinds of memory can get interleaved, and it may become
impossible to control your layout and caching. If the allocations
are not exposed, then the client is wedged.

 > If individual nodes could be deleted at some point, then the
 > allocator would need some bookkeeping of its own to track
 > reusable nodes. But if the only node deletion that can take
 > place is that all nodes are deleted together, then this would
 > probably be quite straightforward to implement.

In practice, you are probably right, the nodes would all
be deleted together. So we do not have to worry about it
too much.

For the sake of completeness though, it is worth pointing
out at least one simple way to handle the general case.
An allocator can implement a lazy-deletion strategy,
where the pool is allowed to fragment, until the
client requests a repacking. You can also trigger
a repacking when a certain fragmentation level is
reached, although this kind of approach (garbage collection)
is somewhat dangerous if the client loses control of when
it happens. In any case, it is not hard to implement this
sort of thing.

Anyway, the main point is that once a general mechanism
exists, the client can create any arbitrarily complicated
scheme for themselves. They get to take control of
allocations, if they need to.


On 04/28/2014 07:13 PM, Patrick Alken wrote:

 > Anyway the study found that AVL trees were much superior to hash
 > tables in terms of speed. The study is here:
 >
 > http://link.springer.com/chapter/10.1007%2F978-3-540-25944-2_135

It's a shame they don't explain why this is happening.
Those large factors are kind of nuts.

--
G. Jungman


* Re: switched sparse format to binary trees
  2014-04-29 19:15       ` Gerard Jungman
@ 2014-04-29 19:52         ` Patrick Alken
  2014-04-29 22:07           ` Gerard Jungman
  0 siblings, 1 reply; 16+ messages in thread
From: Patrick Alken @ 2014-04-29 19:52 UTC (permalink / raw)
  To: gsl-discuss

Thanks Gerard and Frank,

   I've implemented the block allocation strategy so there are no more 
malloc/free calls during the matrix assembly process. This was made much 
easier by not deleting individual nodes. In principle, individual nodes 
could be deleted if the user sets that matrix element to 0, since the 
point of sparse matrix data structures is to keep track of only non-zero 
elements.

However, I think this is not common, and the simplicity of the memory 
management scheme outweighs the benefit of handling this rare case.

For now, the spmatrix routines internally handle all of the block 
allocation, node array pointers, etc. I don't think it's really necessary 
to expose this to the general user, since it's already about as efficient 
as it will probably get, and most users (like myself) simply want a nice, 
easy, intuitive interface for building a matrix.

> It's a shame they don't explain why this is happening.
> Those large factors are kind of nuts.

Yes, those times for the hash method do seem unreasonably large. I 
haven't been able to find many other papers on this subject for comparison.

On 04/29/2014 01:14 PM, Gerard Jungman wrote:
> On 04/29/2014 02:39 AM, Frank Reininghaus wrote:
>
>   > you can create an array of many tree nodes in a single allocation
>   > though. The allocator struct could then store a pointer to this array,
>   > the total size of the array, and the index of the next node that is
>   > not in use yet.
>
> Right. This is what I meant by a pool allocator. Just to clarify, if
> GSL provides an allocator mechanism, it should be more general than
> this, but it is a good idea to provide an implementation of something
> like this that a client can just invoke if they want. It also gives
> clients an example implementation that they can use to create their
> own variations when necessary.
>
> This is why the avl library provides such a mechanism themselves.
> They know that some clients really need it.
>
>   > And saving memory for each individual node also means
>   > that the entire tree will become more cache-friendly
>   > and thus possibly more performant.
>
> Even more important than the memory savings is the control that the
> client gains over cache coherency and other memory layout issues.
> The main lossage with malloc is that there is only one. So allocations
> of different kinds of memory can get interleaved, and it may become
> impossible to control your layout and caching. If the allocations
> are not exposed, then the client is wedged.
>
>   > If individual nodes could be deleted at some point, then the
>   > allocator needed some bookkeeping of reusable nodes of its own,
>   > but the only node deletion that can take place is that all nodes
>   > are deleted together, then this would probably be quite
>   > straightforward to implement.
>
> In practice, you are probably right, the nodes would all
> be deleted together. So we do not have to worry about it
> too much.
>
> For the sake of completeness though, it is worth pointing
> out at least one simple way to handle the general case.
> An allocator can implement a lazy-deletion strategy,
> where the pool is allowed to fragment, until the
> client requests a repacking. You can also trigger
> a repacking when a certain fragmentation level is
> reached, although this kind of approach (garbage collection)
> is somewhat dangerous if the client loses control of when
> it happens. In any case, it is not hard to implement this
> sort of thing.
>
> Anyway, the main point is that once a general mechanism
> exists, the client can create any arbitrarily complicated
> scheme for themselves. They get to take control of
> allocations, if they need to.
>
>
> On 04/28/2014 07:13 PM, Patrick Alken wrote:
>
>   > Anyway the study found that AVL trees were much superior to hash
>   > tables in terms of speed. The study is here:
>   >
>   > http://link.springer.com/chapter/10.1007%2F978-3-540-25944-2_135
>
> It's a shame they don't explain why this is happening.
> Those large factors are kind of nuts.
>
> --
> G. Jungman
>


* Re: switched sparse format to binary trees
  2014-04-29 19:52         ` Patrick Alken
@ 2014-04-29 22:07           ` Gerard Jungman
  2014-05-01 18:46             ` Patrick Alken
  0 siblings, 1 reply; 16+ messages in thread
From: Gerard Jungman @ 2014-04-29 22:07 UTC (permalink / raw)
  To: gsl-discuss

On 04/29/2014 01:52 PM, Patrick Alken wrote:
> However, I think this is not common, and the simplicity of the memory 
> management scheme outweighs the benefit of handling this rare case I 
> think.

Yeah. If a matrix has too many zeros, the client can always repack
it by copying it to a new sparse matrix. This is probably a
reasonable usage scenario.


> For now, the spmatrix routines internally handle all of the block 
> allocation, node array pointers, etc. I don't think its really 
> necessary to expose this to the general user, since its already about 
> as efficient as it will probably get, and most users (like myself) 
> simply want a nice easy intuitive interface to build a matrix.

It's not really about efficiency in the allocations, although that might
affect some people. It's more about control of memory layout.

My guess is that you won't see any difference in the efficiency
of simple test code. You can run the numbers and see what the
differences are; it would be interesting to see in any case.

The real problem is that a client may need to adjust their layout
to satisfy cache requirements or for other reasons.

In principle, all that's needed is an extra gsl_spmatrix_alloc_xxx()
function which takes one extra argument, which is the allocation strategy.
People who need it can use it. Those who don't care can use the function
with the simpler prototype. But it should be part of a general design
for controlling allocation in all the GSL containers.

Even just for spmatrix, there is more than just the issue of managing
the tree node lifetimes. There are issues of data locality in accessing
the tree. For example, is it better to store the numerical data
close to the node data or is it ok to be farther away?
In general, there is no one good answer to this question.
It depends on the traversal pattern, the operations performed,
and the platform. I have seen some strange and surprising answers
to questions like this on different hardware. I don't understand
the spmatrix design well enough to tell what choices you have
made, so I'll trust that you have thought about these things.

In the end, only the client knows the answer to the layout question.
That's why they need some flexibility in the design.


I have argued in the past that the GSL containers are brain-damaged
in several ways. The absence of client control over allocation and
layout is one of them. As always, I take at least half the
responsibility for this situation, since I was one of two
people who was there "on that day". I would just like to
see these questions discussed in light of the mistakes
of the past... and of the intervening 16 years of experience.

--
G. Jungman


* Re: switched sparse format to binary trees
  2014-04-29 22:07           ` Gerard Jungman
@ 2014-05-01 18:46             ` Patrick Alken
  2014-05-01 23:21               ` Gerard Jungman
  0 siblings, 1 reply; 16+ messages in thread
From: Patrick Alken @ 2014-05-01 18:46 UTC (permalink / raw)
  To: gsl-discuss

Gerard,

   I was thinking a little more about your suggestion and just wanted to 
clarify something. As I understand it, what you are proposing is to 
define a new "allocator" structure such as:

typedef struct
{
   void *(*malloc)(size_t size, void *param);
   void (*free)(void *block, void *param);
} gsl_allocator;

And then for every gsl_xxx_alloc function, we should make a new function 
gsl_xxx_alloc_m() which would be:

gsl_xxx_alloc_m(normal_args, gsl_allocator *allocator, void 
*allocator_param)

and all malloc/free calls would then be replaced by calls to 
allocator->malloc and allocator->free, passing along allocator_param 
each time; the param can hold whatever bookkeeping structure is needed 
by the user's own implementation of malloc/free.

In this scenario, all gsl data structures would need an additional 
pointer to store the gsl_allocator desired by the user. The default 
gsl_xxx_alloc() functions would use a stub allocator whose functions 
are just malloc()/free() wrappers.

For this implementation, every gsl module would use the same 
gsl_allocator framework. I wanted to confirm with you that this is what 
you intended? Or are you thinking of module-specific allocators since 
you were focusing a lot on the spmatrix design?

It seems like a reasonable idea to me, and if we are going to do it, it 
should be done before the 2.0 release, since every gsl_workspace will 
need to be modified. I'm not certain how much demand there will be for 
something like this, since I suspect most users won't want to write 
their own malloc/free implementations, but who knows :)

Patrick

On 04/29/2014 04:07 PM, Gerard Jungman wrote:
> On 04/29/2014 01:52 PM, Patrick Alken wrote:
>> However, I think this is not common, and the simplicity of the memory
>> management scheme outweighs the benefit of handling this rare case I
>> think.
> Yeah. If a matrix has too many zeros, the client can always repack
> it by copying it to a new sparse matrix. This is probably a
> reasonable usage scenario.
>
>
>> For now, the spmatrix routines internally handle all of the block
>> allocation, node array pointers, etc. I don't think its really
>> necessary to expose this to the general user, since its already about
>> as efficient as it will probably get, and most users (like myself)
>> simply want a nice easy intuitive interface to build a matrix.
> It's not really about efficiency in the allocations, although that might
> affect some people. It's more about control of memory layout.
>
> My guess is that you won't see any difference in the efficiency
> of simple test code. You can run the numbers and see what the
> differences are; it would be interesting to see in any case.
>
> The real problem is that a client may need to adjust their layout
> to satisfy cache requirements or for other reasons.
>
> In principle, all that's needed is an extra gsl_spmatrix_alloc_xxx()
> function which takes one extra argument, which is the allocation strategy.
> People who need it can use it. Those who don't care can use the function
> with the simpler prototype. But it should be part of a general design
> for controlling allocation in all the GSL containers.
>
> Even just for spmatrix, there is more than just the issue of managing
> the tree node lifetimes. There are issues of data locality in accessing
> the tree. For example, is it better to store the numerical data
> close to the node data or is it ok to be farther away?
> In general, there is no one good answer to this question.
> It depends on the traversal pattern, the operations performed,
> and the platform. I have seen some strange and surprising answers
> to questions like this on different hardware. I don't understand
> the spmatrix design well-enough to tell what choices you have
> made, so I'll trust that you have thought about these things.
>
> In the end, only the client knows the answer to the layout question.
> That's why they need some flexibility in the design.
>
>
> I have argued in the past that the GSL containers are brain-damaged
> in several ways. The absence of client control over allocation and
> layout is one of them. As always, I take at least half the
> responsibility for this situation, since I was one of two
> people who was there "on that day". I would just like to
> see these questions discussed in light of the mistakes
> of the past... and of the intervening 16 years of experience.
>
> --
> G. Jungman
>


* Re: switched sparse format to binary trees
  2014-05-01 18:46             ` Patrick Alken
@ 2014-05-01 23:21               ` Gerard Jungman
  2014-05-02  1:48                 ` GSL containers (switched sparse format to binary trees) Gerard Jungman
  0 siblings, 1 reply; 16+ messages in thread
From: Gerard Jungman @ 2014-05-01 23:21 UTC (permalink / raw)
  To: gsl-discuss

On 05/01/2014 12:46 PM, Patrick Alken wrote:
> I was thinking a little more about your suggestion and just wanted to 
> clarify something. As I understand it, what you are proposing is to 
> define a new "allocator" structure such as:
>
> typedef struct
> {
>   void *(*malloc)(size_t size, void *param);
>   void (*free)(void *block, void *param);
> } gsl_allocator;

That's the idea. Just like the avl allocator mechanism.

> And then for every gsl_xxx_alloc function, we should make a new 
> function gsl_xxx_alloc_m() which would be:
>
> gsl_xxx_alloc_m(normal_args, gsl_allocator *allocator, void 
> *allocator_param)

We need to think about the allocator parameter. There is
probably a design issue here. It is hard to ensure that
the client uses the correct parameter if there are
several "pools" or "heaps" in existence. So the
state data represented by the parameter is
somewhat dangerous.

There may even be deeper semantic issues in keeping track
of state data like that, even though it seems attractive
to encapsulate the state in some explicit way. It has to
do with the fact that an an allocator resource is sort
of a singleton. Even if you have different ones floating
around, each one is probably doing a unique job, just
like the unique malloc/free heap.

Notice that the avl library allocator abstraction
avoids any notion of allocator state.

A client can always hide their state data behind a functional
interface. So they could easily create a "heap" for vectors,
and a "heap" for matrices, or whatever, each with their
own malloc. So it may be best to avoid any explicit
notion of state and let the client do what they want
behind the functions.

> and all malloc/free calls would then be replaced by calls to 
> allocator->malloc and allocator->free

Yup.

> In this scenario, all gsl data structures would need an additional 
> pointer to store the gsl_allocator desired by the user.

This seems correct, in order to match the allocation call with the
correct free call.

But there may be more to it as well. Basically, any function
which currently does a malloc or free needs to be thought
about, probably requiring an additional interface also
taking an allocator parameter. It is clearly possible
and probably not too difficult to do the operations
in the right order, swap the allocator pointers,
and move on, so that the container is always in
a consistent state with regard to its allocator.

> The default gsl_xxx_alloc() functions would use a stub allocator which 
> are just malloc()/free() wrappers.

Yup.

> For this implementation, every gsl module would use the same 
> gsl_allocator framework. I wanted to confirm with you that this is 
> what you intended? Or are you thinking of module-specific allocators 
> since you were focusing a lot on the spmatrix design?

Most of the containers have a homogeneous structure, basically
using one layer of heap allocation, for their data blocks.
When we get to something like spmatrix, there are really
at least two distinct allocation needs, for the data
and for the tree nodes. So it may well be correct to
design a more semantically rich allocation framework
for those containers.

It may be as simple as providing two allocators,
one for the nodes, one for the data. But maybe
more is needed, to allow control of the relative
locality of the nodes and data, as I wondered
earlier.

Other places in GSL may need some custom tailoring
as well. Basically, I will just grep for all the malloc
and free calls currently, and estimate what needs to
be done. Anybody interested in this should do the
same. Then we can discuss.

> It seems like a reasonable idea to me, and if we are going to do it, 
> it should be done before the 2.0 release, since every gsl_workspace 
> will need to be modified.

Exactly.

> I'm not certain how much demand there will be for something like this, 
> since I suspect most users won't want to write their own malloc/free 
> implementations, but who knows :)

It is hard to say. But let's think about what it might entail
and weigh the costs and benefits when we know more.

--
G. Jungman


* GSL containers (switched sparse format to binary trees)
  2014-05-01 23:21               ` Gerard Jungman
@ 2014-05-02  1:48                 ` Gerard Jungman
  2014-05-02  1:56                   ` Gerard Jungman
  0 siblings, 1 reply; 16+ messages in thread
From: Gerard Jungman @ 2014-05-02  1:48 UTC (permalink / raw)
  To: gsl-discuss


Here is an example of the sort of thing that
troubles me.

http://stackoverflow.com/questions/15988836/better-memory-management-for-gsl-gsl-vector

It is a reasonable discussion of exactly the sort of
use case that keeps coming up. A couple independent
issues are raised here.

First is the question of allocation/deallocation overhead,
which I think is the smaller of the issues, although it is
important for people who need a lot of small vectors.
This is the standard example justifying use of a
custom pool allocator.

Someone suggested using a single large gsl_block, effectively
using that as a pool. This is not stupid, although it only
works for some kinds of codes, where the allocation and
hand-off of memory can be done in a central place, like
at application startup time or something. At least that
is how I think of that usage case. It appears that the
OP was in one of the more de-centralized situations,
where this would not work nicely.

Ignore the associated comment about the "price you pay"
when using a language without built-in garbage collection.
Garbage collection (or lack thereof) is not the problem.
But that is a separate discussion. See Stroustrup's comments
about the correct use of the heap, in various talks lately.

Second is the issue of value-semantics for containers. This
is very tricky, possibly the deepest problem of all. Notice
how the OP created a custom allocation function which returns
a gsl_vector by value, not by pointer. That may or may not be
the right choice for him, but it is interesting that he chose
to do it that way.

Value-semantics is probably the most important issue for
container design, probably for several reasons. At the
top of my list is the desire to get things onto the
stack whenever possible. The fact that gsl_vector
(the struct itself, not the data) is off the heap
is an annoyance.

And it's the kind of annoyance that may be no problem
for some people and a show-stopper for others. It's
hard to tell in advance, since it is difficult
to predict how modern hardware will treat your
code. Stack-based computation helps, and linear
traversal of data structures helps a lot, because
the pre-fetcher will bring in the pages you need
before you actually need them, as long as it can
predict which ones are needed next.

The heap tends to gum everything up on modern architectures,
unless you are careful. And taking that kind of care may
require taking over the alloc/free control from the
standard library.


So, the specific point raised by this value-semantics
business, and its effect on heap confusion, is this:
what to do about the fact that the GSL container
structs are always off the heap?


The final post in this short thread may hit the nail on
the head. What he suggests is what I almost always do:
use C++ containers and C++ code as much as possible,
and bang the pointers into GSL functions when necessary.
A bit sad.



* Re: GSL containers (switched sparse format to binary trees)
  2014-05-02  1:48                 ` GSL containers (switched sparse format to binary trees) Gerard Jungman
@ 2014-05-02  1:56                   ` Gerard Jungman
  2014-05-02  8:29                     ` Rhys Ulerich
  2014-05-02 15:02                     ` Patrick Alken
  0 siblings, 2 replies; 16+ messages in thread
From: Gerard Jungman @ 2014-05-02  1:56 UTC (permalink / raw)
  To: gsl-discuss


As before, I apologize for the flood of messages.
But I figure better a few shorter messages than
one long message with multiple sub-topics.

Here is another discussion that I found interesting:

   http://comments.gmane.org/gmane.comp.lib.gsl.general/4385

As usual, the details could be debated. But I found the
following quote to be very telling:

  "Rhys, I did try to use views. They do not help because the
   gsl routines allocate vectors internally and there is not
   much that I can do about it... except for maybe hacking
   gsl and changing gsl_vector_alloc myself."

People seem to be led to this conclusion from multiple
directions.

Another choice quote:

   "The other issue is: the implementation of gsl_vector just
    seems inefficient to me. Looking at the code, it seems like
    a single vector requires 3 calls to malloc and free (one for
    the data, one for the gsl_block, and one for the gsl_vector
    itself). The manual states that the block is there for
    "consistency", and I can see how memory management becomes
    easier with it. But it seems to be a case of generality at
    the expense of performance. Also, the stride and the owner
    flag are part of the gsl_vector object to make it work with
    gsl_views, but then people who never need views pay the
    performance price anyway."



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: GSL containers (switched sparse format to binary trees)
  2014-05-02  1:56                   ` Gerard Jungman
@ 2014-05-02  8:29                     ` Rhys Ulerich
  2014-05-02  8:33                       ` Rhys Ulerich
  2014-05-02 15:02                     ` Patrick Alken
  1 sibling, 1 reply; 16+ messages in thread
From: Rhys Ulerich @ 2014-05-02  8:29 UTC (permalink / raw)
  To: Gerard Jungman; +Cc: gsl-discuss

> People seem to be led to this conclusion from multiple
> directions.

Careful, or you'll talk yourself into something like the
Apache Portable Runtime's ubiquitous pool argument, apr_pool_t.

- Rhys

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: GSL containers (switched sparse format to binary trees)
  2014-05-02  8:29                     ` Rhys Ulerich
@ 2014-05-02  8:33                       ` Rhys Ulerich
  2014-05-02  8:52                         ` Rhys Ulerich
  0 siblings, 1 reply; 16+ messages in thread
From: Rhys Ulerich @ 2014-05-02  8:33 UTC (permalink / raw)
  To: Gerard Jungman; +Cc: gsl-discuss

(This time with the list copied).

> apr_pool_t

Kidding aside, something like an apr_pool_t or handing about a
talloc-like context is where I think you and Patrick are headed with
this discussion.

- Rhys

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: GSL containers (switched sparse format to binary trees)
  2014-05-02  8:33                       ` Rhys Ulerich
@ 2014-05-02  8:52                         ` Rhys Ulerich
  0 siblings, 0 replies; 16+ messages in thread
From: Rhys Ulerich @ 2014-05-02  8:52 UTC (permalink / raw)
  To: Gerard Jungman; +Cc: gsl-discuss

> a talloc-like context

If you've not read about it before, talloc [1] is quite slick with a
great, succinct tutorial [2].

[1] https://talloc.samba.org/talloc/doc/html/index.html
[2] https://talloc.samba.org/talloc/doc/html/libtalloc__tutorial.html

- Rhys

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: GSL containers (switched sparse format to binary trees)
  2014-05-02  1:56                   ` Gerard Jungman
  2014-05-02  8:29                     ` Rhys Ulerich
@ 2014-05-02 15:02                     ` Patrick Alken
  2014-05-13 21:59                       ` Gerard Jungman
  1 sibling, 1 reply; 16+ messages in thread
From: Patrick Alken @ 2014-05-02 15:02 UTC (permalink / raw)
  To: gsl-discuss

This particular thread seems to be more about dynamic allocations of 
gsl_vector inside the multimin routines. I wasn't aware that was 
happening, but in theory it should be possible to rewrite things so 
there are no dynamic allocations.

The first thread you linked seems to have a legitimate reason to 
reduce the malloc calls for gsl_vector, since it seems he's allocating 
a large number of small vectors.

Anyway, Gerard, you have a good grasp of the various issues and what 
needs to be done about it - would you be interested in taking the lead 
on this and designing a nice API for user-defined alloc functions?

Patrick

On 05/01/2014 07:55 PM, Gerard Jungman wrote:
> As before, I apologize for the flood of messages.
> But I figure better a few shorter messages than
> one long message with multiple sub-topics.
>
> Here is another discussion that I found interesting:
>
>     http://comments.gmane.org/gmane.comp.lib.gsl.general/4385
>
> As usual, the details could be debated. But I found the
> following quote to be very telling:
>
>    "Rhys, I did try to use views. They do not help because the
>     gsl routines allocate vectors internally and there is not
>     much that I can do about it... except for maybe hacking
>     gsl and changing gsl_vector_alloc myself."
>
> People seem to be led to this conclusion from multiple
> directions.
>
> Another choice quote:
>
>     "The other issue is: the implementation of gsl_vector just
>      seems inefficient to me. Looking at the code, it seems like
>      a single vector requires 3 calls to malloc and free (one for
>      the data, one for the gsl_block, and one for the gsl_vector
>      itself). The manual states that the block is there for
>      "consistency", and I can see how memory management becomes
>      easier with it. But it seems to be a case of generality at
>      the expense of performance. Also, the stride and the owner
>      flag are part of the gsl_vector object to make it work with
>      gsl_views, but then people who never need views pay the
>      performance price anyway."
>
>
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: GSL containers (switched sparse format to binary trees)
  2014-05-02 15:02                     ` Patrick Alken
@ 2014-05-13 21:59                       ` Gerard Jungman
  0 siblings, 0 replies; 16+ messages in thread
From: Gerard Jungman @ 2014-05-13 21:59 UTC (permalink / raw)
  To: gsl-discuss

On 05/02/2014 09:02 AM, Patrick Alken wrote:
> Anyway, Gerard, you have a good grasp of the various issues and what 
> needs to be done about it - would you be interested in taking the lead 
> on this and designing a nice API for user-defined alloc functions?

Sorry for the late reply; I had a busy week.

I will start looking into this. I also want to do some research
on the various problems people may be having with the containers,
notably the various "performance" tests where GSL containers and/or
linear algebra come out badly. It's been a while since I've seen
these sorts of comparisons, but they are out there somewhere.

--
G. Jungman

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2014-05-13 21:59 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-04-28  1:32 switched sparse format to binary trees Patrick Alken
2014-04-28 22:35 ` Gerard Jungman
2014-04-29  1:13   ` Patrick Alken
2014-04-29  8:40     ` Frank Reininghaus
2014-04-29 19:15       ` Gerard Jungman
2014-04-29 19:52         ` Patrick Alken
2014-04-29 22:07           ` Gerard Jungman
2014-05-01 18:46             ` Patrick Alken
2014-05-01 23:21               ` Gerard Jungman
2014-05-02  1:48                 ` GSL containers (switched sparse format to binary trees) Gerard Jungman
2014-05-02  1:56                   ` Gerard Jungman
2014-05-02  8:29                     ` Rhys Ulerich
2014-05-02  8:33                       ` Rhys Ulerich
2014-05-02  8:52                         ` Rhys Ulerich
2014-05-02 15:02                     ` Patrick Alken
2014-05-13 21:59                       ` Gerard Jungman
