GSL containers: was Re: [Help-gsl] Linear least squares, webpages and the next release

public inbox for gsl-discuss@sourceware.org
 help / color / mirror / Atom feed

* GSL containers: was Re: [Help-gsl] Linear least squares, webpages and the next release
       [not found]   ` <562E432D.9050002@colorado.edu>
@ 2015-11-01 21:00     ` John D Lamb
  2015-11-03 20:56       ` Gerard Jungman
  0 siblings, 1 reply; 7+ messages in thread
From: John D Lamb @ 2015-11-01 21:00 UTC (permalink / raw)
  To: Patrick Alken, gsl-discuss, Gerard Jungman

On 26/10/15 15:13, Patrick Alken wrote:
>> Yes. Iâ€™d be happy to look at redesign of the GSL containers. Whatâ€™s
>> needed?
>>
>
> There was a discussion on gsl-discuss some time back, see:
>
> https://www.sourceware.org/ml/gsl-discuss/2014-q2/
>
> Gerard may have already done some work on this, or have some ideas on a
> good starting point, so I suggest getting in touch with him too (cc'd).

Iâ€™d forgotten the detail of that discussion.

I can think of a way to change the gsl block/vector/matrix alloc 
functions to be more efficient. In essence it is a pool allocator.
It would keep a record, for each power k of two up to some limit n, of 
the number of blocks allocated for sizes 2^k to 2^{k+1} together with a 
capacity (also a power of two) for blocks of that range. These would 
form linked lists of allocated and unallocated blocks. Given a request 
for a new block, if an unallocated one was available, it would be 
allocated. Otherwise the capacity would be doubled. When a block is 
freed, memory is only deallocated if no more than a quarter of capacity 
is used or if no blocks are used.

This idea needs more input.

I canâ€™t think of a good way to create gsl_vectors and the like in stack 
memory. Of course, it is always possible to create a struct and 
initialise it.

I also donâ€™t know of a good, easy solution to the problem of constness.

-- 
John D Lamb

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: GSL containers: was Re: [Help-gsl] Linear least squares, webpages and the next release
  2015-11-01 21:00     ` GSL containers: was Re: [Help-gsl] Linear least squares, webpages and the next release John D Lamb
@ 2015-11-03 20:56       ` Gerard Jungman
  2015-11-04 20:35         ` Patrick Alken
  0 siblings, 1 reply; 7+ messages in thread
From: Gerard Jungman @ 2015-11-03 20:56 UTC (permalink / raw)
  To: gsl-discuss

The problem with the GSL containers is that the allocation
strategy is hard-wired into the implementation. Adding
another allocation strategy is not the answer, it just
adds to the problem.

Put another way, the real problem is that the GSL interfaces
are written in terms of gsl_vector and gsl_matrix objects,
which forces clients to carry along the allocation baggage
whenever they want to use a GSL interface.

The interfaces (every function that currently takes a
const gsl_vector * for example) should be re-written in
terms of a view type which is independent of allocation
strategies. A view type should be basically a thin
wrapper over a pointer and a length for vectors.
For multi-array objects it would be a thin wrapper
over a pointer with an addressing/indexing interface.

The current "view" types are brain-damaged because
they stand the design on its head. This is because
they were added as an afterthought to the existing
heap-allocated vector/matrix objects, at the time
when this was all being implemented. Badly done.

At the implementation level, the other main problem
is the way the types are "composed" using indirections.
This is awful and leads to the multiple allocations and
other heap-centric lossage that has been discussed
recently. Both views and full-fledged containers
should be composed in a value-centric and
stack-friendly way.

And preferably in a way that allows them to interoperate
and keep const-correctness. I believe this is all possible.

--
G. Jungman

On 11/01/2015 02:00 PM, John D Lamb wrote:
> On 26/10/15 15:13, Patrick Alken wrote:
>>> Yes. Iâ€™d be happy to look at redesign of the GSL containers. Whatâ€™s
>>> needed?
>>>
>>
>> There was a discussion on gsl-discuss some time back, see:
>>
>> https://www.sourceware.org/ml/gsl-discuss/2014-q2/
>>
>> Gerard may have already done some work on this, or have some ideas on a
>> good starting point, so I suggest getting in touch with him too (cc'd).
>
>
> Iâ€™d forgotten the detail of that discussion.
>
> I can think of a way to change the gsl block/vector/matrix alloc 
> functions to be more efficient. In essence it is a pool allocator.
> It would keep a record, for each power k of two up to some limit n, of 
> the number of blocks allocated for sizes 2^k to 2^{k+1} together with 
> a capacity (also a power of two) for blocks of that range. These would 
> form linked lists of allocated and unallocated blocks. Given a request 
> for a new block, if an unallocated one was available, it would be 
> allocated. Otherwise the capacity would be doubled. When a block is 
> freed, memory is only deallocated if no more than a quarter of 
> capacity is used or if no blocks are used.
>
> This idea needs more input.
>
> I canâ€™t think of a good way to create gsl_vectors and the like in 
> stack memory. Of course, it is always possible to create a struct and 
> initialise it.
>
> I also donâ€™t know of a good, easy solution to the problem of constness.
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: GSL containers: was Re: [Help-gsl] Linear least squares, webpages and the next release
  2015-11-03 20:56       ` Gerard Jungman
@ 2015-11-04 20:35         ` Patrick Alken
  2015-11-04 23:06           ` Gerard Jungman
  0 siblings, 1 reply; 7+ messages in thread
From: Patrick Alken @ 2015-11-04 20:35 UTC (permalink / raw)
  To: gsl-discuss

Hi Gerard,

   So if I understand correctly, restricting the discussion to 
gsl_vector for now, the issue is that in order for users to use the 
various GSL vector routines, they must first call gsl_vector_alloc, and 
then copy their data into the gsl_vector object, and then use the 
routine they want.

   The better approach would be to define their vector array however 
they wish, and then get a gsl_vector_view of that array (without needing 
any GSL allocation routines), and then directly use the routine they 
want. This is of course currently possible with gsl_vector_view_array, 
however it would be better to pass the view object directly to the GSL 
function, rather than having to pass &view.vector ?

So to summarize:
   1. All GSL routines which currently take gsl_vector* arguments, 
should be modified to accept gsl_vector_view* instead
   2. gsl_vector_view should be redesigned to be cleaner/simpler <- I'm 
still not completely clear on what this would look like
   3. Ditto for gsl_matrix

Is this a correct assessment of what you're saying? If so, in principle 
I completely agree it can be burdensome on the user to constantly have 
to pass &view.vector, &view.matrix all the time. It would be nice to 
simply pass 'view' instead.

Patrick

On 11/03/2015 01:56 PM, Gerard Jungman wrote:
>
> The problem with the GSL containers is that the allocation
> strategy is hard-wired into the implementation. Adding
> another allocation strategy is not the answer, it just
> adds to the problem.
>
> Put another way, the real problem is that the GSL interfaces
> are written in terms of gsl_vector and gsl_matrix objects,
> which forces clients to carry along the allocation baggage
> whenever they want to use a GSL interface.
>
> The interfaces (every function that currently takes a
> const gsl_vector * for example) should be re-written in
> terms of a view type which is independent of allocation
> strategies. A view type should be basically a thin
> wrapper over a pointer and a length for vectors.
> For multi-array objects it would be a thin wrapper
> over a pointer with an addressing/indexing interface.
>
> The current "view" types are brain-damaged because
> they stand the design on its head. This is because
> they were added as an afterthought to the existing
> heap-allocated vector/matrix objects, at the time
> when this was all being implemented. Badly done.
>
> At the implementation level, the other main problem
> is the way the types are "composed" using indirections.
> This is awful and leads to the multiple allocations and
> other heap-centric lossage that has been discussed
> recently. Both views and full-fledged containers
> should be composed in a value-centric and
> stack-friendly way.
>
> And preferably in a way that allows them to interoperate
> and keep const-correctness. I believe this is all possible.
>
> -- 
> G. Jungman
>
>
> On 11/01/2015 02:00 PM, John D Lamb wrote:
>> On 26/10/15 15:13, Patrick Alken wrote:
>>>> Yes. Iâ€™d be happy to look at redesign of the GSL containers. Whatâ€™s
>>>> needed?
>>>>
>>>
>>> There was a discussion on gsl-discuss some time back, see:
>>>
>>> https://www.sourceware.org/ml/gsl-discuss/2014-q2/
>>>
>>> Gerard may have already done some work on this, or have some ideas on a
>>> good starting point, so I suggest getting in touch with him too (cc'd).
>>
>>
>> Iâ€™d forgotten the detail of that discussion.
>>
>> I can think of a way to change the gsl block/vector/matrix alloc 
>> functions to be more efficient. In essence it is a pool allocator.
>> It would keep a record, for each power k of two up to some limit n, 
>> of the number of blocks allocated for sizes 2^k to 2^{k+1} together 
>> with a capacity (also a power of two) for blocks of that range. These 
>> would form linked lists of allocated and unallocated blocks. Given a 
>> request for a new block, if an unallocated one was available, it 
>> would be allocated. Otherwise the capacity would be doubled. When a 
>> block is freed, memory is only deallocated if no more than a quarter 
>> of capacity is used or if no blocks are used.
>>
>> This idea needs more input.
>>
>> I canâ€™t think of a good way to create gsl_vectors and the like in 
>> stack memory. Of course, it is always possible to create a struct and 
>> initialise it.
>>
>> I also donâ€™t know of a good, easy solution to the problem of constness.
>>
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: GSL containers: was Re: [Help-gsl] Linear least squares, webpages and the next release
  2015-11-04 20:35         ` Patrick Alken
@ 2015-11-04 23:06           ` Gerard Jungman
  2015-11-04 23:16             ` Patrick Alken
  0 siblings, 1 reply; 7+ messages in thread
From: Gerard Jungman @ 2015-11-04 23:06 UTC (permalink / raw)
  To: gsl-discuss

On 11/04/2015 01:35 PM, Patrick Alken wrote:
> Hi Gerard,
>
>   So if I understand correctly, restricting the discussion to 
> gsl_vector for now, the issue is that in order for users to use the 
> various GSL vector routines, they must first call gsl_vector_alloc, 
> and then copy their data into the gsl_vector object, and then use the 
> routine they want.
>
>   The better approach would be to define their vector array however 
> they wish, and then get a gsl_vector_view of that array (without 
> needing any GSL allocation routines), and then directly use the 
> routine they want. This is of course currently possible with 
> gsl_vector_view_array, however it would be better to pass the view 
> object directly to the GSL function, rather than having to pass 
> &view.vector ?

Yes. One of these is required.

But it's more than making the interfaces nicer. The current
design is upside-down. The view types are actually implemented
in terms of the alloced types. Strange, but true.

This leads to all sorts of bad design elements, like the
'owner' data element in the vectors. Why do vectors need
an 'owner' member? Apparently because they might actually
be part of a view and not own their data. This is just wrong.

Some users have reported on the mailing list that they get
around this mess by jamming a pointer and size into a
gsl_vector, setting 'owner' to zero, and getting on with
life. For these workarounds, the 'owner' flag is a lucky
circumstance. But this is clearly nuts. This need should
be filled by explicit support from the types and not
by hackery.

As part of a fix, the types should have much cleaner
and orthogonal relationships, and the interfaces
should not carry this implied dependency on the
semantics of the heap-allocated types, whether
people can get around it by hackery or not.

> So to summarize:
>   1. All GSL routines which currently take gsl_vector* arguments, 
> should be modified to accept gsl_vector_view* instead

But with a correctly designed thinner view type.

> 2. gsl_vector_view should be redesigned to be cleaner/simpler <- I'm 
> still not completely clear on what this would look like

Make the views the "fundamental" types and have the fat
types essentially inherit from (or export an interface for)
the non-const view types.

> 3. Ditto for gsl_matrix

And, while we are at it, multi-arrays too, which could be
provided with not much extra effort. Though, of course,
no current GSL interfaces depend on such things, since
they don't exist in the main code base yet.

Finally, the implementation of the fat container types
is also brain-damaged because of the "composition by
indirection" design. For example, gsl_vector and gsl_matrix
carry pointers to gsl_block. This leads to an unnecessary
chain of allocations, as has also been discussed recently
on the mailing list.

The compositions, to the extent they are needed, should
be more value-centric. Avoid these annoying indirections.

And the construction idiom for the fat containers should
be more value-centric. Returning gsl_vector indirectly
at construction is yet another unnecessary indirection
and heap allocation.

This obviously requires a complete re-design. But these
aspects of the work are not hard. It's not hard to get
these things right, it just requires some free time
and a willingness to break all the interfaces
that currently touch the containers.

--
G. Jungman

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: GSL containers: was Re: [Help-gsl] Linear least squares, webpages and the next release
  2015-11-04 23:06           ` Gerard Jungman
@ 2015-11-04 23:16             ` Patrick Alken
       [not found]               ` <563AB455.1020706@lanl.gov>
  0 siblings, 1 reply; 7+ messages in thread
From: Patrick Alken @ 2015-11-04 23:16 UTC (permalink / raw)
  To: gsl-discuss

I think this is a good idea. Would you be able/willing to design a new 
gsl_vector_view structure? Even something very preliminary which we 
could iterate a bit to get it to a nice mature status. I, or someone 
else, could then run with it and begin the tedious work of converting 
all the existing interfaces.

Patrick

On 11/04/2015 04:06 PM, Gerard Jungman wrote:
> On 11/04/2015 01:35 PM, Patrick Alken wrote:
>> Hi Gerard,
>>
>>   So if I understand correctly, restricting the discussion to 
>> gsl_vector for now, the issue is that in order for users to use the 
>> various GSL vector routines, they must first call gsl_vector_alloc, 
>> and then copy their data into the gsl_vector object, and then use the 
>> routine they want.
>>
>>   The better approach would be to define their vector array however 
>> they wish, and then get a gsl_vector_view of that array (without 
>> needing any GSL allocation routines), and then directly use the 
>> routine they want. This is of course currently possible with 
>> gsl_vector_view_array, however it would be better to pass the view 
>> object directly to the GSL function, rather than having to pass 
>> &view.vector ?
>
> Yes. One of these is required.
>
> But it's more than making the interfaces nicer. The current
> design is upside-down. The view types are actually implemented
> in terms of the alloced types. Strange, but true.
>
> This leads to all sorts of bad design elements, like the
> 'owner' data element in the vectors. Why do vectors need
> an 'owner' member? Apparently because they might actually
> be part of a view and not own their data. This is just wrong.
>
> Some users have reported on the mailing list that they get
> around this mess by jamming a pointer and size into a
> gsl_vector, setting 'owner' to zero, and getting on with
> life. For these workarounds, the 'owner' flag is a lucky
> circumstance. But this is clearly nuts. This need should
> be filled by explicit support from the types and not
> by hackery.
>
> As part of a fix, the types should have much cleaner
> and orthogonal relationships, and the interfaces
> should not carry this implied dependency on the
> semantics of the heap-allocated types, whether
> people can get around it by hackery or not.
>
>> So to summarize:
>>   1. All GSL routines which currently take gsl_vector* arguments, 
>> should be modified to accept gsl_vector_view* instead
>
> But with a correctly designed thinner view type.
>
>> 2. gsl_vector_view should be redesigned to be cleaner/simpler <- I'm 
>> still not completely clear on what this would look like
>
> Make the views the "fundamental" types and have the fat
> types essentially inherit from (or export an interface for)
> the non-const view types.
>
>> 3. Ditto for gsl_matrix
>
> And, while we are at it, multi-arrays too, which could be
> provided with not much extra effort. Though, of course,
> no current GSL interfaces depend on such things, since
> they don't exist in the main code base yet.
>
>
> Finally, the implementation of the fat container types
> is also brain-damaged because of the "composition by
> indirection" design. For example, gsl_vector and gsl_matrix
> carry pointers to gsl_block. This leads to an unnecessary
> chain of allocations, as has also been discussed recently
> on the mailing list.
>
> The compositions, to the extent they are needed, should
> be more value-centric. Avoid these annoying indirections.
>
> And the construction idiom for the fat containers should
> be more value-centric. Returning gsl_vector indirectly
> at construction is yet another unnecessary indirection
> and heap allocation.
>
> This obviously requires a complete re-design. But these
> aspects of the work are not hard. It's not hard to get
> these things right, it just requires some free time
> and a willingness to break all the interfaces
> that currently touch the containers.
>
> -- 
> G. Jungman
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

[parent not found: <563AB455.1020706@lanl.gov>]

* Re: GSL containers: was Re: [Help-gsl] Linear least squares, webpages and the next release
       [not found]               ` <563AB455.1020706@lanl.gov>
@ 2015-11-05  4:41                 ` Patrick Alken
  2015-11-07 14:09                   ` John D Lamb
  0 siblings, 1 reply; 7+ messages in thread
From: Patrick Alken @ 2015-11-05  4:41 UTC (permalink / raw)
  To: gsl-discuss

I agree that stuff is very confusing. I also wonder how much of it is
needed. In my work I've only ever used the double and complex versions
of gsl_vector/gsl_matrix. Most of the routines in GSL only work with
double vectors, with a few extended to complex double (linalg,eigen for
example).

Do we really need gsl_vector's for int/short/uint/long/etc?

On 11/04/2015 06:43 PM, Gerard Jungman wrote:
> On 11/04/2015 04:16 PM, Patrick Alken wrote:
>> I think this is a good idea. Would you be able/willing to design a
>> new gsl_vector_view structure? Even something very preliminary which
>> we could iterate a bit to get it to a nice mature status. I, or
>> someone else, could then run with it and begin the tedious work of
>> converting all the existing interfaces.
>>
>> Patrick
>
> The main practical problem is all the preprocessor "template"
> nonsense. I cannot think straight with all that junk hanging
> over everything. And I assume it would be impossible to have
> coherent discussions about it as well.
>
> I could make a reference implementation for a fixed type
> like 'double' and leave all the generalization to the very
> last step, once the design is entirely fixed. Of course, I
> would have to be careful about hidden assumptions which
> break for other types.
>
> If anybody can think of a better way to do the generalizations,
> besides the current preprocessor implementation, that would be
> great. I have nothing against the preprocessor, but the current
> framework makes it nearly impossible to actually read the code.
>
> The current impl is like traits classes, but where the traits
> are all defined in a very rudimentary way in "templates_on.h".
> It is impossible to tell what traits are available, which ones
> are required, and how the separate traits are associated with
> the basic types at the point where the macros are invoked.
>
> -- 
> G. Jungman
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: GSL containers: was Re: [Help-gsl] Linear least squares, webpages and the next release
  2015-11-05  4:41                 ` Patrick Alken
@ 2015-11-07 14:09                   ` John D Lamb
  0 siblings, 0 replies; 7+ messages in thread
From: John D Lamb @ 2015-11-07 14:09 UTC (permalink / raw)
  To: gsl-discuss

On 05/11/15 04:41, Patrick Alken wrote:
> I agree that stuff is very confusing. I also wonder how much of it is
> needed. In my work I've only ever used the double and complex versions
> of gsl_vector/gsl_matrix. Most of the routines in GSL only work with
> double vectors, with a few extended to complex double (linalg,eigen for
> example).
>
> Do we really need gsl_vector's for int/short/uint/long/etc?

I have used gsl_vector_long on occasion.

> On 11/04/2015 06:43 PM, Gerard Jungman wrote:
>> On 11/04/2015 04:16 PM, Patrick Alken wrote:
>>> I think this is a good idea. Would you be able/willing to design a
>>> new gsl_vector_view structure? Even something very preliminary which
>>> we could iterate a bit to get it to a nice mature status. I, or
>>> someone else, could then run with it and begin the tedious work of
>>> converting all the existing interfaces.
>>>
>>> Patrick
>>
>> If anybody can think of a better way to do the generalizations,
>> besides the current preprocessor implementation, that would be
>> great. I have nothing against the preprocessor, but the current
>> framework makes it nearly impossible to actually read the code.

Ruby? Python? In the absence of C++ templates I usually use one or 
other. Python is likely more readable to C programmers.

>> The current impl is like traits classes, but where the traits
>> are all defined in a very rudimentary way in "templates_on.h".
>> It is impossible to tell what traits are available, which ones
>> are required, and how the separate traits are associated with
>> the basic types at the point where the macros are invoked.

Iâ€™m happy to help if someone can suggest a basic workable design. I can 
certainly change the alloc functions to use a pool allocator internally. 
And, since I mostly use C++, I can see the logic of initialising vector 
structs. But I canâ€™t think of a good design that wonâ€™t break a lot of 
user code. One of the features of the GSL is that you canâ€™t alloc a 
vector of size 0 without writing a custom allocator. It would be easy to 
change gsl_vector_alloc. But I canâ€™t even think how to do that and 
guarantee not to break user code.

-- 
John D Lamb

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2015-11-07 14:09 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <56293649.8010009@colorado.edu>
     [not found] ` <562BA530.7090508@johndlamb.net>
     [not found]   ` <562E432D.9050002@colorado.edu>
2015-11-01 21:00     ` GSL containers: was Re: [Help-gsl] Linear least squares, webpages and the next release John D Lamb
2015-11-03 20:56       ` Gerard Jungman
2015-11-04 20:35         ` Patrick Alken
2015-11-04 23:06           ` Gerard Jungman
2015-11-04 23:16             ` Patrick Alken
     [not found]               ` <563AB455.1020706@lanl.gov>
2015-11-05  4:41                 ` Patrick Alken
2015-11-07 14:09                   ` John D Lamb

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).