questions about block/vector/matrix

public inbox for gsl-discuss@sourceware.org

* questions about block/vector/matrix
From: Gerard Jungman @ 2009-10-05 10:12 UTC
  To: gsl-discuss

** Specific questions/comments about blocks, vectors, matrices.

 Q1. Why do the vector and matrix structs have a block member at all?
     At first I thought it was using some block functions in the
     implementation (like for i/o), but it is not. The block
     member seems pointless. See the following question.

     Furthermore, from the standpoint of object design, the vector
     concept does not inherit from the block concept, since a vector
     is strided and a block is not. They are incompatible concepts.
     If blocks have no semantic reason to be in vectors, and if
     they are not used in an integral way in the implementation,
     then they should be removed.

 Q2. What is the meaning of functions gsl_vector_alloc_from_block(),
     gsl_vector_alloc_from_vector(), etc.?

     gsl_vector_alloc_from_block() seems to be constructing a view of
     an underlying data segment. As such, it conflicts with the
     semantics of views. There should be only one view semantic.
     Also, it generates a needless coupling between the block
     component and the vector component.

     gsl_vector_alloc_from_vector() is similar, though it is more
     appropriate to say it is constructing a slice of a vector.
     Again, the semantics are confused.

     The suffix '_alloc' is confusing, since it is not clear what is
     being alloced. Obviously the struct itself is being alloced, since
     it is returned by pointer. But what about the data?

     The same questions and comments apply to the functions
     gsl_matrix_alloc_from_block(), gsl_matrix_alloc_from_matrix(),
     gsl_vector_alloc_row_from_matrix(), and
     gsl_vector_alloc_col_from_matrix().


 Q3. Why do we have functions like gsl_matrix_row(),
     gsl_matrix_diagonal(), etc., and yet no support for general
     slicing operations? These functions should be simple wrappers
     over a more general functionality.


 Q4. Why do views export a different interface?

     There are many operations on vectors that I cannot apply to vector
     views, but which would make perfect sense for views. These include
     obvious things like min(), max(), scale(), etc. They also include
     the i/o functions. Writing and reading from and to view objects is
     a perfectly well-defined notion.

     The view design lacks any notion of genericity, lacks coherence,
     and is essentially useless as it stands.


 Q5. How many people use gsl_vector_uchar.h? Or gsl_vector_short.h?
     Or gsl_matrix_ushort.h? Etc. Just curious.


** Recommendations for block/vector/matrix

  R1. Remove the view types.
  R2. Redesign vector and matrix so that they subsume the view concept.
  R3. Decouple vector and matrix from the block component.

  These changes can be done in a source-code compatible way, by
  creating a type equivalence between views and the underlying
  type and retaining the current view function interfaces
  (with new implementations and marked as deprecated).

  R4. Make explicit the vector and matrix semantics. This is more
      a documentation/design goal than a code goal. See my
      tentative design summary elsewhere.


** Recommendations for a multi_array functionality

  R5. Create a parallel container component (tentatively gsl_marray_...),
      which supports arbitrary rank indexed views of data segments.
      Support a general model for slicing.

  R6. Create simple wrappers which allow viewing multi-arrays of the
      appropriate shape as vectors or matrices and vice-versa. There are
      semantic constraints which prevent a multi-array from "being"
      a vector or matrix, so there will always be distinct types.

      Technical issue: Figure out how to do this without creating a
      cross-coupling dependency between the components. i.e. make
      sure the design is levelized.


** Recommended further steps

  R7. Implement generic slicing for vectors, using the model from
      the new gsl_marray. This is not difficult since slicing
      vectors has limited semantics.

  R8. Implement generic slicing of matrices, using the model from
      the new gsl_marray. A slice of a matrix may not necessarily
      "be" a matrix, even if it looks like a sub-matrix. Semantic
      constraints will need to be expressed and enforced.
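
As a rough illustration of why slicing a vector has limited semantics (R7): a slice of a strided view is just another strided view, with an offset data pointer and a multiplied stride. The sketch below uses a hypothetical vec_view type and hypothetical vec_slice()/vec_get() helpers. It is not the gsl_vector layout and not the proposed gsl_marray interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical strided view; an assumption for illustration only. */
typedef struct {
  double *data;    /* first element of the view */
  size_t  size;    /* number of elements */
  size_t  stride;  /* distance between consecutive elements, in doubles */
} vec_view;

/* Select elements start, start+step, ..., (count of them) from v.
   The result refers to the same memory as v; nothing is copied. */
static vec_view vec_slice(vec_view v, size_t start, size_t step, size_t count)
{
  vec_view s;
  assert(step > 0);
  assert(count > 0);
  assert(start + (count - 1) * step < v.size);
  s.data   = v.data + start * v.stride;
  s.size   = count;
  s.stride = v.stride * step;
  return s;
}

/* Bounds-checked element access through a view. */
static double vec_get(vec_view v, size_t i)
{
  assert(i < v.size);
  return v.data[i * v.stride];
}
```

Because the result has the same type as the input, a slice of a slice needs no extra machinery: slicing a view with stride 3 by step 1 gives another view with stride 3.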


** Status

  I have begun most of these steps myself, in an attempt to sort out
  the requirements and work through different designs.

  R1, R2, R3: I have begun this process, on a private branch.
              Currently waiting for some answers/discussion
              regarding the questions above. 

  R4: Done as part of tentative design.

  R5: Mostly done.

  R6, R7, R8: Not started. Awaiting R1-R3.



* Re: questions about block/vector/matrix
From: Brian Gough @ 2009-10-16 20:12 UTC
  To: Gerard Jungman; +Cc: gsl-discuss

At Sun, 04 Oct 2009 20:03:16 -0600,
Gerard Jungman wrote:
> 
> ** Specific questions/comments about blocks, vectors, matrices.
> 
>  Q1. Why do the vector and matrix structs have a block member at
>  all?
>      At first I thought it was using some block functions in the
>      implementation (like for i/o), but it is not. The block member
>      seems pointless. See the following question.

As mentioned, it was to have a struct which is the same as C++ valarray.

>      Furthermore, from the standpoint of object design, the vector
>      concept does not inherit from the block concept, since a vector
>      is strided and a block is not. They are incompatible concepts.
>      If blocks have no semantic reason to be in vectors, and if they
>      are not used in an integral way in the implementation, then they
>      should be removed.

I agree, there is no inheritance relationship between them.  However
there is a relationship between the vector and the block if vectors
are allowed to grow/shrink -- the size of the block places a limit on
the maximum size of the vector.  This is not something we currently
use, but that is what I had in mind at the time.
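
One way to read the grow/shrink idea is that the block acts as a capacity bound for the vector. The sketch below is only an assumption about what such a resize could look like; blk, vec, vec_capacity() and vec_resize() are hypothetical names, and nothing like this is described as existing in GSL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a block and a strided vector on top of it. */
typedef struct { size_t size; double *data; } blk;

typedef struct {
  size_t  size;    /* current number of elements */
  size_t  stride;  /* step between elements, in doubles */
  double *data;    /* first element, inside block->data */
  blk    *block;   /* storage that bounds how far the vector may grow */
} vec;

/* Largest element count that still fits inside the block. */
static size_t vec_capacity(const vec *v)
{
  size_t offset = (size_t)(v->data - v->block->data);
  size_t room;
  assert(v->stride > 0);
  assert(offset <= v->block->size);
  room = v->block->size - offset;
  return room == 0 ? 0 : (room - 1) / v->stride + 1;
}

/* Grow or shrink in place; returns 0 on success, -1 if n does not fit. */
static int vec_resize(vec *v, size_t n)
{
  if (n > vec_capacity(v)) return -1;
  v->size = n;
  return 0;
}
```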

>  Q2. What is the meaning of functions gsl_vector_alloc_from_block(),
>      gsl_vector_alloc_from_vector(), etc.?
>      gsl_vector_alloc_from_block() seems to be constructing a view
>      of an underlying data segment. As such, it conflicts with the
>      semantics of views. There should be only one view semantic.

These functions create a new gsl_vector object (on the heap)
referencing the same memory as an existing vector or block.  Note that
the gsl_vector object is only the metadata (size, stride, etc) and
does not include the block of memory, just the pointer to it.

There's a potential confusion over terminology: a gsl_vector is
actually a "view" of a gsl_block (which it usually owns, but not
always) and a gsl_vector_view is also a "view" - the difference is
that gsl_vector is on the heap and gsl_vector_view is on the stack
(and never owns the block).  As discussed earlier, the version on the
stack has a different type because of limitations of const (we have to
define both gsl_vector_const_view and gsl_vector_view when it is on
the stack whereas the heap version, as a pointer, can be used as const
gsl_vector * or gsl_vector *)

If there was a way to put a gsl_vector object on the stack directly
without causing const violations we would not need to wrap it in a
gsl_vector_view struct. 
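
The split between metadata and data described above can be sketched as follows. The types blk and vec are stand-ins modelled on the fields mentioned in this thread (size, stride, a pointer into the block); the names, the owner flag and the helper functions are assumptions for illustration, not the library's actual definitions.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for a block: a contiguous, non-strided data segment. */
typedef struct { size_t size; double *data; } blk;

/* Stand-in for a vector: metadata only, pointing into a block. */
typedef struct {
  size_t  size;    /* number of elements seen through the vector */
  size_t  stride;  /* step between elements, in doubles */
  double *data;    /* first element, somewhere inside block->data */
  blk    *block;   /* underlying storage */
  int     owner;   /* nonzero if freeing the vector frees the block */
} vec;

static blk *blk_alloc(size_t n)
{
  blk *b = malloc(sizeof *b);
  if (b == NULL) return NULL;
  b->data = calloc(n ? n : 1, sizeof *b->data);
  if (b->data == NULL) { free(b); return NULL; }
  b->size = n;
  return b;
}

static void blk_free(blk *b)
{
  if (b == NULL) return;
  free(b->data);
  free(b);
}

/* Allocates only the metadata; the elements stay in the caller's block. */
static vec *vec_from_block(blk *b, size_t offset, size_t n, size_t stride)
{
  vec *v;
  assert(stride > 0);
  assert(n == 0 || offset + (n - 1) * stride < b->size);
  v = malloc(sizeof *v);
  if (v == NULL) return NULL;
  v->size = n;
  v->stride = stride;
  v->data = b->data + offset;
  v->block = b;
  v->owner = 0;
  return v;
}

/* Allocates metadata and a fresh block, which the vector then owns. */
static vec *vec_alloc(size_t n)
{
  blk *b = blk_alloc(n);
  vec *v;
  if (b == NULL) return NULL;
  v = vec_from_block(b, 0, n, 1);
  if (v == NULL) { blk_free(b); return NULL; }
  v->owner = 1;
  return v;
}

/* Frees the metadata, and the block only if this vector owns it. */
static void vec_free(vec *v)
{
  if (v == NULL) return;
  if (v->owner) blk_free(v->block);
  free(v);
}
```

In this sketch both constructors allocate the struct on the heap, but only vec_alloc() allocates element storage, which is the ambiguity the '_alloc' suffix leaves open.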

>      gsl_vector_alloc_from_vector() is similar, though it is more
>      appropriate to say it is constructing a slice of a vector.
>      Again, the semantics are confused.
> 
>      The suffix '_alloc' is confusing, since it is not clear what is
>      being alloced. Obviously the struct itself is being alloced,
>      since it is returned by pointer. But what about the data?

Yes, it is a bit confusing.  The constraint was that people should be
able to write

     gsl_vector * v = gsl_vector_alloc(n)

and have it just work, whereas it would be more logical to use

    gsl_block * b = gsl_block_alloc(N)
    gsl_vector * v = gsl_vector_view (b, stride, n)  [or something like that]

which is cumbersome for the typical usage.

>  Q3. Why do we have functions like gsl_matrix_row(),
> gsl_matrix_diagonal(), etc, and yet no support for general slicing
> operations? These functions should be simple wrappers over a more
> general functionality.

What would be the interface for a general slicing operation for the
existing matrix type?  I have no objection to adding it, the functions
we have were added on an as-needed basis.

>  Q4. Why do views export a different interface?
> 
>      There are many operations on vectors that I cannot apply to
>      vector views, but which would make perfect sense for
>      views. These include obvious things like min(), max(), scale(),
>      etc. They also include the i/o functions. Writing and reading
>      from and to view objects is a perfectly well-defined notion.

Maybe I have misunderstood this question but all vector and matrix
operations are supported on views, by calling them as &view.vector or
&view.matrix.  A gsl_vector_view is a struct which contains one thing
- a gsl_vector, so all operations on it are definitely supported,
including i/o.  Same for matrices.  Without this, the views would
indeed be useless.
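
The calling pattern described here can be sketched with stand-in types, so that it compiles without the library. In GSL itself the corresponding calls would be along the lines of gsl_vector_subvector() followed by gsl_vector_scale(&s.vector, ...) and gsl_vector_max(&s.vector); the vec and vec_view types and the vec_* functions below are hypothetical simplifications.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins: a strided vector and a view that wraps one. */
typedef struct { size_t size; size_t stride; double *data; } vec;
typedef struct { vec vector; } vec_view;

/* Returns a view (by value, so it can live on the stack) of
   n elements of v starting at offset. No data is copied. */
static vec_view vec_subvector(vec *v, size_t offset, size_t n)
{
  vec_view s;
  assert(n > 0 && offset + n <= v->size);
  s.vector.size = n;
  s.vector.stride = v->stride;
  s.vector.data = v->data + offset * v->stride;
  return s;
}

/* Ordinary vector operations; they know nothing about views. */
static void vec_scale(vec *v, double x)
{
  size_t i;
  for (i = 0; i < v->size; i++) v->data[i * v->stride] *= x;
}

static double vec_max(const vec *v)
{
  size_t i;
  double m;
  assert(v->size > 0);
  m = v->data[0];
  for (i = 1; i < v->size; i++)
    if (v->data[i * v->stride] > m) m = v->data[i * v->stride];
  return m;
}
```

A caller applies any vector operation to a view by passing &s.vector, for example vec_scale(&s.vector, 2.0), and can take a view of a view with vec_subvector(&s.vector, ...).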


* Re: questions about block/vector/matrix
From: Gerard Jungman @ 2009-10-19 21:37 UTC
  To: gsl-discuss

On Fri, 2009-10-16 at 21:06 +0100, Brian Gough wrote:
> At Sun, 04 Oct 2009 20:03:16 -0600,
> Gerard Jungman wrote:
> >  Q1. Why do the vector and matrix structs have a block member at
> >  all?
> >      At first I thought it was using some block functions in the
> >      implementation (like for i/o), but it is not. The block member
> >      seems pointless. See the following question.
> 
> As mentioned, it was to have a struct which is the same as C++ valarray.

It is worth mentioning at this point that std::valarray is a
recognized failure. The implementations could never live up
to the intentions. See many discussions on the web, including
an admission from the designer of valarray that it is a
failure. It is best not to invoke valarray as a
justification for any design choice; the cognoscenti
will be snickering. See the discussion at the end of
the following post:

 http://www.oonumerics.org/oon/oon-list/archive/0493.html

That said, I don't understand what it is about std::valarray that
we want to replicate. The utility of gsl_block is simply that it
remembers the size of its contiguous non-strided data segment.
Ok. Fine. Are we excited yet?

And propagating blocks into the other components seems unnecessary.
The only functions that are actually used by other components are
the ..._raw() functions, which are completely independent of the
block struct. This is good in the end; it makes it very easy to
expunge blocks from those implementations.

> >      Furthermore, from the standpoint of object design, the vector
> >      concept does not inherit from the block concept, since a vector
> >      is strided and a block is not. They are incompatible concepts.
> >      If blocks have no semantic reason to be in vectors, and if they
> >      are not used in an integral way in the implementation, then they
> >      should be removed.
> 
> I agree, there is no inheritance relationship between them.  However
> there is a relationship between the vector and the block if vectors
> are allowed to grow/shrink -- the size of the block places a limit on
> the maximum size of the vector.  This is not something we currently
> use, but that is what I had in mind at the time.

Huh?

> >  Q2. What is the meaning of functions gsl_vector_alloc_from_block(),
> >      gsl_vector_alloc_from_vector(), etc.?
> >      gsl_vector_alloc_from_block() seems to be constructing a view
> >      of an underlying data segment. As such, it conflicts with the
> >      semantics of views. There should be only one view semantic.
> 
> These functions create a new gsl_vector object (on the heap)
> referencing the same memory as an existing vector or block.  Note that
> the gsl_vector object is only the metadata (size, stride, etc) and
> does not include the block of memory, just the pointer to it.

This is a confusing use of vector. The notions are wrapped up
on each other, like worms in the can. Vectors have blocks,
but they don't always own them (or do they?). Vector views have
vectors, which have blocks, but they never own them. And the blocks
sitting inside are never used for their "block-ness"; they
seem to provide only a useless level of indirection.

The C idiom

  struct A {
   struct B b;
  };

is generally used to express the notion that A inherits from B.
For example, the C standard guarantees that in this situation a
pointer to an A can always be converted to a pointer to its first
member, a B. The alignments are always correct.

With this in mind, our structs say that gsl_vector_view inherits
from gsl_vector. In my mind, this is precisely backwards. What is
the intention? Is it "inherits from" or "is implemented in terms of"?
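
For reference, the idiom can be sketched like this. The guarantee in the C standard is about pointers: a pointer to a struct, suitably converted, points to its first member. The field names and the helpers a_as_b() and b_tag() are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

struct B { int tag; };

/* B is the first member of A, so an A can be handed to code that
   only knows about B. */
struct A {
  struct B b;
  double extra;
};

/* A function written against the "base" type. */
static int b_tag(const struct B *p)
{
  return p->tag;
}

/* Up-conversion from A to B. The cast is valid because b is the
   first member of struct A; it yields the same address as &a->b. */
static const struct B *a_as_b(const struct A *a)
{
  return (const struct B *)a;
}
```

Read this way, a struct that contains exactly one gsl_vector member is saying "a view is-a vector", which is the relationship being questioned here.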

> There's a potential confusion over terminology, a gsl_vector is
> actually a "view" of a gsl_block (which it usually owns, but not
> always) and a gsl_vector_view is also a "view" - the difference is
> that gsl_vector is on the heap and gsl_vector_view is on the stack

But this is all confused because the vector member of gsl_vector_view
contains a block pointer. There is simply no way to tell the intention
of the design by looking at the structs. 

Furthermore, you have conflated the logically separate notions
of "being a view" and "being on the stack". The design should
not do this. For example: Why are vectors "on the heap"?
Answer: because they have this block pointer inside, which
had to be allocated. Why? I don't know. The block could have simply
been a member (by value) and then it could have been on the stack.
Again, the blocks seem to provide only an extraneous level of
indirection. In fact, the block inside vectors does nothing
to support the semantics or the implementation of vector.
It cannot support the semantics, since the concepts are
incompatible, and it does not support the implementation,
since none of the block functions are used there.

> (and never owns the block).  As discussed earlier, the version on the
> stack has a different type because of limitations of const (we have to
> define both gsl_vector_const_view and gsl_vector_view when it is on
> the stack whereas the heap version, as a pointer, can be used as const
> gsl_vector * or gsl_vector *)
> 
> If there was a way to put a gsl_vector object on the stack directly
> without causing const violations we would not need to wrap it in a
> gsl_vector_view struct.

As I have said before, this is backwards.
Look at the design of boost::multi_array. The fundamental
underlying type is the const view type. On top of that is
the non-const view type. Finally, the type which "owns"
its data segment sits on top of these.

There are three needed meta-types. In boost::multi_array terminology
these meta-types are
  const_multi_array_ref
  multi_array_ref
  multi_array

Just looking at that list, the semantic relationships are clear.
Looking at ours, the semantics are not clear. Furthermore,
examining our structs only adds to the confusion.
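
A loose C rendering of that layering, only to make the relationships concrete: a const reference type at the bottom, a mutable reference type that converts to it, and an owning type built on the mutable reference. All names below (cvec_ref, vec_ref, vec_owned and their functions) are hypothetical; this is neither the boost::multi_array interface nor a proposed GSL one.

```c
#include <assert.h>
#include <stdlib.h>

/* Read-only reference to strided data owned by someone else. */
typedef struct { const double *data; size_t size; size_t stride; } cvec_ref;

/* Mutable reference to strided data owned by someone else. */
typedef struct { double *data; size_t size; size_t stride; } vec_ref;

/* Owning type: holds a mutable reference plus responsibility to free. */
typedef struct { vec_ref ref; } vec_owned;

/* Mutable to const is always safe, so it is an explicit, cast-free step. */
static cvec_ref vec_ref_as_const(vec_ref r)
{
  cvec_ref c;
  c.data = r.data;
  c.size = r.size;
  c.stride = r.stride;
  return c;
}

/* Read-only algorithms are written once, against the const type. */
static double cvec_sum(cvec_ref c)
{
  double s = 0.0;
  size_t i;
  for (i = 0; i < c.size; i++) s += c.data[i * c.stride];
  return s;
}

/* Mutating algorithms are written against the mutable type. */
static void vec_fill(vec_ref r, double x)
{
  size_t i;
  for (i = 0; i < r.size; i++) r.data[i * r.stride] = x;
}

/* On allocation failure the result has data == NULL and size == 0. */
static vec_owned vec_owned_alloc(size_t n)
{
  vec_owned v;
  v.ref.data = malloc((n ? n : 1) * sizeof *v.ref.data);
  v.ref.size = v.ref.data ? n : 0;
  v.ref.stride = 1;
  return v;
}

static void vec_owned_free(vec_owned *v)
{
  free(v->ref.data);
  v->ref.data = NULL;
  v->ref.size = 0;
}
```

With this arrangement a read-only view of const data never needs a cast that discards const, and none of the three types has to live on the heap.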

> >      gsl_vector_alloc_from_vector() is similar, though it is more
> >      appropriate to say it is constructing a slice of a vector.
> >      Again, the semantics are confused.
> > 
> >      The suffix '_alloc' is confusing, since it is not clear what is
> >      being alloced. Obviously the struct itself is being alloced,
> >      since it is returned by pointer. But what about the data?
> 
> Yes, it is a bit confusing.  The constraint was that people should be
> able to write
> 
>      gsl_vector * v = gsl_vector_alloc(n)
> 
> and have it just work, whereas it would be more logical to use
> 
>     gsl_block * b = gsl_block_alloc(N)
>     gsl_vector * v = gsl_vector_view (b, stride, n)  [or something like that]
> 
> which is cumbersome for the typical usage.

This does not address the point, which is that the name of a
function like gsl_vector_alloc_from_vector() is inappropriate,
since it is not clear what is being "alloced".

> >  Q3. Why do we have functions like gsl_matrix_row(),
> > gsl_matrix_diagonal(), etc, and yet no support for general slicing
> > operations? These functions should be simple wrappers over a more
> > general functionality.
> 
> What would be the interface for a general slicing operation for the
> existing matrix type?  I have no objection to adding it, the functions
> we have were added on an as-needed basis.

General slicing of matrices (or any rank>1 type) requires a
new type, the multi-array type, to handle the indexing.
As discussed before, matrices themselves are constrained
versions of the rank 2 case, because of the requirement
that the fast index access be contiguous. This means that
slicing operations on matrices, which are intended to
produce matrices, must themselves be constrained.
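
To make the constraint concrete, here is a minimal sketch of a rank-2 strided array with one stride per index, assuming row-major storage. Slicing it is unconstrained, but the result only qualifies as a "matrix" in the sense above when the fast (column) index is still contiguous. The names marray2, slice_spec and the marray2_* functions are hypothetical and are not the proposed gsl_marray interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rank-2 strided array: element (i,j) lives at
   data[i*stride[0] + j*stride[1]]. */
typedef struct {
  double *data;
  size_t  dim[2];
  size_t  stride[2];
} marray2;

/* Select indices start, start+step, ... (count of them) along one axis. */
typedef struct { size_t start, step, count; } slice_spec;

static marray2 marray2_slice(marray2 a, slice_spec rows, slice_spec cols)
{
  marray2 s;
  assert(rows.step > 0 && cols.step > 0);
  assert(rows.count > 0 && cols.count > 0);
  assert(rows.start + (rows.count - 1) * rows.step < a.dim[0]);
  assert(cols.start + (cols.count - 1) * cols.step < a.dim[1]);
  s.data = a.data + rows.start * a.stride[0] + cols.start * a.stride[1];
  s.dim[0] = rows.count;
  s.dim[1] = cols.count;
  s.stride[0] = a.stride[0] * rows.step;
  s.stride[1] = a.stride[1] * cols.step;
  return s;
}

static double marray2_get(marray2 a, size_t i, size_t j)
{
  assert(i < a.dim[0] && j < a.dim[1]);
  return a.data[i * a.stride[0] + j * a.stride[1]];
}

/* The matrix constraint: the fast index must be contiguous. */
static int marray2_is_matrix(marray2 a)
{
  return a.stride[1] == 1;
}
```

A contiguous sub-block keeps a column stride of 1 and can be handed on as a matrix. Taking every second column gives a column stride of 2, which is a perfectly good rank-2 slice but not a matrix.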

> >  Q4. Why do views export a different interface?
> > 
> >      There are many operations on vectors that I cannot apply to
> >      vector views, but which would make perfect sense for
> >      views. These include obvious things like min(), max(), scale(),
> >      etc. They also include the i/o functions. Writing and reading
> >      from and to view objects is a perfectly well-defined notion.
> 
> Maybe I have misunderstood this question but all vector and matrix
> operations are supported on views, by calling them as &view.vector or
> &view.matrix.

But this whole design is cumbersome. You've wedged yourself into
a corner with the implementation of these view types, and the
whole thing is not properly level-ized. Think about what happens
if you want to make a view of a view, etc. Also note that the
data access is needlessly non-generic because of the extra
level of containment; I cannot say view.data[], I have to
say view.vector.data[]. These interface problems are
obviously related.

> A gsl_vector_view is a struct which contains one thing
> - a gsl_vector, so all operations on it are definitely supported,
> including i/o.  Same for matrices.  Without this, the views would
> indeed be useless.

In the current design they are not useless. However, they
are meta-useless, because the design is grotesque.

--
G. Jungman


* Re: questions about block/vector/matrix
From: Brian Gough @ 2009-10-23 21:28 UTC
  To: Gerard Jungman; +Cc: gsl-discuss

At Mon, 19 Oct 2009 14:11:26 -0600,
Gerard Jungman wrote:
> It is worth mentioning at this point that std::valarray is a
> recognized failure. The implementations could never live up
> to the intentions. See many discussions on the web, including
> an admission from the designer of valarray that it is a
> failure. It is best not to invoke valarray as a
> justification for any design choice; 

Yes, gsl_block was written in 1999, so I am just describing the reason
it was done like that at the time.  

While it's not essential, I don't think it's a bad principle to have
a wrapper around malloc.

> This does not address the point, which is that the name of a
> function like gsl_vector_alloc_from_vector() is inappropriate,
> since it is not clear what is being "alloced".

That is an old undocumented function, not intended for actual use as
far as I'm aware.  Ideally it should have been removed.  The header
files no doubt contain bits of old junk.

The best description of the vectors and matrices is in the manual,
where I've written what should be the clearest description of how they
are meant to be used and some simple examples of the view functions.

The challenge for any scheme is getting reasonable const behavior in
C.  If that problem can be solved better then everything else follows
more easily.
