** Multi-array.

A multi-array is a data structure managing a reference to a data segment (think double * as an origin) and enabling indexed access to the data. Indexing is controlled by an indexing map, which translates indices into an offset in the underlying data segment. The number of indices (the dimension of the domain of the indexing map) is called the rank of the indexing map. The data segment may be externally controlled, or internally controlled by the multi-array. In essence, "control" means control of the allocation/deallocation process for the segment.

No notion of view is needed. When the data is internally controlled, we can think of the multi-array as "original"; when the data is externally controlled, we can think of it as a view. But this is not a precise way of thinking. More precisely, the life-cycle of the data is simply independent of the indexing semantics, and the need for a separate notion of view evaporates.

This lack of a separate view type is not shared by some other designs. For example, boost introduces a distinct reference type boost::multi_array_ref, in addition to boost::multi_array. The reason for this choice is not clear. It may be for absolutely optimal efficiency; certain choices are easier to make if one separates these types. However, my current feeling is that the extra type is not justified. If we decide to introduce separate view types later, we must ensure that it is done in a way which does not force other interfaces in the library to grow combinatorially. And without doubt, the extra types must have the same generic interfaces as the non-view types; this is one area where GSL fails miserably.

** Indexing map.

An indexing map translates an index vector to an offset. It is specified by a shape. At a minimum, a shape includes the dimension of each index (the number of indices addressed). When a shape allows an index in place i to have only one value, that index place is called "invariate".
If it can have no values (the null set), it is called "null". Otherwise (the normal case) it is called "variate". The number of indices (the dimension of the domain of the indexing map) is called the "static rank", or simply "rank", of the map. The number of variate indices in a given map is called the "dynamic rank" or "effective rank" of the map. Examples, using C-array notation:

  double x[3][4][5]; // static rank = 3, dynamic rank = 3
  double x[3][1][5]; // static rank = 3, dynamic rank = 2
                     // [3] and [5] are variate, [1] is invariate

** Structure of indexing maps.

We choose the most general affine form for indexing maps. Therefore, the offset is calculated as

  offset = sum(n, index[n] * stride[n])

The data segment is stored as a pointer to the origin of the data. Therefore, the most general map calculation takes the affine form

  location = origin + sum(n, index[n] * stride[n])

** Slicing as composition of indexing maps.

Let i[N] be an element of the index set for a rank N indexing map. Let j[M] be an element of the index set for a rank M indexing map. Let delta[N][M] be an integer-valued matrix and b[N] an integer-valued vector. Then the following formula, which maps M-dimensional indices to N-dimensional indices, defines a rank M index map from any rank N index map:

  i[n] = b[n] + sum(m, delta[n][m] * j[m])

Notice the functoriality. "Old" indices are determined in terms of "new" indices; in other language, the new indices are pulled back to the old context. This is the correct functoriality to allow arbitrary chained composition of index maps, and therefore slices of slices, to any order. It should be clear that the index map generated this way is always affine, and that all affine indexing maps can be generated in this way. Therefore, a general slice of an affine indexing map is specified by a vector b[] and a matrix delta[][], of the indicated dimensions. The reader can convince himself that any form of strided access is supported in this way.
For example, it is equally easy to slice out the super-diagonal of a matrix as it is to slice out a row or column, or, for that matter, to slice out any regularly strided sub-matrix. The underlying direct data access, by computation of an offset, is itself an affine indexing map, specialized to the case where the "old" index is one-dimensional. So the design space of affine indexing maps is closed.

** Multi-array implementation requirements.

* Indexing calculations must be as fast as possible. The indexing arithmetic should be inlined when possible. The shape data structure should enable the fastest possible computation, consistent with general strided access. This suggests that objects of different rank be of different static types, since this is the most straightforward way to ensure that the correct indexing calculation is chosen at compile-time. This design corresponds to the usual designs (such as in GSL), where vectors and matrices (for example) are different static types. This choice also corresponds to the design of boost::multi_array, where the rank is a template parameter and therefore a compile-time constant which generates a separate type.

* The shape of a multi-array should be dynamic. This corresponds to the usual design, such as in GSL, where the dimensions of a matrix (for example) are run-time constants, not compile-time constants.

* It should be possible to construct multi-arrays without heap allocations. For example:

  double data[64];
  gsl_marray_2 ma_8x8 = gsl_marray_2_build(data, 8, 8);

Placing such objects on the stack has good implications for efficiency and is very natural from the client's point of view. For example, in C++, one simply writes std::vector<double> v; which places the object on the stack as an automatic variable.

** Multi-array implementation choices.

These are open for debate, but these are the choices I have made. No loss of functional generality is implied by these choices, only some loss of syntactic generality.
* Multi-array indexing conforms precisely to C-style array indexing. Shapes are completely specified by dimensions and strides. This means that indices always "start" at 0 and end at dimension-1 (inclusive).

* The notion of storage order does not enter. The "fast" index is always to the right, conforming to C array semantics. See the discussion of matrices below.

** Vector.

A vector manages a data segment and supports indexed access to the data through a rank one indexing map. However, vectors may be required to satisfy extra constraints not compatible with rank one multi-arrays. They may also provide other capabilities, such as indexing which "starts" at some non-zero value. Personally, I have no desire for this extra sort of functionality; it would only add overhead to an otherwise clean design. My first inclination is to implement vectors as follows:

  typedef gsl_marray_1 gsl_vector; // DONE

However, the reality may not be quite as simple as this in the end. However it is done, I think it is a good idea to preserve the name "vector" in the design, although it will probably imply some significant boiler-plate in function declarations, since C does not allow us to simply inherit the member function namespace of another class. SIGH.

** Matrix.

Matrices are clearly different from rank two multi-arrays. There are necessary extra capabilities, and there are also constraints, in order to make them conform to the notions used in linear algebra packages. The extra capability corresponds to the identification of "row" and "column" indices. Multi-array semantics are governed by data locality, so the notion of "fast" index is immutable. But a matrix may identify either its rows with fast index traversal (C order) or its columns with fast index traversal (Fortran order). Matrices will need to remember which, and to adjust their indexing computations accordingly.
The extra constraint on matrices, in comparison to multi-arrays, is the need for the fast index to traverse the data with stride 1. For example, BLAS interfaces allow for vectors of arbitrary stride, but do not allow for a matrix "fast index" stride. Example: Let A be a 3x3 matrix. Let B1 be a 2x2 matrix defined as a view (sub-matrix) of A, given by the x's in the following diagram:

  x x -
  - - -
  x x -

Then B1 is a matrix, in the BLAS sense. It can be passed to BLAS functions without further reshaping. To be exact, it would be passed with origin equal to the given origin, with tda = 6, and with dimensions (2,2). Let B2 be the 2x2 sub-matrix view defined by the following diagram:

  x - x
  - - -
  x - x

Then B2 is _not_ a BLAS-compatible matrix, since there is no parameter in the BLAS interfaces which allows for the non-unit stride in the fast index. Some reshaping, involving data motion/copying, would be required to pass this view to BLAS functions. Therefore, I do not think that generic rank two multi-arrays can be identified with matrices. Rather, a lightweight conversion from rank two multi-arrays to matrices must be provided, and this conversion must fail at run-time if the multi-array fast index stride is not unity.

** Tensor.

No explicit notion of tensor currently exists, beyond that expressed in the tensor/ extension package for GSL, by Jordi Burguet-Castell. The Burguet-Castell package does not meet the requirements laid out above, mainly because it looks and works too much like the GSL types, having been modelled on them for obvious reasons. In particular, it allows the rank to be determined at run-time. This may or may not be desirable, depending on how we want tensors to interact with multi-arrays. However, we draw some motivation from the Burguet-Castell package. A tensor is defined to be a multi-array where the shape is constrained to have all dimensions equal.
As in the case of matrices, we encounter some small technical problems in the identification with multi-arrays. Since the shape of a multi-array is dynamic, the compatibility of a given multi-array with the tensor concept is a question that must be deferred to run-time. The question of the role of multi-arrays in the implementation of tensors remains open, as it does for matrices. But we do require a certain level of interoperability, perhaps expressed as lightweight conversion functions which fail to convert a multi-array to a tensor if the shape is not consistent. This seems like the most likely solution. It also allows the tensor implementation to be independent of the high-level multi-array interfaces, which strikes me as the right way to do it, from both a general design point of view and an efficiency point of view.