APL-style array indexing in Kawa

public inbox for kawa@sourceware.org
 help / color / mirror / Atom feed

* APL-style array indexing in Kawa - a sketch
@ 2015-08-14  1:59 Per Bothner
  2015-08-17 20:24 ` Jamison Hope
  0 siblings, 1 reply; 6+ messages in thread
From: Per Bothner @ 2015-08-14  1:59 UTC (permalink / raw)
  To: Kawa mailing list


I recently changed the various vector classes to use an index mapper:
Every vector has two fields: a raw Java array buffer, as well as
an IntSequence mapper which maps vector indexes to buffer indexes.
This allows slices, rotations, permutations, etc, using the
same same buffer:
   (VEC I) => index I in VEC ; function-call notation for indexeing
   (set! (VEC i) VALUE) - update value
   (VEC [i <: J]) => slice of VEC ; same buffer, new index mapper
   (VEC [i0 i1 ... in]) ==> [(VEC i0) (VEC i1) ... (VEC in)]
       - but re-using the VEC buffer with a new index mapper
   (set! (VEC [i <: J]) VALUE) - replace elements (possible re-size)

Kawa also supports multi-dimensional arrays, based on SRFI-25, and you
can use the same function-call notation:
   (ARR X Y)
   (set! (ARR X Y) VALUE)
SRFI-25 (and Kawa) has functions for slices, transpose, rotations, etc.
However, I'd like to have APL-style array-as-index for general arrays:
   (ARR [A <: B] Y [c <: D])
   (ARR [5 2 6] [2 3])
   (ARR IARR1 IARR2 IARR3)

The bulk of this note shows (tersely) the types, classes, and
basic representation of generalized arrays.  It focuses on the
differences from the version of gnu.lists in the Kawa trunk.
The actual Scheme API (in terms of functions) is a later discussion;
initially we will prioritize array construction plus indexing.

* Is this a feature you'd find exciting and/or use it?
* Would your friends or co-workers find it exciting and/or use it?
* I know computers and data structures.  I don't know matrixes or math, much.
   Would this approach cause problems if trying to to do high-performace
   array/matrix manipulations, or interfacing to other libraries?
* Would you be interested in helping out?


/*
Scheme types:

array[T] - implementationType: AbstractArray or ObjectArray
   array0[T] extends array[T]
   array1[T] extends array[T]
   ...

The arrayN types for various N are all "erased" to plain array in the JVM,
though it is stored in the class file using the kawa.SourceType annotation.
It means we know that the rank of the array N, which means we can
catch some errors (such as wrong number of arguments) and
maybe generate better code.

sequence[T] ..
vector[T] extends sequence[T], array1[T]
   vector[uint] == u32vector, etc
*/

// Equivalent Scheme type: array
public interface Array { // Existing
     public E get(int[] indexes);
     // Add the following, mainly for performance:
     public E get();
     public E get(int i);
     public E get(int i, int j);
     public E get(int i, int j, int... rest);
}

// Scheme: array[int]
public interface IntArray extends Array<Integer> {
     // These throw an exception if the rank() differs from argument count.
     public int intAt();
     public int intAt(int arg1);
     public int intAt(int arg1, int arg2);
     public int intAt(int arg1, int arg2, int arg3, int... argrest);
     public int intAt(int[] args);
     default public Integer get() { return Integer.valueOf(intAt()); }
     default public Integer get(int i) { return Integer.valueOf(intAt(i)); }
     // etc
}

// Scheme sequence[int]
public interface IntSequence
     extends IntArray
             // Possibly also extends Sequence<Integer>
{
     // It might be preferable for this inteface to not add any
     // methods beyond those of IntArray, to avoid some casts.
     // At least for commonly used method of SimpleVector.
     int size();
}

public abstract class IndirectIndexable<E>
     extends AbstractSequence<E>
             // No longer implements Sequence<E>
{
     protected IntArray indexes; // Changed from IntSequence.
     // Otherwise basically unchanged
}

public abstract class AbstractArray<E>
     // replaces GeneralArray
     extends IndirectIndexable<E>, implements Array<E>
{
     protected abstract E getBuffer(int index);
     public E get() { return getBuffer(indexes.getAt()); }
     public E get(int i) { return getBuffer(indexes.getAt(i)); }
     public E get(int i, int j) { return getBuffer(indexes.getAt(i,j)); }
     public E get(int i, int j, int k, int... rest) {
         return getBuffer(indexes.getAt(i, j, k, rest))); }
}

// Scheme: array[Object]
public class ObjectArray<E> extends AbstractArray<Object> {
     Object[] data;
     protected Object getBuffer(int j) { return (E) data[i]; }
     public class ObjectArray(Object[] data, IntArray indexes) {
         this.data = data;
         this.indexes = indexes;
     }
}

// Scheme: array[float] (maybe add alias: f32array)
public class F32Array extends AbstractArray<Float> {
     float[] data;
     public floatAt() { return data[indexes.intAt()]; }
     public floatAt(int i) { return data[indexes.intAt(i)]; }
     public floatAt(int i, int j) { return data[indexes.intAt(i, )]; }
     public float floatAt(int[] args) {
          int effIndex = indexes.intAt(args);
          return data[effIndex];
     }
     public final Float getBuffer(int index) {
         return Float.valueOf(data[index]);
     }
}
abstract class IntArray<E> extends AbstractArray<E> {
     int[] data;
     public intAt() { return data[indexes.intAt()]; }
     public intAt(int i) { return data[indexes.intAt(i)]; }
     public intAt(int i, int j) { return data[indexes.intAt(i, j)]; }
     public intAt(int i, int j, int k, int rest) {
         return data[indexes.intAt(i, j, k, rest)];
     }
     public float intAt(int[] args) {
         switch (args.length) {
         case 0: return intAt();
         case 1: return intAt(args[0]);
         case 2: return intAt(args[0], args[1]);
         default:
             int[] rest = args.length == 3 ? IntVector.empty
                 : Arrays.copyOfRange(args. 3, args.length-3);
             return intAt(args[0], args[1], arg2[2], rest);
         }
     }
}

// Scheme: array[int]
public class S32Array extends IntVector<Integer> implements IntArray {
     public final Integer getBuffer(int index) {
         return Integer.valueOf(data[index]);
     }
}

// Scheme: array[uint]
public class U32Array extends IntVector<UInt> {
     public final UInt getBuffer(int index) {
         return UInt.valueOf(data[index]);
     }
}

public interface SimpleVector<E> extends Array<E>, Sequence<E> {
     default public int rank() { return 1; }
}

// Scheme: vector[E] == array1[E]
public class FVector<E> extends ObjectArray<E> implement SimpleVector<E> {
     public FVector(Object[] data, IntSequence indexes) {
         super(data, indexes);
     }
}

// Scheme: f32vector == vector[float] == array1[float]
public class F32Vector extends F32Array implement SimpleVector<Float> {
}

// Scheme: s32vector == vector[int] == array1[int]
public class S32Vector extends U32Array implement SimpleVector<Integer> {
}

public class ArrayMapper implements IntArray {
     // The length of all these must match rank.
     int[] dimensions;
     int[] strides;
     int[] lowBounds;
     int offset;
     int rank;
     public void checkRank(int r) { if (rank!=r) throw new Exception(); }
     static int adjust(int offset, int low, int len, int stride, int index) {
         if (index < low || (index -= low) >= len)
           throw new IndexOutOfBoundsException();
         return offset + strides * index;
     }
     public intAt() {
         checkRank(0);
         return return offset;
     }
     public intAt(int i) {
         checkRank(1);
         return adjust(offset, lowBounds[0], dimensions[0], strides[0], i);
     }
     public intAt(int i, int j) {
         checkRank(2);
         int r = offset;
         r = adjust(r, lowBounds[0], dimensions[0], strides[0], i);
         r = adjust(r, lowBounds[1], dimensions[1], strides[1], j);
         return r;
     }
     public intAt(int i, int j, int k, int... rest) {
         int rlength = rest.length;
         checkRank(3+rlength);
         int r = offset;
         r = adjust(r, lowBounds[0], dimensions[0], strides[0], i);
         r = adjust(r, lowBounds[1], dimensions[1], strides[1], j);
         r = adjust(r, lowBounds[2], dimensions[2], strides[2], k);
         for (int d = 0;  d < rlength; d++)
             r = adjust(r, lowBounds[3+d], dimensions[3+d], strides[3+d],
                        rest[d]);
         return r;
     }
}

// Mightbe reasonable to have separate ArrayMapper1, ArrayMapper2,
// ArrayMapper3 classes as optimizations when rank is 1, 2 or 3.
public class ArrayMapper1 implements IntSequence {
     int lowBound;
     int dimension;
     int stride;
     int offset;
     public intAt(int i) {
         return ArrayMapper.adjust(offset, lowBound, dimension, strides, i);
     }
}

class Arrays {
     // Existing methods have to be re-written.
     public static Array make(Array shape, Object value) {
         ArrayMapper mapper = createFrom(shape);
         int total = maper.size(); // ???
         Object[] data= new Object[total];
         if (value != null) {
             for (int i = total;  --i >= 0; )
                 data[i] = value;
         }
         return new ObjectArray(data, mapper);
     }

     /** Used to implement (ARR I1 I2 ... In).
      * For example if ARR is an ObjectArray, we return
      * new ObjectArray(ARR.data, compose(ARR.indexes, I1, I2, ..., In)).
      * Note we want to optimize if all I1...In are ArrayMapper
      * (or convertible to ArrayMapper, such as scalar or Range),
      * since in that case the result can be an ArrayMapper.
      */
     public static IntArray compose(IntArray base, IntArray... indexes) {
         int n = indexes.length;
         int r = 0;
         for (int i = 0; i < n; i++)
             r += indexes.rank();
         final int rank = r;
         // This code is doing it lazily.
         // We should probably pre-calculate a S32Array instead
         return new IntArray() {
             public int rank() { return rank; }
             public int intAt(int[] args) {
                 checkRank(args.length);
                 int[] bargs = new int[base.rank()];
                 int sofar = 0;
                 for (int i = 0;  i < n; i++) {
                     IntArray ind = indexes[i];
                     int rank = ind.rank();
                     switch (irank) {
                     case 0:
                         bargs[i] = ind.intAt();
                         break;
                     case 1:
                         bargs[i] = ind.intAt(args[sofar]);
                         break;
                     case 2:
                         bargs[i] = ind.intAt(args[sofar], args[sofar+1]);
                         break;
                     case 3: // should also optimize this case
                     default:
                         int[] tmp = new int[ind.rank()];
                         for (int j = 0; j < ind.rank(); j++)
                             tmp[j] = args[sofar+j]
                                 bargs[i] = ind.intAt(tmp);
                     }
                     sofar += irank;
                 }
                 return base.intAt(bargs);
             }};
     }
}

-- 
	--Per Bothner
per@bothner.com   http://per.bothner.com/

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: APL-style array indexing in Kawa - a sketch
  2015-08-14  1:59 APL-style array indexing in Kawa - a sketch Per Bothner
@ 2015-08-17 20:24 ` Jamison Hope
  2015-08-17 22:48   ` Per Bothner
  0 siblings, 1 reply; 6+ messages in thread
From: Jamison Hope @ 2015-08-17 20:24 UTC (permalink / raw)
  To: Kawa mailing list

On Aug 13, 2015, at 9:59 PM, Per Bothner <per@bothner.com> wrote:

> 
> I recently changed the various vector classes to use an index mapper:
> Every vector has two fields: a raw Java array buffer, as well as
> an IntSequence mapper which maps vector indexes to buffer indexes.
> This allows slices, rotations, permutations, etc, using the
> same same buffer:
>  (VEC I) => index I in VEC ; function-call notation for indexeing
>  (set! (VEC i) VALUE) - update value
>  (VEC [i <: J]) => slice of VEC ; same buffer, new index mapper
>  (VEC [i0 i1 ... in]) ==> [(VEC i0) (VEC i1) ... (VEC in)]
>      - but re-using the VEC buffer with a new index mapper
>  (set! (VEC [i <: J]) VALUE) - replace elements (possible re-size)

In this last example, VALUE is a sequence and not an atom, right?  So:

(define v (make-vector 5 0)) ;; -> #(0 0 0 0 0)

(set! (v [2 <: 4]) 1) ;; error!

(set! (v [2 <: 4]) #(1 1)) ;; -> v is now #(0 0 1 1 0)

(set! (v [2 <: 4]) #(1)) ;; -> v is now #(0 0 1 0)

> Kawa also supports multi-dimensional arrays, based on SRFI-25, and you
> can use the same function-call notation:
>  (ARR X Y)
>  (set! (ARR X Y) VALUE)
> SRFI-25 (and Kawa) has functions for slices, transpose, rotations, etc.
> However, I'd like to have APL-style array-as-index for general arrays:
>  (ARR [A <: B] Y [c <: D])
>  (ARR [5 2 6] [2 3])

So the way to read this is that it constructs a new 3x2 2D array which
shares backing store with ARR, and its six elements in row-major order
are (ARR 5 2), (ARR 5 3), (ARR 2 2), (ARR 2 3), (ARR 6 2), and (ARR 6 3),
right?

Is there some clever notation for a range meaning "from here to the end,
wherever that is"?  (VEC [i <: ]) doesn't work;
(VEC [i <: (length VEC)]) works, but seems verbose.

Matlab (which uses 1-based indexing) lets you do things like A(2:end) to
extract all-but-the-first element.  Also, for dimensions for which you
want the whole thing, instead of writing "1:end" you can use ":" as
shorthand:  A(:,3) means the whole third column (regardless of the
number of rows), and A(3,:) means the whole third row (regardless of the
number of columns).

>  (ARR IARR1 IARR2 IARR3)
> 
> The bulk of this note shows (tersely) the types, classes, and
> basic representation of generalized arrays.  It focuses on the
> differences from the version of gnu.lists in the Kawa trunk.
> The actual Scheme API (in terms of functions) is a later discussion;
> initially we will prioritize array construction plus indexing.
> 
> * Is this a feature you'd find exciting and/or use it?

I can definitely think of math-ish things to do with something like this.

If I have a 4x4 matrix T representing a spatial transformation, then
(T [0 <: 3] [0 <: 3]) would be the upper-left 3x3 submatrix, which would
be a rotation matrix that magically stays in sync with the original
transform data.  That could be handy.

The values at [(T 0 3) (T 1 3) (T 2 3)] are the translation vector.
Using this new notation, that's *almost* the same as (T [0 <: 3] [3]),
except that the latter gives me a 3x1 2D matrix rather than a 3-element
1D vector.

Will there be some simple way to flatten the result to reduce its rank
so I can omit the index that would always be 0?

> * Would your friends or co-workers find it exciting and/or use it?

They may, if I tell them to.. most likely they would use it unknowingly
by using some program that I wrote (there aren't a lot of Schemers or
Lispers -- or Java-ers, really -- here).

> * I know computers and data structures.  I don't know matrixes or math, much.
>  Would this approach cause problems if trying to to do high-performace
>  array/matrix manipulations, or interfacing to other libraries?

It's kind of hard to say.  It depends whether the added indirection of
the index mapper introduces more overhead than the gain from not
allocating a new data store and copying over the values at array
creation time, right?  It certainly *seems* like it would be a win, at
least for arrays larger than some size N (wherever the size of the data
starts to outgrow the size of the index mapper object).

What would be the idiomatic way to do a deep(?) copy, to break the
connection with the original data buffer?  One situation I can think of
where the sharing could cause trouble would be passing the sub-vector to
something like an in-place sort function, where we might not actually
want to overwrite the parent array's data.

> * Would you be interested in helping out?

I'll certainly try it out and let you know if stuff doesn't work, but I
don't know how much help I would be with the actual implementation.

--
Jamison Hope
The PTR Group
www.theptrgroup.com

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: APL-style array indexing in Kawa - a sketch
  2015-08-17 20:24 ` Jamison Hope
@ 2015-08-17 22:48   ` Per Bothner
  2015-08-18  3:37     ` Jamison Hope
  0 siblings, 1 reply; 6+ messages in thread
From: Per Bothner @ 2015-08-17 22:48 UTC (permalink / raw)
  To: kawa

On 08/17/2015 01:24 PM, Jamison Hope wrote:
> On Aug 13, 2015, at 9:59 PM, Per Bothner <per@bothner.com> wrote:
>
>>
>> I recently changed the various vector classes to use an index mapper:
>> Every vector has two fields: a raw Java array buffer, as well as
>> an IntSequence mapper which maps vector indexes to buffer indexes.
>> This allows slices, rotations, permutations, etc, using the
>> same same buffer:
>>   (VEC I) => index I in VEC ; function-call notation for indexeing
>>   (set! (VEC i) VALUE) - update value
>>   (VEC [i <: J]) => slice of VEC ; same buffer, new index mapper
>>   (VEC [i0 i1 ... in]) ==> [(VEC i0) (VEC i1) ... (VEC in)]
>>       - but re-using the VEC buffer with a new index mapper
>>   (set! (VEC [i <: J]) VALUE) - replace elements (possible re-size)
>
> In this last example, VALUE is a sequence and not an atom, right?  So:
>
> (define v (make-vector 5 0)) ;; -> #(0 0 0 0 0)
>
> (set! (v [2 <: 4]) 1) ;; error!
>
> (set! (v [2 <: 4]) #(1 1)) ;; -> v is now #(0 0 1 1 0)
>
> (set! (v [2 <: 4]) #(1)) ;; -> v is now #(0 0 1 0)

Right.

I'd like to have some "fill" syntax or idiom:  Repeat this item or sequences
as necessary.  One option is to define infinite cyclic arrays as a
useful-in-itself data type, and then add the rule that if the VALUE
is infinite we truncate it to the size of the selected left-hand slice.

>> Kawa also supports multi-dimensional arrays, based on SRFI-25, and you
>> can use the same function-call notation:
>>   (ARR X Y)
>>   (set! (ARR X Y) VALUE)
>> SRFI-25 (and Kawa) has functions for slices, transpose, rotations, etc.
>> However, I'd like to have APL-style array-as-index for general arrays:
>>   (ARR [A <: B] Y [c <: D])
>>   (ARR [5 2 6] [2 3])
>
> So the way to read this is that it constructs a new 3x2 2D array which
> shares backing store with ARR, and its six elements in row-major order
> are (ARR 5 2), (ARR 5 3), (ARR 2 2), (ARR 2 3), (ARR 6 2), and (ARR 6 3),
> right?

Correct.

> Is there some clever notation for a range meaning "from here to the end,
> wherever that is"?  (VEC [i <: ]) doesn't work;

That is indeed the clever notation, and indeed it doesn't work for silly
reasons involving int vs integer.  I'll fix it.

> (VEC [i <: (length VEC)]) works, but seems verbose.

Agreed.

> Matlab (which uses 1-based indexing) lets you do things like A(2:end) to
> extract all-but-the-first element.  Also, for dimensions for which you
> want the whole thing, instead of writing "1:end" you can use ":" as
> shorthand:  A(:,3) means the whole third column (regardless of the
> number of rows), and A(3,:) means the whole third row (regardless of the
> number of columns).

We could define [<:] as a supported syntax (see $bracket-list$ in syntax.scm).
It would be relative easy to define 0 as the default lower bound, but that
doesn't quite do what you want.  We could define [<;] as a special magic value
that gets coerced to a specific range depending on context.

>> * Is this a feature you'd find exciting and/or use it?
>
> I can definitely think of math-ish things to do with something like this.
>
> If I have a 4x4 matrix T representing a spatial transformation, then
> (T [0 <: 3] [0 <: 3]) would be the upper-left 3x3 submatrix, which would
> be a rotation matrix that magically stays in sync with the original
> transform data.  That could be handy.
>
> The values at [(T 0 3) (T 1 3) (T 2 3)] are the translation vector.
> Using this new notation, that's *almost* the same as (T [0 <: 3] [3]),
> except that the latter gives me a 3x1 2D matrix rather than a 3-element
> 1D vector.
>
> Will there be some simple way to flatten the result to reduce its rank
> so I can omit the index that would always be 0?

Why wouldn't you just use: (T [0 <: 3] 3) ?

We will have functions that reshape and transform arrays.
I believe this particular use-case can be handled by SRFI-25's share-array.
We do have the APL family of languages for inspiration ...

>> * I know computers and data structures.  I don't know matrixes or math, much.
>>   Would this approach cause problems if trying to to do high-performace
>>   array/matrix manipulations, or interfacing to other libraries?
>
> It's kind of hard to say.  It depends whether the added indirection of
> the index mapper introduces more overhead than the gain from not
> allocating a new data store and copying over the values at array
> creation time, right?  It certainly *seems* like it would be a win, at
> least for arrays larger than some size N (wherever the size of the data
> starts to outgrow the size of the index mapper object).

Spacewise, the difference is trivial.  In the proposal look at the ArrayMapper.java
combined with (say) F32Array, compared with the old GeneralArray combined with
a SimpleVector. Basically, it's a wash.  Of course you could write a simple Matrix class
that just allocates a data array and some bounds, and that would take slightly
less space and be slightly faster.  Speed is more of a factor, since we may
be adding some extra indirection and lookups.

Note an array-mapper is just an interface, and can be quite complex.
For example Range.IntRange just has 3 int fields; we can think of it
as an optimized version of the ArrayMapper class for rank=1.

> What would be the idiomatic way to do a deep(?) copy, to break the
> connection with the original data buffer?

It seems reasonable to offer a 1-argument array-copy function.
In addition to copying (necessary parts of) the data array,
it would simplify the IntArray interface to an ArrayMapper.

Note that array-copy is a special case of:
   (make-array-from-function SHAPE FUNCTION)
since a rank-N array can be treated as N-ary function.

>One situation I can think of
> where the sharing could cause trouble would be passing the sub-vector to
> something like an in-place sort function, where we might not actually
> want to overwrite the parent array's data.

We definitely need copying as well as sharing.

-- 
	--Per Bothner
per@bothner.com   http://per.bothner.com/

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: APL-style array indexing in Kawa - a sketch
  2015-08-17 22:48   ` Per Bothner
@ 2015-08-18  3:37     ` Jamison Hope
  2015-08-18  4:12       ` Per Bothner
  0 siblings, 1 reply; 6+ messages in thread
From: Jamison Hope @ 2015-08-18  3:37 UTC (permalink / raw)
  To: kawa

On Aug 17, 2015, at 6:47 PM, Per Bothner <per@bothner.com> wrote:

> We could define [<:] as a supported syntax (see $bracket-list$ in syntax.scm).
> It would be relative easy to define 0 as the default lower bound, but that
> doesn't quite do what you want.  We could define [<;] as a special magic value
> that gets coerced to a specific range depending on context.

Hmm can we pick something that won't confuse paredit mode?

> Why wouldn't you just use: (T [0 <: 3] 3) ?

Oh, right.  I wasn't sure whether you could mix array indices and
scalar indices like that, but that certainly simplifies it.

> It seems reasonable to offer a 1-argument array-copy function.
> In addition to copying (necessary parts of) the data array,
> it would simplify the IntArray interface to an ArrayMapper.
> 
> Note that array-copy is a special case of:
>  (make-array-from-function SHAPE FUNCTION)
> since a rank-N array can be treated as N-ary function.

Yeah.  It occurs to me that this general function->array concept
overlaps quite a bit with expression templates from C++.  (This is a bit
off-topic w.r.t. APL style, but oh well.)  In addition to doing this
index mapping to create arrays from other arrays by permuting elements,
it would be useful to have a mechanism to construct an array that keeps
the same indices but invokes a function on each element.  So, perhaps
something like:

(array-map (lambda (x) (* 2 x)) ARR) => a new array where each element
is doubled in value.

(array-map + ARR1 ARR2) => a new array representing the elementwise sum
of the elements of ARR1 and ARR2.

(I'm imagining that the array-map call wouldn't allocate a new data
buffer, but would just close over the supplied function and array(s),
lazily applying the function to array elements as needed.)

--
Jamison Hope
The PTR Group
www.theptrgroup.com

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: APL-style array indexing in Kawa - a sketch
  2015-08-18  3:37     ` Jamison Hope
@ 2015-08-18  4:12       ` Per Bothner
  2015-08-18  4:43         ` Jamison Hope
  0 siblings, 1 reply; 6+ messages in thread
From: Per Bothner @ 2015-08-18  4:12 UTC (permalink / raw)
  To: kawa



On 08/17/2015 08:37 PM, Jamison Hope wrote:
> On Aug 17, 2015, at 6:47 PM, Per Bothner <per@bothner.com> wrote:
>
>> We could define [<:] as a supported syntax (see $bracket-list$ in syntax.scm).
>> It would be relative easy to define 0 as the default lower bound, but that
>> doesn't quite do what you want.  We could define [<;] as a special magic value
>> that gets coerced to a specific range depending on context.
>
> Hmm can we pick something that won't confuse paredit mode?

I'm not familiar with paredit - but why would [<:] confuse it?

> (array-map (lambda (x) (* 2 x)) ARR) => a new array where each element
> is doubled in value.
>
> (array-map + ARR1 ARR2) => a new array representing the elementwise sum
> of the elements of ARR1 and ARR2.
>
> (I'm imagining that the array-map call wouldn't allocate a new data
> buffer, but would just close over the supplied function and array(s),
> lazily applying the function to array elements as needed.)

"Lazily" is confusing here.  I think array-map should eager, in the
same way vector-map is.  What it should do:
(1) Figure out the shape of the result.  (The same as the argument, if
a single argument; otherwise they may be "broadcast" in the Racket sense.)
(2) Allocate an Object[] buffer for the result whose length is the
element-count of the shape.
(3) Iterate of the shape (which is the same as iterating over the buffer);
extract the corresponding argument array value using the same (or broadcast)
indexes; call the function.

http://docs.racket-lang.org/math/array_pointwise.html

The Racket array functions is another place to look for inspiration.
-- 
	--Per Bothner
per@bothner.com   http://per.bothner.com/

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: APL-style array indexing in Kawa - a sketch
  2015-08-18  4:12       ` Per Bothner
@ 2015-08-18  4:43         ` Jamison Hope
  0 siblings, 0 replies; 6+ messages in thread
From: Jamison Hope @ 2015-08-18  4:43 UTC (permalink / raw)
  To: kawa@sourceware.org list

On Aug 18, 2015, at 12:11 AM, Per Bothner <per@bothner.com> wrote:

> On 08/17/2015 08:37 PM, Jamison Hope wrote:
>> On Aug 17, 2015, at 6:47 PM, Per Bothner <per@bothner.com> wrote:
>> 
>>> We could define [<:] as a supported syntax (see $bracket-list$ in syntax.scm).
>>> It would be relative easy to define 0 as the default lower bound, but that
>>> doesn't quite do what you want.  We could define [<;] as a special magic value
>>> that gets coerced to a specific range depending on context.
>> 
>> Hmm can we pick something that won't confuse paredit mode?
> 
> I'm not familiar with paredit - but why would [<:] confuse it?

Oh that was supposed to be a colon.  OK nevermind then, [<:] sounds good.
[>:] might also be useful, such that (VEC [>:]) reverses the sequence.

>> (array-map (lambda (x) (* 2 x)) ARR) => a new array where each element
>> is doubled in value.
>> 
>> (array-map + ARR1 ARR2) => a new array representing the elementwise sum
>> of the elements of ARR1 and ARR2.
>> 
>> (I'm imagining that the array-map call wouldn't allocate a new data
>> buffer, but would just close over the supplied function and array(s),
>> lazily applying the function to array elements as needed.)
> 
> "Lazily" is confusing here.  I think array-map should eager, in the
> same way vector-map is.  What it should do:
> (1) Figure out the shape of the result.  (The same as the argument, if
> a single argument; otherwise they may be "broadcast" in the Racket sense.)
> (2) Allocate an Object[] buffer for the result whose length is the
> element-count of the shape.
> (3) Iterate of the shape (which is the same as iterating over the buffer);
> extract the corresponding argument array value using the same (or broadcast)
> indexes; call the function.
> 
> http://docs.racket-lang.org/math/array_pointwise.html
> 
> The Racket array functions is another place to look for inspiration.

Perhaps "map" is the wrong name to use.  My motivation for wanting it to
be lazy is to be able to compose these things without causing a bunch of
temporary buffers to get allocated needlessly.

For instance, suppose * is redefined such that (* ATOM SEQ) performs
(array-map (lambda (x) (* ATOM x)) SEQ), and - is redefined such that
(- SEQ1 SEQ2) performs (array-map - SEQ1 SEQ2).

Then suppose we have an assignment such as

(define x (array-copy (* alpha (- u v))))

where alpha is a number and u and v (and x) are sequences.  If array-map
is lazy, then the iteration only needs to happen a single time, with
each element (x i) being effectively evaluated as
(* alpha (- (u i) (v i))).  If array-map is eager, then the iteration
must happen three times, first for the subtraction, then for the
multiplication, and then for the copy, allocating data buffers each
time.

https://en.wikipedia.org/wiki/Expression_templates

Like I said, maybe this should be called something other than "map".
Although SRFI-42 called its thing stream-map, so there's precedent
for lazy mapping.

--
Jamison Hope
The PTR Group
www.theptrgroup.com

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2015-08-18  4:43 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-08-14  1:59 APL-style array indexing in Kawa - a sketch Per Bothner
2015-08-17 20:24 ` Jamison Hope
2015-08-17 22:48   ` Per Bothner
2015-08-18  3:37     ` Jamison Hope
2015-08-18  4:12       ` Per Bothner
2015-08-18  4:43         ` Jamison Hope

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).