From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <kawa-return-5908-listarch-kawa=sources.redhat.com@sourceware.org>
Received: (qmail 71760 invoked by alias); 14 Aug 2015 01:59:34 -0000
Mailing-List: contact kawa-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <kawa.sourceware.org>
List-Subscribe: <mailto:kawa-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/kawa/>
List-Post: <mailto:kawa@sourceware.org>
List-Help: <mailto:kawa-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: kawa-owner@sourceware.org
Received: (qmail 71615 invoked by uid 89); 14 Aug 2015 01:59:33 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-0.9 required=5.0 tests=AWL,BAYES_40,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.2
X-HELO: aibo.runbox.com
Received: from aibo.runbox.com (HELO aibo.runbox.com) (91.220.196.211) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-SHA encrypted) ESMTPS; Fri, 14 Aug 2015 01:59:31 +0000
Received: from [10.9.9.207] (helo=mailfront03.runbox.com)	by bars.runbox.com with esmtp (Exim 4.71)	(envelope-from <per@bothner.com>)	id 1ZQ4HH-0003M6-6a	for kawa@sourceware.org; Fri, 14 Aug 2015 03:59:27 +0200
Received: from 70-36-239-58.dsl.dynamic.fusionbroadband.com ([70.36.239.58] helo=toshie.bothner.com)	by mailfront03.runbox.com with esmtpsa  (uid:757155 )  (TLS1.0:DHE_RSA_AES_128_CBC_SHA1:16)	(Exim 4.76)	id 1ZQ4H4-0006Vi-SV	for kawa@sourceware.org; Fri, 14 Aug 2015 03:59:15 +0200
To: Kawa mailing list <kawa@sourceware.org>
From: Per Bothner <per@bothner.com>
Subject: APL-style array indexing in Kawa - a sketch
Message-ID: <55CD4B6A.8010606@bothner.com>
Date: Fri, 14 Aug 2015 01:59:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.1.0
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
X-IsSubscribed: yes
X-SW-Source: 2015-q3/txt/msg00025.txt.bz2


I recently changed the various vector classes to use an index mapper:
Every vector has two fields: a raw Java array buffer, as well as
an IntSequence mapper which maps vector indexes to buffer indexes.
This allows slices, rotations, permutations, etc, using the
same same buffer:
   (VEC I) => index I in VEC ; function-call notation for indexeing
   (set! (VEC i) VALUE) - update value
   (VEC [i <: J]) => slice of VEC ; same buffer, new index mapper
   (VEC [i0 i1 ... in]) ==> [(VEC i0) (VEC i1) ... (VEC in)]
       - but re-using the VEC buffer with a new index mapper
   (set! (VEC [i <: J]) VALUE) - replace elements (possible re-size)

Kawa also supports multi-dimensional arrays, based on SRFI-25, and you
can use the same function-call notation:
   (ARR X Y)
   (set! (ARR X Y) VALUE)
SRFI-25 (and Kawa) has functions for slices, transpose, rotations, etc.
However, I'd like to have APL-style array-as-index for general arrays:
   (ARR [A <: B] Y [c <: D])
   (ARR [5 2 6] [2 3])
   (ARR IARR1 IARR2 IARR3)

The bulk of this note shows (tersely) the types, classes, and
basic representation of generalized arrays.  It focuses on the
differences from the version of gnu.lists in the Kawa trunk.
The actual Scheme API (in terms of functions) is a later discussion;
initially we will prioritize array construction plus indexing.

* Is this a feature you'd find exciting and/or use it?
* Would your friends or co-workers find it exciting and/or use it?
* I know computers and data structures.  I don't know matrixes or math, much.
   Would this approach cause problems if trying to to do high-performace
   array/matrix manipulations, or interfacing to other libraries?
* Would you be interested in helping out?


/*
Scheme types:

array[T] - implementationType: AbstractArray or ObjectArray
   array0[T] extends array[T]
   array1[T] extends array[T]
   ...

The arrayN types for various N are all "erased" to plain array in the JVM,
though it is stored in the class file using the kawa.SourceType annotation.
It means we know that the rank of the array N, which means we can
catch some errors (such as wrong number of arguments) and
maybe generate better code.

sequence[T] ..
vector[T] extends sequence[T], array1[T]
   vector[uint] == u32vector, etc
*/

// Equivalent Scheme type: array
public interface Array { // Existing
     public E get(int[] indexes);
     // Add the following, mainly for performance:
     public E get();
     public E get(int i);
     public E get(int i, int j);
     public E get(int i, int j, int... rest);
}

// Scheme: array[int]
public interface IntArray extends Array<Integer> {
     // These throw an exception if the rank() differs from argument count.
     public int intAt();
     public int intAt(int arg1);
     public int intAt(int arg1, int arg2);
     public int intAt(int arg1, int arg2, int arg3, int... argrest);
     public int intAt(int[] args);
     default public Integer get() { return Integer.valueOf(intAt()); }
     default public Integer get(int i) { return Integer.valueOf(intAt(i)); }
     // etc
}

// Scheme sequence[int]
public interface IntSequence
     extends IntArray
             // Possibly also extends Sequence<Integer>
{
     // It might be preferable for this inteface to not add any
     // methods beyond those of IntArray, to avoid some casts.
     // At least for commonly used method of SimpleVector.
     int size();
}

public abstract class IndirectIndexable<E>
     extends AbstractSequence<E>
             // No longer implements Sequence<E>
{
     protected IntArray indexes; // Changed from IntSequence.
     // Otherwise basically unchanged
}

public abstract class AbstractArray<E>
     // replaces GeneralArray
     extends IndirectIndexable<E>, implements Array<E>
{
     protected abstract E getBuffer(int index);
     public E get() { return getBuffer(indexes.getAt()); }
     public E get(int i) { return getBuffer(indexes.getAt(i)); }
     public E get(int i, int j) { return getBuffer(indexes.getAt(i,j)); }
     public E get(int i, int j, int k, int... rest) {
         return getBuffer(indexes.getAt(i, j, k, rest))); }
}

// Scheme: array[Object]
public class ObjectArray<E> extends AbstractArray<Object> {
     Object[] data;
     protected Object getBuffer(int j) { return (E) data[i]; }
     public class ObjectArray(Object[] data, IntArray indexes) {
         this.data = data;
         this.indexes = indexes;
     }
}

// Scheme: array[float] (maybe add alias: f32array)
public class F32Array extends AbstractArray<Float> {
     float[] data;
     public floatAt() { return data[indexes.intAt()]; }
     public floatAt(int i) { return data[indexes.intAt(i)]; }
     public floatAt(int i, int j) { return data[indexes.intAt(i, )]; }
     public float floatAt(int[] args) {
          int effIndex = indexes.intAt(args);
          return data[effIndex];
     }
     public final Float getBuffer(int index) {
         return Float.valueOf(data[index]);
     }
}
abstract class IntArray<E> extends AbstractArray<E> {
     int[] data;
     public intAt() { return data[indexes.intAt()]; }
     public intAt(int i) { return data[indexes.intAt(i)]; }
     public intAt(int i, int j) { return data[indexes.intAt(i, j)]; }
     public intAt(int i, int j, int k, int rest) {
         return data[indexes.intAt(i, j, k, rest)];
     }
     public float intAt(int[] args) {
         switch (args.length) {
         case 0: return intAt();
         case 1: return intAt(args[0]);
         case 2: return intAt(args[0], args[1]);
         default:
             int[] rest = args.length == 3 ? IntVector.empty
                 : Arrays.copyOfRange(args. 3, args.length-3);
             return intAt(args[0], args[1], arg2[2], rest);
         }
     }
}

// Scheme: array[int]
public class S32Array extends IntVector<Integer> implements IntArray {
     public final Integer getBuffer(int index) {
         return Integer.valueOf(data[index]);
     }
}

// Scheme: array[uint]
public class U32Array extends IntVector<UInt> {
     public final UInt getBuffer(int index) {
         return UInt.valueOf(data[index]);
     }
}

public interface SimpleVector<E> extends Array<E>, Sequence<E> {
     default public int rank() { return 1; }
}

// Scheme: vector[E] == array1[E]
public class FVector<E> extends ObjectArray<E> implement SimpleVector<E> {
     public FVector(Object[] data, IntSequence indexes) {
         super(data, indexes);
     }
}

// Scheme: f32vector == vector[float] == array1[float]
public class F32Vector extends F32Array implement SimpleVector<Float> {
}

// Scheme: s32vector == vector[int] == array1[int]
public class S32Vector extends U32Array implement SimpleVector<Integer> {
}

public class ArrayMapper implements IntArray {
     // The length of all these must match rank.
     int[] dimensions;
     int[] strides;
     int[] lowBounds;
     int offset;
     int rank;
     public void checkRank(int r) { if (rank!=r) throw new Exception(); }
     static int adjust(int offset, int low, int len, int stride, int index) {
         if (index < low || (index -= low) >= len)
           throw new IndexOutOfBoundsException();
         return offset + strides * index;
     }
     public intAt() {
         checkRank(0);
         return return offset;
     }
     public intAt(int i) {
         checkRank(1);
         return adjust(offset, lowBounds[0], dimensions[0], strides[0], i);
     }
     public intAt(int i, int j) {
         checkRank(2);
         int r = offset;
         r = adjust(r, lowBounds[0], dimensions[0], strides[0], i);
         r = adjust(r, lowBounds[1], dimensions[1], strides[1], j);
         return r;
     }
     public intAt(int i, int j, int k, int... rest) {
         int rlength = rest.length;
         checkRank(3+rlength);
         int r = offset;
         r = adjust(r, lowBounds[0], dimensions[0], strides[0], i);
         r = adjust(r, lowBounds[1], dimensions[1], strides[1], j);
         r = adjust(r, lowBounds[2], dimensions[2], strides[2], k);
         for (int d = 0;  d < rlength; d++)
             r = adjust(r, lowBounds[3+d], dimensions[3+d], strides[3+d],
                        rest[d]);
         return r;
     }
}

// Mightbe reasonable to have separate ArrayMapper1, ArrayMapper2,
// ArrayMapper3 classes as optimizations when rank is 1, 2 or 3.
public class ArrayMapper1 implements IntSequence {
     int lowBound;
     int dimension;
     int stride;
     int offset;
     public intAt(int i) {
         return ArrayMapper.adjust(offset, lowBound, dimension, strides, i);
     }
}

class Arrays {
     // Existing methods have to be re-written.
     public static Array make(Array shape, Object value) {
         ArrayMapper mapper = createFrom(shape);
         int total = maper.size(); // ???
         Object[] data= new Object[total];
         if (value != null) {
             for (int i = total;  --i >= 0; )
                 data[i] = value;
         }
         return new ObjectArray(data, mapper);
     }

     /** Used to implement (ARR I1 I2 ... In).
      * For example if ARR is an ObjectArray, we return
      * new ObjectArray(ARR.data, compose(ARR.indexes, I1, I2, ..., In)).
      * Note we want to optimize if all I1...In are ArrayMapper
      * (or convertible to ArrayMapper, such as scalar or Range),
      * since in that case the result can be an ArrayMapper.
      */
     public static IntArray compose(IntArray base, IntArray... indexes) {
         int n = indexes.length;
         int r = 0;
         for (int i = 0; i < n; i++)
             r += indexes.rank();
         final int rank = r;
         // This code is doing it lazily.
         // We should probably pre-calculate a S32Array instead
         return new IntArray() {
             public int rank() { return rank; }
             public int intAt(int[] args) {
                 checkRank(args.length);
                 int[] bargs = new int[base.rank()];
                 int sofar = 0;
                 for (int i = 0;  i < n; i++) {
                     IntArray ind = indexes[i];
                     int rank = ind.rank();
                     switch (irank) {
                     case 0:
                         bargs[i] = ind.intAt();
                         break;
                     case 1:
                         bargs[i] = ind.intAt(args[sofar]);
                         break;
                     case 2:
                         bargs[i] = ind.intAt(args[sofar], args[sofar+1]);
                         break;
                     case 3: // should also optimize this case
                     default:
                         int[] tmp = new int[ind.rank()];
                         for (int j = 0; j < ind.rank(); j++)
                             tmp[j] = args[sofar+j]
                                 bargs[i] = ind.intAt(tmp);
                     }
                     sofar += irank;
                 }
                 return base.intAt(bargs);
             }};
     }
}

-- 
	--Per Bothner
per@bothner.com   http://per.bothner.com/