From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gsl-discuss-return-5914-listarch-gsl-discuss=sources.redhat.com@sourceware.org>
Received: (qmail 97553 invoked by alias); 7 Feb 2016 19:32:31 -0000
Mailing-List: contact gsl-discuss-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gsl-discuss.sourceware.org>
List-Subscribe: <mailto:gsl-discuss-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gsl-discuss/>
List-Post: <mailto:gsl-discuss@sourceware.org>
List-Help: <mailto:gsl-discuss-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gsl-discuss-owner@sourceware.org
Received: (qmail 97532 invoked by uid 89); 7 Feb 2016 19:32:30 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.0 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2 spammy=dox
X-HELO: mail-ig0-f171.google.com
Received: from mail-ig0-f171.google.com (HELO mail-ig0-f171.google.com) (209.85.213.171)
 by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Sun, 07 Feb 2016 19:32:28 +0000
Received: by mail-ig0-f171.google.com with SMTP id 5so45929215igt.0
        for <gsl-discuss@sourceware.org>; Sun, 07 Feb 2016 11:32:27 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20130820;
        h=x-gm-message-state:mime-version:in-reply-to:references:date
         :message-id:subject:from:to:cc:content-type;
        bh=flga1D4n5OKlDU3VKZfJdMfkkIbQ6hHUf/7/jy9f6FM=;
        b=FPN1tPbcwAk4WoYrhckTjQFC00qW2dTN7A1gcZLYcYZM9TExPuFYr51Szi60YOJ3PY
         YbZgM0x/w80tG1UkpZ4A70NX5V1av9r+51mB6q5C+HPAYY7J5nazZHALkjK4UGLlOVrQ
         UK2iO0oOmxMFmhf8dCJ3qszjGw69LR/5u2pflS9wvKwIeB+nuD5yBxorMjnSJlX6g1yi
         DKHOLiBJRMtReRo/8wtLJfErUPB21mh8a3Vxyb89uwNIqNValJIWH+w3lB3zFZOTJFS2
         1FGLK1S4IBzYaMayLr4WcWq0I7Bszm1iRd5ZVvau8KT6mnXLyfxt99e0uxYu8Uhpzzcn
         eG2Q==
X-Gm-Message-State: AG10YOQiOfcDXLARHeoooP1Y74YZcY7c/5MNk67mkxhL9rQktldSEquL+9VDzzoYCG+n3TdHc+mdnwomn9ixjQ==
MIME-Version: 1.0
X-Received: by 10.50.6.104 with SMTP id z8mr20673263igz.58.1454873546136; Sun,
 07 Feb 2016 11:32:26 -0800 (PST)
Received: by 10.79.118.141 with HTTP; Sun, 7 Feb 2016 11:32:26 -0800 (PST)
In-Reply-To: <56B77E13.1000306@colorado.edu>
References: <CAMWWPT3uJj4Vrn7ut6+F18gY===zd6+1r1UJhz0hcCj--zwtdg@mail.gmail.com>
	<CAMWWPT3Y=x-vYphaV+2gHPb9WZqEfYdDxs4T0KqBq987VjvDQA@mail.gmail.com>
	<569E6C33.1090505@colorado.edu>
	<CAMWWPT2kMe=R0qUz4P6n_xzG+oW-8pOFwCtBuaritZ5MF6jzAg@mail.gmail.com>
	<569EA1A9.2080101@colorado.edu>
	<CAMWWPT2d=PnK1cexcZ+OLQAYJudRk9QLGac-Wcn33OaES5UjJA@mail.gmail.com>
	<56B689B1.5090005@colorado.edu>
	<CAMWWPT1mog0HrviL1tMo=f-rrSc2PhamWDGg7ZYLrVnnqkY3ng@mail.gmail.com>
	<56B77E13.1000306@colorado.edu>
Date: Sun, 07 Feb 2016 19:32:00 -0000
Message-ID: <CAMWWPT2KTLjufNdwW=y-xaJ1dKkxDNQwgat4DB4ZwKDrK_fZSA@mail.gmail.com>
Subject: Re: Sparse matrix extension
From: Alexis Tantet <alexis.tantet@gmail.com>
To: Patrick Alken <alken@colorado.edu>
Cc: "gsl-discuss@sourceware.org" <gsl-discuss@sourceware.org>
Content-Type: text/plain; charset=UTF-8
X-SW-Source: 2016-q1/txt/msg00008.txt.bz2

Hi Patrick,

On Sun, Feb 7, 2016 at 6:25 PM, Patrick Alken <alken@colorado.edu> wrote:
> On 02/06/2016 06:03 PM, Alexis Tantet wrote:
>> Hi Patrick,
>>
>> Thanks for working on it. Following, some comments:
>>
>> On Sun, Feb 7, 2016 at 1:02 AM, Patrick Alken <alken@colorado.edu> wrote:
>>> Hello,
>>>
>>>   I've been working my way through your patch. There is a new branch on
>>> the git called 'sparse' with all the current changes. You've done a lot
>>> of work on your patch so its taking me a while to get through all of it.
>>>
>>>   The main functionality of CRS is in there, along with basic stuff like
>>> memcpy, transpose_memcpy, transpose. Also dgemv works (and hence also
>>> the GMRES linear solver).
>> Since the GMRES is already implemented, do you think it would be worth
>> using the Arnoldi method to find eigenvalues and eigenvectors of
>> sparse non-hermitian matrices?
>
> If you think you can make a nice implementation, go for it. We could add
> it to the gsl_eigen interface.
>

The basic elements of the Arnoldi iteration and the eigenvalue solver
for dense non-hermitian matrices are there.
The question is then which restarting method to use, e.g. the
implicite restart of ARPACK.
I'll see what I can do.

>>
>>> I did some initial work on the fprintf/fscanf
>>> for the triplet case - for the compressed matrices, I am wondering if we
>>> should stick to a standard format such as Harwell-Boeing, I'm not sure
>>> at the moment. We might be able to avoid this issue by making a binary
>>> fwrite/fread interface which just directly writes the 3 arrays to disk.
>> I agree that a standard format would be best but I am not aware of any
>> dominant format.
>> Is Harwell-Boeing one of them?
>> It seems important to me to have both binary and formatted input/output
>> (for coherence with gsl_matrix and because formatted files are more
>> handy to share).
>> Also, it is important to have some header indicating how many elements
>> should be read
>> and what is the dimension of the matrix.
>
> I'm not aware of a standard format either, I'll try and look into this
> further to come up with a good solution.
>
>>
>>> Things I still need to add for the CRS storage:
>>>
>>> 1. spmatrix_add
>>> 2. spblas_dgemm
>>> 3. fprintf/fscanf
>>> 4. transpose - for now I'll keep it the way you coded it, but I think it
>>> would be better to keep the matrix in the same storage format during a
>>> transpose, ie: the user might want to pass A^T to an external solver
>>> library which requires CCS and they might not expect the transpose
>>> routine to change it to CRS. I did a quick search for in-place transpose
>>> algorithms in CCS/CRS and they do appear to exist though I haven't had
>>> time to implement it yet.
>> Fine, but then I think it is important to add a separate function, say
>> "transpose_switch_major",
>> for which the transposition is performed by just changing the major
>> (i.e. row->col or col->row)
>> since this is the cheapest way to transpose while some libraries may
>> work equally well with CRS or CCS.
>
> Ok I like this idea too since it is very efficient.
>
>>
>>> I removed the inner/outer idx stuff from gsl_spmatrix - its not
>>> necessary and also I'd like to avoid modifying the gsl_spmatrix struct
>>> since any changes would break binary compatibility for the next release.
>> That makes perfect sense, but then you need to test each time for the
>> major to know what is the size of the pointer?
>
> I am just making separate routines for CCS and CRS (ie: gsl_spmatrix_ccs
> and gsl_spmatrix_crs to do triplet -> compressed). I think its simpler
> this way.
>
>>
>>> Also your modification to gsl_spmatrix_set - I see what you're trying to
>>> do but I think its better to add a new function:
>>>
>>> double *gsl_spmatrix_ptr(gsl_spmatrix *m, size_t i, size_t j)
>>>
>>> which returns a pointer to the data array for element (i,j), and the
>>> user can then add whatever value they want (this is how its done with
>>> gsl_matrix and gsl_vector). I think it should be straightforward to add
>>> this function, possibly even for the compressed storage too.
>> Then I would also add a compressed matrix constructor (or an option) in which
>> gsl_spmatrix_ptr is used to sum duplicate entries instead of gsl_spmatrix_set.
>> This possibility is offered by some other libraries such as Eigen
>> and is for example useful to build matrices counting (e.g. transitions).
>
> I'm not sure what you mean. I've added a new function gsl_spmatrix_ptr
> to the git, which as far as I can tell does exactly what your
> sum_duplicate flag does. It searches the matrix for an (i,j) element,
> and if found returns a pointer. If not found a null pointer is returned.
> This makes it easy for the user to modify A(i,j) after it has been added
> to the matrix. Are you thinking of something else? Can you point me to
> the Eigen routine?
>

What I meant is to have the equivalent of gsl_spmatrix_compress,
with the difference that gsl_spmatrix_ptr is used instead of gsl_spmatrix_set,
so has to build the compressed matrix from triplets, summing the
duplicates, instead of replacing them.
This is what is done here :
The http://eigen.tuxfamily.org/dox/classEigen_1_1SparseMatrix.html#a5bcf3187e372ff7cea1e8f61152ae49b

Best,
Alexis

>>
>>> Anyway, just wanted to let you know things are progressing (though a
>>> little slowly).
>> Thanks a lot! Would you like me to work on something in particular?
>>
>> Best,
>> Alexis
>>
>>> Patrick
>>>
>>> On 01/20/2016 11:31 AM, Alexis Tantet wrote:
>>>> Hi Patrick,
>>>>
>>>> I've cloned gsl.git, made my modifications to spmatrix/ and spblas/
>>>> and pushed them to master at:
>>>> https://github.com/atantet/gsl
>>>>
>>>> This time, I've tried to modify the code as little as possible. I've
>>>> usually added comments labelled "AT" with each modification.
>>>> I also have a few questions about the code which I've labelled "?AT".
>>>> Finally, I've added the corresponding tests to both test.c (successful
>>>> on my machine).
>>>>
>>>> I see that duplicates are handled for the triplet format by replacing.
>>>> Thus, I removed the function taking care of sorting and summing
>>>> duplicates that I had added to _compress (the trick was to transpose
>>>> with copy to reorder first, transpose back and remove the duplicates
>>>> then, taken from Eigen C++). Instead, I have added an option to sum
>>>> the duplicates (useful e.g. when counting links of a graph or
>>>> transitions of a Markov chain). One may need the third option of
>>>> repeating the triplets, but then avl_insert should also be modified.
>>>>
>>>> However, if I've understood well, no sorting of the inner indices is
>>>> performed right now. This could be problematic in the future (e.g. in
>>>> functions such as _equal).
>>>>
>>>> For _spdgemm for CRS matrices, I've used the trick (A B)^t = B^t A^t
>>>> to convert the problem to CSC. I've tested it successfully but you may
>>>> not like the coding style.
>>>>
>>>> I hope that helps,
>>>> Best,
>>>> Alexis
>>>>
>>>> On Tue, Jan 19, 2016 at 9:50 PM, Patrick Alken <alken@colorado.edu> wrote:
>>>>> On 01/19/2016 12:55 PM, Alexis Tantet wrote:
>>>>>> Hi Patrick,
>>>>>>
>>>>>> Thanks for the quick reply! I wanted to be sure that this contribution
>>>>>> is useful before to spend time on the merging with the latest version.
>>>>>> I will create the gsl.git repository and work on it during the week.
>>>>>>
>>>>>> I had already had a look at the documentation but did not know about
>>>>>> the iterative solvers (a link between each modules would be useful).
>>>>>> My contribution indeed fits in the sparse matrix module + the update
>>>>>> of the dgemm and dgemv functions to support CRS (an update may also be
>>>>>> needed for the solvers).
>>>>> The solvers only call dgemv (as far as I remember) so they shouldn't need an
>>>>> update once dgemv is updated.
>>>>>
>>>>>> I have also developed a simple C++ object allowing to use gsl_spmatrix
>>>>>> as a user-defined matrix in ARPACK++ (a maintained fork of ARPACK++
>>>>>> can be found at https://github.com/m-reuter/arpackpp), allowing to
>>>>>> avoid having to use other libraries such as superLU. It could be
>>>>>> useful to others, maybe as an extension. Now that I think about it,
>>>>>> the iterative solvers could also be used to support the shift and
>>>>>> invert modes (see ARPACK++ documentation). What do you think (I could
>>>>>> work on it)?
>>>>> I've never used ARPACK, but if you want to make an extension to interface
>>>>> GSL/ARPACK its fine with me - such an extension would never be added to the
>>>>> main GSL code since GSL tries to be as standalone as possible.
>>>>>
>>>>>
>>>>>> If you have major comments, the sooner the better, so that I can work
>>>>>> on them while merging.
>>>>>>
>>>>>> Thank you for your interest,
>>>>>> Alexis
>>>>>>
>>>>>> On Tue, Jan 19, 2016 at 6:02 PM, Patrick Alken <alken@colorado.edu> wrote:
>>>>>>> Hi Alexis,
>>>>>>>
>>>>>>>    This looks like very good work! Adding compressed row storage has been
>>>>>>> on
>>>>>>> my todo list for a while. The 'gslsp' extension is unfortunately very out
>>>>>>> of
>>>>>>> date, and the current git contains newer code (including a GMRES
>>>>>>> iterative
>>>>>>> linear solver). I removed the gslsp extension from the web page a while
>>>>>>> back
>>>>>>> to reflect this. You can browse the latest manual to see the current
>>>>>>> sparse
>>>>>>> matrix capabilities (http://www.gnu.org/software/gsl/manual/gsl-ref.pdf)
>>>>>>> -
>>>>>>> there are 3 chapters: sparse matrices, sparse blas and sparse linear
>>>>>>> algebra
>>>>>>> - it looks like your contributions will fit into the sparse matrices
>>>>>>> chapter.
>>>>>>>
>>>>>>>    Would you be able to verify that your changes are compatible with the
>>>>>>> current gsl.git repository? This will make it much easier for me to merge
>>>>>>> everything into the git when ready. It would be best if you made a new
>>>>>>> branch of gsl.git, and add your changes so I can then pull them from
>>>>>>> github
>>>>>>> or somewhere. I will try to find some time in the next few days to look
>>>>>>> over
>>>>>>> your code.
>>>>>>>
>>>>>>> Thanks again,
>>>>>>> Patrick
>>>>>>>
>>>>>>>
>>>>>>> On 01/19/2016 09:43 AM, Alexis Tantet wrote:
>>>>>>>> Dear GSLers,
>>>>>>>>
>>>>>>>> As a scientist rather than a developer, I have developed an extension
>>>>>>>> of the sparse matrix module (CRS, I/O, manipulation, see below), which
>>>>>>>> I have tested. These modifications conserve the structure of the
>>>>>>>> original module and be useful for a large number of sparse matrices
>>>>>>>> users.
>>>>>>>>
>>>>>>>> I'm not familiar with the contributing process here. My repository can
>>>>>>>> be found there:
>>>>>>>> https://github.com/atantet/gslsp
>>>>>>>> Unfortunately, I did not know of the gsl.git repository and I forked it
>>>>>>>> froml:
>>>>>>>> https://github.com/drjerry/gslsp ,
>>>>>>>> which seems to be a bit older than gsl.git.
>>>>>>>>
>>>>>>>> How can I push/merge to gsl.git ? Should it be as an update or another
>>>>>>>> extension? Is it necessary to adapt to the newest version of the code
>>>>>>>> ?
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Alexis Tantet
>>>>>>>>
>>>>>>>> CHANGES.md:
>>>>>>>>
>>>>>>>> Extension of the sparse matrix module of GSL
>>>>>>>>
>>>>>>>> ===================================
>>>>>>>>
>>>>>>>> Introduction
>>>>>>>> ------------
>>>>>>>>
>>>>>>>> Usages of sparse matrices are numerous in scientific computing.
>>>>>>>> When numerical linear algebra problems become large, sparse
>>>>>>>> matrices become necessary to avoid memory overload and unnecessary
>>>>>>>> computations, at the cost of element access and matrix construction.
>>>>>>>> As a result, most large scale linear solvers or eigen solvers perform
>>>>>>>> on sparse matrices.
>>>>>>>>
>>>>>>>> Fortunately, a very useful sparse matrix module has recently been
>>>>>>>> introduced to GSL.
>>>>>>>> However, important features are still lacking, such has
>>>>>>>> Compressed Row Storage (CRS) matrices, input/output functions and
>>>>>>>> other matrix properties and manipulation functions.
>>>>>>>> This new version attempts to address this, conserving the original
>>>>>>>> structure of the module and conventions.
>>>>>>>>
>>>>>>>> Major changes
>>>>>>>> -------------
>>>>>>>>
>>>>>>>> * Add CRS format and update functions manipulating compressed matrices :
>>>>>>>>        - additional flag GSL_SPMATRIX_CRS and macro GSLSP_ISMATRIX (
>>>>>>>> gsl_spmatrix.h )
>>>>>>>>        - additional members innerSize and outerSize used to iterate
>>>>>>>> matrix elements ( gsl_spmatrix.h )
>>>>>>>>        - rename some variables for coherence ( gsl_spmatrix.h , *.c )
>>>>>>>>        - update all functions on compressed matrices ( *.c )
>>>>>>>> * Allow to sum duplicate elements when compressing ( spcompress.c ) :
>>>>>>>>        - modify gsl_spmatrix_compress
>>>>>>>>        - add    gsl_spmatrix_sum_duplicate
>>>>>>>> * CCS <-> CRS and fast transpose inplace in spswap.c :
>>>>>>>>        - add gsl_spmatrix_switch_major
>>>>>>>>        - add gsl_spmatrix_transpose
>>>>>>>> * Add printing and scanning functions in spio.c :
>>>>>>>>        - add gsl_spmatrix_fprintf
>>>>>>>>        - add gsl_spmatrix_fscanf
>>>>>>>> * Add manipulation functions in spmanip.c (particularly useful for
>>>>>>>> Markov chain transition matrices) :
>>>>>>>>        - add gsl_spmatrix_get_rowsum : get vector of sum over row
>>>>>>>> elements
>>>>>>>>        - add gsl_spmatrix_get_colsum : get vector of sum over column
>>>>>>>> elements
>>>>>>>>        - add gsl_spmatrix_div_rows   : divide all elements of each row
>>>>>>>> by a vector element
>>>>>>>>        - add gsl_spmatrix_div_cols   : divide all elements of each
>>>>>>>> column by a vector element
>>>>>>>> * Add test functions in atprop.c :
>>>>>>>>        - add gsl_spmatrix_gt_elements : greater than test for each
>>>>>>>> matrix
>>>>>>>> element
>>>>>>>>        - add gsl_spmatrix_ge_elements : greater or equal than test for
>>>>>>>> each matrix element
>>>>>>>>        - add gsl_spmatrix_lt_elements : lower than test for each matrix
>>>>>>>> element
>>>>>>>>        - add gsl_spmatrix_le_elements : lower or equal than test for
>>>>>>>> each matrix element
>>>>>>>>        - add gsl_spmatrix_any         : test if any non-zero element in
>>>>>>>> matrix
>>>>>>>>
>>>>>>>> Other minor changes have been made, such as error tests.
>>>>>>>> test.c has also been updated to test new features.
>>>>
>>
>>
>


-- 
Alexis Tantet