Re: Sparse matrix extension

public inbox for gsl-discuss@sourceware.org
 help / color / mirror / Atom feed

From: Patrick Alken <alken@colorado.edu>
To: "gsl-discuss@sourceware.org" <gsl-discuss@sourceware.org>
Subject: Re: Sparse matrix extension
Date: Sat, 13 Feb 2016 19:42:00 -0000	[thread overview]
Message-ID: <56BF86FC.8010509@colorado.edu> (raw)
In-Reply-To: <CAMWWPT0a77RXNkg=a=Y8fJoxbYKpz-7tgvsXGPQNd4X_e=pNag@mail.gmail.com>

Ok I've added the MatrixMarket stuff to fprintf/fscanf. I also merged
all the new sparse matrix stuff into the master branch on git.

I've added almost your whole patch, except for all the stuff in
spmanip.c and also the gt_elements() routines in spprop.c. I'm not sure
these routines belong in a general purpose matrix library....although I
can possibly see a need for the row/col multiplication stuff, since we
can't easily make vector views of rows/columns of sparse matrices..

I'll need to think some more about this.

Also I haven't yet modified dgemm, because I'm hesitant to use the
dynamic allocation method in your patch. Also I found a sparse blas
document on netlib which tries to outline a standard and I realize our
sparse blas routines don't adhere to this standard...so I may try to
redesign everything to follow that document.

Anyway thanks a lot for all your work on this.

Patrick

On 02/12/2016 03:42 AM, Alexis Tantet wrote:
> I corrected _fscanf. There was an error when reading the comment header.
>
> On Wed, Feb 10, 2016 at 4:55 PM, Patrick Alken <alken@colorado.edu> wrote:
>> I'm in favor of simplicity and easy-parsing, so matrix market sounds good to
>> me. I'll take a look at your latest code in the next few days.
>>
>> Patrick
>>
>>
>> On 02/10/2016 06:16 AM, Alexis Tantet wrote:
>>> Hi Patrick,
>>>
>>> Regarding the file format for sparse matrices, the one I have coded
>>> actually happen to be the coordinate format implemented by Matrix
>>> Market (the platform to share test data such as sparse matrices), with
>>> the addition of a matrix type header:
>>> http://math.nist.gov/MatrixMarket/formats.html
>>>
>>> It is also written that "Harwell-Boeing" is the most commonly used
>>> sparse matrix format, but that:
>>> "Unfortunately the HB specification is somewhat complex difficult to
>>> parse from languages other than Fortran, biased in favor of compressed
>>> column representation and not easily extensible. Some of these factors
>>> may have impeded the more widespread use of the HB format"
>>>
>>> It seems to me that complying to the Matrix Market coordinate format
>>> would be the right choice, in terms of ease of implementation,
>>> compliance to other packages and user-friendliness. I could update the
>>> print/scan functions accordingly (mostly handling the header). What do
>>> you think?
>>>
>>> Best,
>>> Alexis
>>>
>>>
>>>
>>>
>>> On Mon, Feb 8, 2016 at 1:59 AM, Alexis Tantet <alexis.tantet@gmail.com>
>>> wrote:
>>>> Ok, my mistake, now I see where I got confused.
>>>> I had in mind to add all the elements first to the triplets and only
>>>> while converting to compressed sum up the duplicates.
>>>> While, indeed, if there's a way you can sum up the duplicates directly
>>>> while adding them to the triplet matrix (thanks to _ptr), this is more
>>>> handy and efficient.
>>>>
>>>> Thanks for the clarification,
>>>> Alexis
>>>>
>>>> On Sun, Feb 7, 2016 at 10:34 PM, Patrick Alken <alken@colorado.edu>
>>>> wrote:
>>>>> By design, gsl_spmatrix_set won't allow you to do this.
>>>>>
>>>>> If you add element (i, j, x) and then later to try add element (i, j,
>>>>> y), gsl_spmatrix_set will detect that there exists an element in the (i,
>>>>> j) spot and it will simply change x to y - the value of x will be
>>>>> overwritten by y. This is the same behavior as gsl_matrix_set.
>>>>>
>>>>> So no duplicates are allowed by design. If you have such an application
>>>>> where you want to keep track of duplicates, you could do the following:
>>>>>
>>>>> double *ptr = gsl_spmatrix_ptr(m, i, j);
>>>>> if (ptr)
>>>>>    *ptr += x; /* sum duplicate values */
>>>>> else
>>>>>     gsl_spmatrix_set(m, i, j, x);   /* initalize to x */
>>>>>
>>>>> On 02/07/2016 01:31 PM, Alexis Tantet wrote:
>>>>>> I'm not sure I got your last point. I have the following situation in
>>>>>> mind:
>>>>>>
>>>>>> Start to construct a transition matrix in triplet format, adding one
>>>>>> element after another.
>>>>>> In this particular example, each element is one count of a transition
>>>>>> from (state, box, etc.) i to j,
>>>>>> so I add elements  (i, j, 1) to the triplet object, with possibly
>>>>>> duplicates.
>>>>>> What happen to these duplicates in the binary tree?
>>>>>>
>>>>>> Eventually, when I compress to CRS or CCS, I would like the duplicates
>>>>>> to be summed up, so that element (i, j) counts transitions from i to j
>>>>>> (and no duplicates exist after compression).
>>>>>>
>>>>>> Is this more clear?
>>>>>>
>>>>>> On Sun, Feb 7, 2016 at 9:14 PM, Patrick Alken <alken@colorado.edu>
>>>>>> wrote:
>>>>>>> Hi Alexis,
>>>>>>>
>>>>>>>>> I'm not sure what you mean. I've added a new function
>>>>>>>>> gsl_spmatrix_ptr
>>>>>>>>> to the git, which as far as I can tell does exactly what your
>>>>>>>>> sum_duplicate flag does. It searches the matrix for an (i,j)
>>>>>>>>> element,
>>>>>>>>> and if found returns a pointer. If not found a null pointer is
>>>>>>>>> returned.
>>>>>>>>> This makes it easy for the user to modify A(i,j) after it has been
>>>>>>>>> added
>>>>>>>>> to the matrix. Are you thinking of something else? Can you point me
>>>>>>>>> to
>>>>>>>>> the Eigen routine?
>>>>>>>>>
>>>>>>>> What I meant is to have the equivalent of gsl_spmatrix_compress,
>>>>>>>> with the difference that gsl_spmatrix_ptr is used instead of
>>>>>>>> gsl_spmatrix_set,
>>>>>>>> so has to build the compressed matrix from triplets, summing the
>>>>>>>> duplicates, instead of replacing them.
>>>>>>>> This is what is done here :
>>>>>>>> The
>>>>>>>> http://eigen.tuxfamily.org/dox/classEigen_1_1SparseMatrix.html#a5bcf3187e372ff7cea1e8f61152ae49b
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Alexis
>>>>>>> I'm not sure why a user would ever need to do this. The whole point of
>>>>>>> the binary tree structure in the triplet storage is to efficiently
>>>>>>> find
>>>>>>> duplicate entries, so that if a user tries to call gsl_spmatrix_set on
>>>>>>> an element which is already been previously set, it can find that
>>>>>>> element with a binary search (rather than linearly searching the
>>>>>>> arrays)
>>>>>>> and change the value of that element.
>>>>>>>
>>>>>>> Therefore, the way the triplet storage is designed, there is will
>>>>>>> never
>>>>>>> be a duplicate element in the triplet arrays. All of the (i[n],j[n])
>>>>>>> will be unique for each n <= nz.
>>>>>>>
>>>>>>> Am I missing something?
>>>>>>>
>>>>>>> Patrick
>>>>>>
>>>>
>>>> --
>>>> Alexis Tantet
>>>
>>>
>
>

next prev parent reply	other threads:[~2016-02-13 19:42 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CAMWWPT3uJj4Vrn7ut6+F18gY===zd6+1r1UJhz0hcCj--zwtdg@mail.gmail.com>
2016-01-19 16:43 ` Alexis Tantet
2016-01-19 17:02   ` Patrick Alken
2016-01-19 19:55     ` Alexis Tantet
2016-01-19 20:50       ` Patrick Alken
2016-01-20 18:31         ` Alexis Tantet
2016-02-07  0:03           ` Patrick Alken
2016-02-07  1:03             ` Alexis Tantet
2016-02-07 17:25               ` Patrick Alken
2016-02-07 19:32                 ` Alexis Tantet
2016-02-07 20:14                   ` Patrick Alken
2016-02-07 20:31                     ` Alexis Tantet
2016-02-07 21:34                       ` Patrick Alken
2016-02-08  0:59                         ` Alexis Tantet
2016-02-10 13:16                           ` Alexis Tantet
2016-02-10 13:48                             ` Alexis Tantet
2016-02-10 15:56                             ` Patrick Alken
2016-02-12 10:43                               ` Alexis Tantet
2016-02-13 19:42                                 ` Patrick Alken [this message]
2016-02-14 11:06                                   ` Alexis Tantet
2016-02-14 18:11                                     ` Patrick Alken
2016-02-14 18:25                                       ` Brian Gladman
2016-02-15  5:10                                         ` Patrick Alken
2016-02-15 11:09                                           ` Brian Gladman
     [not found]                                             ` <CAMWWPT0J9ENRZjJHLO=cxot4DGdSLer+n2HkBVnhFnO0oiVV8g@mail.gmail.com>
2016-02-15 13:56                                               ` Alexis Tantet
2016-02-15 17:17                                                 ` Patrick Alken

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=56BF86FC.8010509@colorado.edu \
    --to=alken@colorado.edu \
    --cc=gsl-discuss@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).