From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 74376 invoked by alias); 12 Feb 2016 10:43:05 -0000 Mailing-List: contact gsl-discuss-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gsl-discuss-owner@sourceware.org Received: (qmail 74238 invoked by uid 89); 12 Feb 2016 10:43:04 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2 spammy=market X-HELO: mail-ig0-f177.google.com Received: from mail-ig0-f177.google.com (HELO mail-ig0-f177.google.com) (209.85.213.177) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Fri, 12 Feb 2016 10:43:01 +0000 Received: by mail-ig0-f177.google.com with SMTP id hb3so7112302igb.0 for ; Fri, 12 Feb 2016 02:43:01 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=t+krhRXVpD2wNmdv9lUbxVgXhv/5pJg90qdwnI4BRn0=; b=Wo4dPY5HgoVHStmVRUFvWQDx6XTy8CTUwUSlM7g3l/0r4yOpuOSVGrauHhwPc14WEW Q08JLRpDeeTAVbTUaHHf5fDr2MhYyTdHywyiGqEu64If20FPDdZ9RKPXZmoAb2ZJNhmB 4nwnUZnK+NTfTL9cS/YoA3onlx5p6oc2vhL3N19VfH81beiv5AdjEP7ly+tqVOw6ItJm Lud/T+89IifF49wDzYyCR4CnWzMJW5P18r3MkfeeqJaYAxvbmThRZ1qVtgX6s55QKlHQ K0LFdCzHoktSeWqqa/juzIKWLCoJY1tKvEjaC2iZWdXbl+uormYQnE7dkxVm9IEuMljT DQyg== X-Gm-Message-State: AG10YOTxdhJ1jEHRMv9AXe49yDlR4MizRMV3hz7CzUCYFkmzuI2aNUYkp9zHu+pykY4W66qwW3Z2+oBjL9Ok7A== MIME-Version: 1.0 X-Received: by 10.50.79.196 with SMTP id l4mr2233119igx.11.1455273779065; Fri, 12 Feb 2016 02:42:59 -0800 (PST) Received: by 10.79.118.141 with HTTP; Fri, 12 Feb 2016 02:42:58 -0800 (PST) In-Reply-To: <56BB5D84.20804@colorado.edu> References: <569E6C33.1090505@colorado.edu> <569EA1A9.2080101@colorado.edu> <56B689B1.5090005@colorado.edu> <56B77E13.1000306@colorado.edu> <56B7A59D.5040707@colorado.edu> <56B7B85C.10508@colorado.edu> <56BB5D84.20804@colorado.edu> Date: Fri, 12 Feb 2016 10:43:00 -0000 Message-ID: Subject: Re: Sparse matrix extension From: Alexis Tantet To: Patrick Alken Cc: "gsl-discuss@sourceware.org" Content-Type: text/plain; charset=UTF-8 X-SW-Source: 2016-q1/txt/msg00016.txt.bz2 I corrected _fscanf. There was an error when reading the comment header. On Wed, Feb 10, 2016 at 4:55 PM, Patrick Alken wrote: > I'm in favor of simplicity and easy-parsing, so matrix market sounds good to > me. I'll take a look at your latest code in the next few days. > > Patrick > > > On 02/10/2016 06:16 AM, Alexis Tantet wrote: >> >> Hi Patrick, >> >> Regarding the file format for sparse matrices, the one I have coded >> actually happen to be the coordinate format implemented by Matrix >> Market (the platform to share test data such as sparse matrices), with >> the addition of a matrix type header: >> http://math.nist.gov/MatrixMarket/formats.html >> >> It is also written that "Harwell-Boeing" is the most commonly used >> sparse matrix format, but that: >> "Unfortunately the HB specification is somewhat complex difficult to >> parse from languages other than Fortran, biased in favor of compressed >> column representation and not easily extensible. Some of these factors >> may have impeded the more widespread use of the HB format" >> >> It seems to me that complying to the Matrix Market coordinate format >> would be the right choice, in terms of ease of implementation, >> compliance to other packages and user-friendliness. I could update the >> print/scan functions accordingly (mostly handling the header). What do >> you think? >> >> Best, >> Alexis >> >> >> >> >> On Mon, Feb 8, 2016 at 1:59 AM, Alexis Tantet >> wrote: >>> >>> Ok, my mistake, now I see where I got confused. >>> I had in mind to add all the elements first to the triplets and only >>> while converting to compressed sum up the duplicates. >>> While, indeed, if there's a way you can sum up the duplicates directly >>> while adding them to the triplet matrix (thanks to _ptr), this is more >>> handy and efficient. >>> >>> Thanks for the clarification, >>> Alexis >>> >>> On Sun, Feb 7, 2016 at 10:34 PM, Patrick Alken >>> wrote: >>>> >>>> By design, gsl_spmatrix_set won't allow you to do this. >>>> >>>> If you add element (i, j, x) and then later to try add element (i, j, >>>> y), gsl_spmatrix_set will detect that there exists an element in the (i, >>>> j) spot and it will simply change x to y - the value of x will be >>>> overwritten by y. This is the same behavior as gsl_matrix_set. >>>> >>>> So no duplicates are allowed by design. If you have such an application >>>> where you want to keep track of duplicates, you could do the following: >>>> >>>> double *ptr = gsl_spmatrix_ptr(m, i, j); >>>> if (ptr) >>>> *ptr += x; /* sum duplicate values */ >>>> else >>>> gsl_spmatrix_set(m, i, j, x); /* initalize to x */ >>>> >>>> On 02/07/2016 01:31 PM, Alexis Tantet wrote: >>>>> >>>>> I'm not sure I got your last point. I have the following situation in >>>>> mind: >>>>> >>>>> Start to construct a transition matrix in triplet format, adding one >>>>> element after another. >>>>> In this particular example, each element is one count of a transition >>>>> from (state, box, etc.) i to j, >>>>> so I add elements (i, j, 1) to the triplet object, with possibly >>>>> duplicates. >>>>> What happen to these duplicates in the binary tree? >>>>> >>>>> Eventually, when I compress to CRS or CCS, I would like the duplicates >>>>> to be summed up, so that element (i, j) counts transitions from i to j >>>>> (and no duplicates exist after compression). >>>>> >>>>> Is this more clear? >>>>> >>>>> On Sun, Feb 7, 2016 at 9:14 PM, Patrick Alken >>>>> wrote: >>>>>> >>>>>> Hi Alexis, >>>>>> >>>>>>>> I'm not sure what you mean. I've added a new function >>>>>>>> gsl_spmatrix_ptr >>>>>>>> to the git, which as far as I can tell does exactly what your >>>>>>>> sum_duplicate flag does. It searches the matrix for an (i,j) >>>>>>>> element, >>>>>>>> and if found returns a pointer. If not found a null pointer is >>>>>>>> returned. >>>>>>>> This makes it easy for the user to modify A(i,j) after it has been >>>>>>>> added >>>>>>>> to the matrix. Are you thinking of something else? Can you point me >>>>>>>> to >>>>>>>> the Eigen routine? >>>>>>>> >>>>>>> What I meant is to have the equivalent of gsl_spmatrix_compress, >>>>>>> with the difference that gsl_spmatrix_ptr is used instead of >>>>>>> gsl_spmatrix_set, >>>>>>> so has to build the compressed matrix from triplets, summing the >>>>>>> duplicates, instead of replacing them. >>>>>>> This is what is done here : >>>>>>> The >>>>>>> http://eigen.tuxfamily.org/dox/classEigen_1_1SparseMatrix.html#a5bcf3187e372ff7cea1e8f61152ae49b >>>>>>> >>>>>>> Best, >>>>>>> Alexis >>>>>> >>>>>> I'm not sure why a user would ever need to do this. The whole point of >>>>>> the binary tree structure in the triplet storage is to efficiently >>>>>> find >>>>>> duplicate entries, so that if a user tries to call gsl_spmatrix_set on >>>>>> an element which is already been previously set, it can find that >>>>>> element with a binary search (rather than linearly searching the >>>>>> arrays) >>>>>> and change the value of that element. >>>>>> >>>>>> Therefore, the way the triplet storage is designed, there is will >>>>>> never >>>>>> be a duplicate element in the triplet arrays. All of the (i[n],j[n]) >>>>>> will be unique for each n <= nz. >>>>>> >>>>>> Am I missing something? >>>>>> >>>>>> Patrick >>>>> >>>>> >>> >>> >>> -- >>> Alexis Tantet >> >> >> > -- Alexis Tantet