From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 24093 invoked by alias); 10 Feb 2016 15:56:01 -0000 Mailing-List: contact gsl-discuss-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gsl-discuss-owner@sourceware.org Received: (qmail 24071 invoked by uid 89); 10 Feb 2016 15:56:01 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.9 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_NONE,SPF_HELO_PASS,SPF_PASS autolearn=ham version=3.3.2 spammy=HX-Microsoft-Antispam-PRVS:sk:CY1PR03, market, H*r:sk:na01-bl, H*r:65.55.169 X-HELO: na01-bl2-obe.outbound.protection.outlook.com Received: from mail-bl2on0126.outbound.protection.outlook.com (HELO na01-bl2-obe.outbound.protection.outlook.com) (65.55.169.126) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-SHA256 encrypted) ESMTPS; Wed, 10 Feb 2016 15:55:58 +0000 Authentication-Results: sourceware.org; dkim=none (message not signed) header.d=none;sourceware.org; dmarc=none action=none header.from=colorado.edu; Received: from palken-co-ll.ngdc.noaa.gov (140.172.179.43) by CY1PR0301MB0908.namprd03.prod.outlook.com (10.160.165.19) with Microsoft SMTP Server (TLS) id 15.1.403.16; Wed, 10 Feb 2016 15:55:54 +0000 Subject: Re: Sparse matrix extension To: Alexis Tantet References: <569E6C33.1090505@colorado.edu> <569EA1A9.2080101@colorado.edu> <56B689B1.5090005@colorado.edu> <56B77E13.1000306@colorado.edu> <56B7A59D.5040707@colorado.edu> <56B7B85C.10508@colorado.edu> CC: "gsl-discuss@sourceware.org" From: Patrick Alken Message-ID: <56BB5D84.20804@colorado.edu> Date: Wed, 10 Feb 2016 15:56:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset="utf-8"; format=flowed Content-Transfer-Encoding: 7bit X-ClientProxiedBy: CY1PR0201CA0012.namprd02.prod.outlook.com (25.163.30.150) To CY1PR0301MB0908.namprd03.prod.outlook.com (25.160.165.19) X-MS-Office365-Filtering-Correlation-Id: 52de0147-c9ff-4140-5729-08d33232a857 X-Microsoft-Exchange-Diagnostics: 1;CY1PR0301MB0908;2:PQ+2LJwIfijmJnMP6E44N8b5+r1gXr9vp/0hzlLEbrVTjdGdqvSTC3r0yWOUmeB+8xc/XwKxRCEBVk6wJLRFClG66KfPkWzFNyFMVKlaDinD/JZokZ2MOPcf5myG63qD++yfxw7phrhvnPUmwebhrYXTMmzfOTXZjQuHwOxTUWfZ2OC0u1XzMyOXCmdoQVh3;3:jzzs/k+/T4acYxiNF59suDqe1+Fq9huKkevPInAmzRHXzcg6iqEskQvIM1mU52YSir/qu2rL3GpRiElLkEiov2iXb+sCL2Mp0blZ/9XYWbixuM+akKmwNrdPeNAeL/Ld;25:UBBJjj9hTrPwOPRcGs1MAe58cx2dhDeEwa1koj86Tu7ojU6m/RMRn6PuG7ENWf6Yko0DJjvCwFWXzscMvID2ICODbIeZoFBCuPEVrfBHh8e70bDTjCHevj7HK8KSjRoW81EfFcItA6ydRker9Bx0Zspw1AX3EkUDus006/Sc3e/ILfSzN6CNIfM2Y5llF3i71Dc+xgO50js17RVvxjGYieIx+b3p2MWGU1rKCwykHSCYuhv2+YsOtLsgpws0ZW7Zy7jUyz7W9up8PiRL0fh4kbgyXzAQoLQV9emHw99qmhseh+i2URZOB/qAtyblvtFs X-Microsoft-Antispam: UriScan:;BCL:0;PCL:0;RULEID:;SRVR:CY1PR0301MB0908; X-Microsoft-Antispam-PRVS: X-Exchange-Antispam-Report-Test: UriScan:(65766998875637); X-Exchange-Antispam-Report-CFA-Test: BCL:0;PCL:0;RULEID:(601004)(2401047)(8121501046)(5005006)(3002001)(10201501046);SRVR:CY1PR0301MB0908;BCL:0;PCL:0;RULEID:;SRVR:CY1PR0301MB0908; X-Microsoft-Exchange-Diagnostics: 1;CY1PR0301MB0908;4:M053vr5HulyN9nslHqV8aNjRhCtOIzX0/j/vxxtODcEZYH6EmrZIEQXW7UtwfSZDdkAlUgNiieMU/ubYh0FV+ypp05McxCYwdYI0SwPXpF5eoBOmszXJdejW81ZtFeNAiYJbeCu4BTCYRYoNFJI29efMLhWYlLnZvXrGTtNHGT/BzOI+lAuktfq7uty1gAjhbszJI8VpqqV/92kE0NK/KHRoMFPLnHsmNKtcwFXESC3xaTbCpLT+MzTHSFvr9DG9rzw7Xs2ja/mQEvXl2GJ8qv0OQ0naYLD1mC40w5ieEgzBOiJwjhCBxFKXl2u12qUZyXujeoiz0Bizsl75Un0zIDxkmo3Kr84hA2lMrLT0DHGXT9wUaCd892CCCKpLJSkSOuwKif4+O6CoJO0o0WCHPWzzy3QWCPHNMaj1xmnrJx0= X-Forefront-PRVS: 0848C1A6AA X-Forefront-Antispam-Report: SFV:NSPM;SFS:(10019020)(6009001)(479174004)(377454003)(51914003)(15975445007)(189998001)(19580395003)(65816999)(19580405001)(122386002)(5004730100002)(50986999)(76176999)(40100003)(54356999)(2950100001)(4001350100001)(87976001)(110136002)(5001960100002)(3480700003)(23676002)(93886004)(50466002)(5008740100001)(33656002)(42186005)(47776003)(2906002)(1096002)(65806001)(66066001)(65956001)(83506001)(6116002)(92566002)(89122001)(230700001)(75432002)(77096005)(4326007)(90282001)(88552002)(586003)(64126003)(36756003)(3846002);DIR:OUT;SFP:1102;SCL:1;SRVR:CY1PR0301MB0908;H:palken-co-ll.ngdc.noaa.gov;FPR:;SPF:None;MLV:sfv;LANG:en; X-Microsoft-Exchange-Diagnostics: =?utf-8?B?MTtDWTFQUjAzMDFNQjA5MDg7MjM6dVBkTTg2L1hicmZ2cjFXMjNEV0ViZkhk?= =?utf-8?B?YlNpcDVJOEt4VGJtMTBjNm90TGVvZythNkRMVnhYNkZBYWN5OXVNc05lQUpl?= =?utf-8?B?bzJFejN6Vml5TGM5TFlTNjhONGZHNWEzWDhtdHVPejJnQ05ueHVnUmhOaDdW?= =?utf-8?B?L0N3bnNQWFYvRFU1cTE3eVhPY3BMcUVSd015SEtmNEMyRnd1OEQyYjBjSURx?= =?utf-8?B?S0RYVVJQNGFZNUhweHAxV0R6R0dueE4za2xVMWdocVp2TjgyczlxZ0pnUUt3?= =?utf-8?B?ZWVJZWRBQ1lkZWNEdGloSkh1emhKY1RrOVdoVjFmZ2hENjFkOTV5LzJjYXUy?= =?utf-8?B?eHp5Q3NQM2E3Zis1Yk9FdlFZYTFINmV3aXlyRTM2QWhiR2dVL1BqRE9LeHFR?= =?utf-8?B?bStWUzJkYWVlQ3k3Z0V0RThjYjl2SXNLK2k0RnFkZXFSZGk3RWI3aE5XWDJF?= =?utf-8?B?SXJSbTFLdS9KZjBMTDFoQmNtUDdFbTJVM0hYaFhFVTZEOGtKZU5zb2YxM3o1?= =?utf-8?B?dDh3VlVPV21kdHBTVlFKWDA0Mk4zbk5wRXhEZGNrK204Q2krTDg0cWs0TmYv?= =?utf-8?B?L3AyVzBzVWJ1MC96R08yZ2dpaVp5aGR1SEZJbFdFMnlwQklTemtQdDZUKy8z?= =?utf-8?B?WEovR2lFbTZVRExNcU1rNmhINDI5NlRpUU13bm9DUUt6THdkb2FpWHdEV01H?= =?utf-8?B?WHZEMmRWdlMwcVZXWHcrdEFjZWVoL3RqK2hTcUprVldWc2VJdEtIZm9JNnZQ?= =?utf-8?B?QmtkeTQwNmM4TldGUzNOMWI0RkhwdlNTOUVzZlVYSXpUVzNralBaTklNZHIy?= =?utf-8?B?Y0hZV2gxWHkyYmEvNGJWTkE1K0ZSS3NJKzNvVzVNbGZRd1hyNWZTcTV0RGlK?= =?utf-8?B?SVdKMjc2ZjFmd3JyMUJzSHp3bkRnVWlOd05PT1pMUElWcnNFQXRlZDRjS3g0?= =?utf-8?B?ZUtFNzFrUXprcnk2bDRpTzd2ZmhoK0tCeWlWVnFkbTVWNE9leVZUWTJrWWJh?= =?utf-8?B?bXJZSUVvMCtNdUpvbjN5VTVpdkFoV3JGOEttM2h6aXVLNUxiWGlOWWlCK2lp?= =?utf-8?B?anJidVRvSHQ5alpITit5SkJFSXNjQzY4cWlCVkh6cjk4Z295M0xiMHBpQmR3?= =?utf-8?B?alh4YmpERkFrY29aRlQ2Qk5uNXdHUS91MnMxcGZhbmxsbHltdXo1U3R5TmRP?= =?utf-8?B?SlgzLzQ3TERNNVhuTUN5RzM4V1NTZldQckIvbEpVOFd0YjdobVBXWWlVOFNl?= =?utf-8?B?dXhhL1BjT3Y2NVd1T1FtMlovSEdrc0k3NHZGMzBWc25NamdXc2R0Z0VnandD?= =?utf-8?B?NVp3UTdDSE1IaUs1TThNUHVmcDM4VzZDU0pnaGdpdGMzUUxlbUdQRDdMNDFk?= =?utf-8?B?bjVIQlFkSzR4bGMrdU93dXAwNjlTRVM0SnNnK0ZZUVlLZm9OYXNQTk1CS0Rq?= =?utf-8?B?eFBsbHVBLzhOTzB4Tmh4bkVYbjE5bVZIdFlleVpycXFXQUlKdldERXBaZDRu?= =?utf-8?B?Y1Vxbys0cVNJbUxsWi9KVmNxZHZibS91Q3RQVzJidEVaL0k0UGhSTGljV2xL?= =?utf-8?B?YjFydE1TTld5bU1kK2FZcUJwQlZGTUo2bU1leEdPNGFta1NST0tIUFhPaWhY?= =?utf-8?B?NnEyRDc0enFQMkd0Q1ZBZHZYMjVFaWRpR3piL2d0NWE4NWw5a0crZ1Y4R1E9?= =?utf-8?Q?=3D?= X-Microsoft-Exchange-Diagnostics: 1;CY1PR0301MB0908;5:mYGCQwGwkfDo8oeSD46KDUB+9K0T1YmeFwa3gw1aaG7nLPMZIsXipf14O5FW1q5tYVTHpVyLmZOormGO7rXyM1Uu/wj8CgjbpbUs/cgyEexj13B0xEV45bkgbvTpQGdgU4TlVMjAiebFd0osYvvzMA==;24:YNTZhRxL1TeFVZR/wD3IyvYwHuH8fJCQVM8HPdR+spiQmomQRAJ/eHCxVS+3JXrYpbCKZcruGa+9uvc3GIghsScosf4SAiIrnT3EvXshnkM= SpamDiagnosticOutput: 1:23 SpamDiagnosticMetadata: NSPM X-OriginatorOrg: colorado.edu X-MS-Exchange-CrossTenant-OriginalArrivalTime: 10 Feb 2016 15:55:54.0296 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-Transport-CrossTenantHeadersStamped: CY1PR0301MB0908 X-SW-Source: 2016-q1/txt/msg00015.txt.bz2 I'm in favor of simplicity and easy-parsing, so matrix market sounds good to me. I'll take a look at your latest code in the next few days. Patrick On 02/10/2016 06:16 AM, Alexis Tantet wrote: > Hi Patrick, > > Regarding the file format for sparse matrices, the one I have coded > actually happen to be the coordinate format implemented by Matrix > Market (the platform to share test data such as sparse matrices), with > the addition of a matrix type header: > http://math.nist.gov/MatrixMarket/formats.html > > It is also written that "Harwell-Boeing" is the most commonly used > sparse matrix format, but that: > "Unfortunately the HB specification is somewhat complex difficult to > parse from languages other than Fortran, biased in favor of compressed > column representation and not easily extensible. Some of these factors > may have impeded the more widespread use of the HB format" > > It seems to me that complying to the Matrix Market coordinate format > would be the right choice, in terms of ease of implementation, > compliance to other packages and user-friendliness. I could update the > print/scan functions accordingly (mostly handling the header). What do > you think? > > Best, > Alexis > > > > > On Mon, Feb 8, 2016 at 1:59 AM, Alexis Tantet wrote: >> Ok, my mistake, now I see where I got confused. >> I had in mind to add all the elements first to the triplets and only >> while converting to compressed sum up the duplicates. >> While, indeed, if there's a way you can sum up the duplicates directly >> while adding them to the triplet matrix (thanks to _ptr), this is more >> handy and efficient. >> >> Thanks for the clarification, >> Alexis >> >> On Sun, Feb 7, 2016 at 10:34 PM, Patrick Alken wrote: >>> By design, gsl_spmatrix_set won't allow you to do this. >>> >>> If you add element (i, j, x) and then later to try add element (i, j, >>> y), gsl_spmatrix_set will detect that there exists an element in the (i, >>> j) spot and it will simply change x to y - the value of x will be >>> overwritten by y. This is the same behavior as gsl_matrix_set. >>> >>> So no duplicates are allowed by design. If you have such an application >>> where you want to keep track of duplicates, you could do the following: >>> >>> double *ptr = gsl_spmatrix_ptr(m, i, j); >>> if (ptr) >>> *ptr += x; /* sum duplicate values */ >>> else >>> gsl_spmatrix_set(m, i, j, x); /* initalize to x */ >>> >>> On 02/07/2016 01:31 PM, Alexis Tantet wrote: >>>> I'm not sure I got your last point. I have the following situation in mind: >>>> >>>> Start to construct a transition matrix in triplet format, adding one >>>> element after another. >>>> In this particular example, each element is one count of a transition >>>> from (state, box, etc.) i to j, >>>> so I add elements (i, j, 1) to the triplet object, with possibly duplicates. >>>> What happen to these duplicates in the binary tree? >>>> >>>> Eventually, when I compress to CRS or CCS, I would like the duplicates >>>> to be summed up, so that element (i, j) counts transitions from i to j >>>> (and no duplicates exist after compression). >>>> >>>> Is this more clear? >>>> >>>> On Sun, Feb 7, 2016 at 9:14 PM, Patrick Alken wrote: >>>>> Hi Alexis, >>>>> >>>>>>> I'm not sure what you mean. I've added a new function gsl_spmatrix_ptr >>>>>>> to the git, which as far as I can tell does exactly what your >>>>>>> sum_duplicate flag does. It searches the matrix for an (i,j) element, >>>>>>> and if found returns a pointer. If not found a null pointer is returned. >>>>>>> This makes it easy for the user to modify A(i,j) after it has been added >>>>>>> to the matrix. Are you thinking of something else? Can you point me to >>>>>>> the Eigen routine? >>>>>>> >>>>>> What I meant is to have the equivalent of gsl_spmatrix_compress, >>>>>> with the difference that gsl_spmatrix_ptr is used instead of gsl_spmatrix_set, >>>>>> so has to build the compressed matrix from triplets, summing the >>>>>> duplicates, instead of replacing them. >>>>>> This is what is done here : >>>>>> The http://eigen.tuxfamily.org/dox/classEigen_1_1SparseMatrix.html#a5bcf3187e372ff7cea1e8f61152ae49b >>>>>> >>>>>> Best, >>>>>> Alexis >>>>> I'm not sure why a user would ever need to do this. The whole point of >>>>> the binary tree structure in the triplet storage is to efficiently find >>>>> duplicate entries, so that if a user tries to call gsl_spmatrix_set on >>>>> an element which is already been previously set, it can find that >>>>> element with a binary search (rather than linearly searching the arrays) >>>>> and change the value of that element. >>>>> >>>>> Therefore, the way the triplet storage is designed, there is will never >>>>> be a duplicate element in the triplet arrays. All of the (i[n],j[n]) >>>>> will be unique for each n <= nz. >>>>> >>>>> Am I missing something? >>>>> >>>>> Patrick >>>> >> >> >> -- >> Alexis Tantet > >