* Re: Sparse matrix extension [not found] <CAMWWPT3uJj4Vrn7ut6+F18gY===zd6+1r1UJhz0hcCj--zwtdg@mail.gmail.com> @ 2016-01-19 16:43 ` Alexis Tantet 2016-01-19 17:02 ` Patrick Alken 0 siblings, 1 reply; 25+ messages in thread From: Alexis Tantet @ 2016-01-19 16:43 UTC (permalink / raw) To: gsl-discuss Dear GSLers, As a scientist rather than a developer, I have developed an extension of the sparse matrix module (CRS, I/O, manipulation, see below), which I have tested. These modifications conserve the structure of the original module and be useful for a large number of sparse matrices users. I'm not familiar with the contributing process here. My repository can be found there: https://github.com/atantet/gslsp Unfortunately, I did not know of the gsl.git repository and I forked it froml: https://github.com/drjerry/gslsp , which seems to be a bit older than gsl.git. How can I push/merge to gsl.git ? Should it be as an update or another extension? Is it necessary to adapt to the newest version of the code ? Best regards, Alexis Tantet CHANGES.md: Extension of the sparse matrix module of GSL =================================== Introduction ------------ Usages of sparse matrices are numerous in scientific computing. When numerical linear algebra problems become large, sparse matrices become necessary to avoid memory overload and unnecessary computations, at the cost of element access and matrix construction. As a result, most large scale linear solvers or eigen solvers perform on sparse matrices. Fortunately, a very useful sparse matrix module has recently been introduced to GSL. However, important features are still lacking, such has Compressed Row Storage (CRS) matrices, input/output functions and other matrix properties and manipulation functions. This new version attempts to address this, conserving the original structure of the module and conventions. Major changes ------------- * Add CRS format and update functions manipulating compressed matrices : - additional flag GSL_SPMATRIX_CRS and macro GSLSP_ISMATRIX ( gsl_spmatrix.h ) - additional members innerSize and outerSize used to iterate matrix elements ( gsl_spmatrix.h ) - rename some variables for coherence ( gsl_spmatrix.h , *.c ) - update all functions on compressed matrices ( *.c ) * Allow to sum duplicate elements when compressing ( spcompress.c ) : - modify gsl_spmatrix_compress - add gsl_spmatrix_sum_duplicate * CCS <-> CRS and fast transpose inplace in spswap.c : - add gsl_spmatrix_switch_major - add gsl_spmatrix_transpose * Add printing and scanning functions in spio.c : - add gsl_spmatrix_fprintf - add gsl_spmatrix_fscanf * Add manipulation functions in spmanip.c (particularly useful for Markov chain transition matrices) : - add gsl_spmatrix_get_rowsum : get vector of sum over row elements - add gsl_spmatrix_get_colsum : get vector of sum over column elements - add gsl_spmatrix_div_rows : divide all elements of each row by a vector element - add gsl_spmatrix_div_cols : divide all elements of each column by a vector element * Add test functions in atprop.c : - add gsl_spmatrix_gt_elements : greater than test for each matrix element - add gsl_spmatrix_ge_elements : greater or equal than test for each matrix element - add gsl_spmatrix_lt_elements : lower than test for each matrix element - add gsl_spmatrix_le_elements : lower or equal than test for each matrix element - add gsl_spmatrix_any : test if any non-zero element in matrix Other minor changes have been made, such as error tests. test.c has also been updated to test new features. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Sparse matrix extension 2016-01-19 16:43 ` Sparse matrix extension Alexis Tantet @ 2016-01-19 17:02 ` Patrick Alken 2016-01-19 19:55 ` Alexis Tantet 0 siblings, 1 reply; 25+ messages in thread From: Patrick Alken @ 2016-01-19 17:02 UTC (permalink / raw) To: gsl-discuss Hi Alexis, This looks like very good work! Adding compressed row storage has been on my todo list for a while. The 'gslsp' extension is unfortunately very out of date, and the current git contains newer code (including a GMRES iterative linear solver). I removed the gslsp extension from the web page a while back to reflect this. You can browse the latest manual to see the current sparse matrix capabilities (http://www.gnu.org/software/gsl/manual/gsl-ref.pdf) - there are 3 chapters: sparse matrices, sparse blas and sparse linear algebra - it looks like your contributions will fit into the sparse matrices chapter. Would you be able to verify that your changes are compatible with the current gsl.git repository? This will make it much easier for me to merge everything into the git when ready. It would be best if you made a new branch of gsl.git, and add your changes so I can then pull them from github or somewhere. I will try to find some time in the next few days to look over your code. Thanks again, Patrick On 01/19/2016 09:43 AM, Alexis Tantet wrote: > Dear GSLers, > > As a scientist rather than a developer, I have developed an extension > of the sparse matrix module (CRS, I/O, manipulation, see below), which > I have tested. These modifications conserve the structure of the > original module and be useful for a large number of sparse matrices > users. > > I'm not familiar with the contributing process here. My repository can > be found there: > https://github.com/atantet/gslsp > Unfortunately, I did not know of the gsl.git repository and I forked it froml: > https://github.com/drjerry/gslsp , > which seems to be a bit older than gsl.git. > > How can I push/merge to gsl.git ? Should it be as an update or another > extension? Is it necessary to adapt to the newest version of the code > ? > > Best regards, > Alexis Tantet > > CHANGES.md: > > Extension of the sparse matrix module of GSL > > =================================== > > Introduction > ------------ > > Usages of sparse matrices are numerous in scientific computing. > When numerical linear algebra problems become large, sparse > matrices become necessary to avoid memory overload and unnecessary > computations, at the cost of element access and matrix construction. > As a result, most large scale linear solvers or eigen solvers perform > on sparse matrices. > > Fortunately, a very useful sparse matrix module has recently been > introduced to GSL. > However, important features are still lacking, such has > Compressed Row Storage (CRS) matrices, input/output functions and > other matrix properties and manipulation functions. > This new version attempts to address this, conserving the original > structure of the module and conventions. > > Major changes > ------------- > > * Add CRS format and update functions manipulating compressed matrices : > - additional flag GSL_SPMATRIX_CRS and macro GSLSP_ISMATRIX ( > gsl_spmatrix.h ) > - additional members innerSize and outerSize used to iterate > matrix elements ( gsl_spmatrix.h ) > - rename some variables for coherence ( gsl_spmatrix.h , *.c ) > - update all functions on compressed matrices ( *.c ) > * Allow to sum duplicate elements when compressing ( spcompress.c ) : > - modify gsl_spmatrix_compress > - add gsl_spmatrix_sum_duplicate > * CCS <-> CRS and fast transpose inplace in spswap.c : > - add gsl_spmatrix_switch_major > - add gsl_spmatrix_transpose > * Add printing and scanning functions in spio.c : > - add gsl_spmatrix_fprintf > - add gsl_spmatrix_fscanf > * Add manipulation functions in spmanip.c (particularly useful for > Markov chain transition matrices) : > - add gsl_spmatrix_get_rowsum : get vector of sum over row elements > - add gsl_spmatrix_get_colsum : get vector of sum over column elements > - add gsl_spmatrix_div_rows : divide all elements of each row > by a vector element > - add gsl_spmatrix_div_cols : divide all elements of each > column by a vector element > * Add test functions in atprop.c : > - add gsl_spmatrix_gt_elements : greater than test for each matrix element > - add gsl_spmatrix_ge_elements : greater or equal than test for > each matrix element > - add gsl_spmatrix_lt_elements : lower than test for each matrix element > - add gsl_spmatrix_le_elements : lower or equal than test for > each matrix element > - add gsl_spmatrix_any : test if any non-zero element in matrix > > Other minor changes have been made, such as error tests. > test.c has also been updated to test new features. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Sparse matrix extension 2016-01-19 17:02 ` Patrick Alken @ 2016-01-19 19:55 ` Alexis Tantet 2016-01-19 20:50 ` Patrick Alken 0 siblings, 1 reply; 25+ messages in thread From: Alexis Tantet @ 2016-01-19 19:55 UTC (permalink / raw) To: Patrick Alken; +Cc: gsl-discuss Hi Patrick, Thanks for the quick reply! I wanted to be sure that this contribution is useful before to spend time on the merging with the latest version. I will create the gsl.git repository and work on it during the week. I had already had a look at the documentation but did not know about the iterative solvers (a link between each modules would be useful). My contribution indeed fits in the sparse matrix module + the update of the dgemm and dgemv functions to support CRS (an update may also be needed for the solvers). I have also developed a simple C++ object allowing to use gsl_spmatrix as a user-defined matrix in ARPACK++ (a maintained fork of ARPACK++ can be found at https://github.com/m-reuter/arpackpp), allowing to avoid having to use other libraries such as superLU. It could be useful to others, maybe as an extension. Now that I think about it, the iterative solvers could also be used to support the shift and invert modes (see ARPACK++ documentation). What do you think (I could work on it)? If you have major comments, the sooner the better, so that I can work on them while merging. Thank you for your interest, Alexis On Tue, Jan 19, 2016 at 6:02 PM, Patrick Alken <alken@colorado.edu> wrote: > Hi Alexis, > > This looks like very good work! Adding compressed row storage has been on > my todo list for a while. The 'gslsp' extension is unfortunately very out of > date, and the current git contains newer code (including a GMRES iterative > linear solver). I removed the gslsp extension from the web page a while back > to reflect this. You can browse the latest manual to see the current sparse > matrix capabilities (http://www.gnu.org/software/gsl/manual/gsl-ref.pdf) - > there are 3 chapters: sparse matrices, sparse blas and sparse linear algebra > - it looks like your contributions will fit into the sparse matrices > chapter. > > Would you be able to verify that your changes are compatible with the > current gsl.git repository? This will make it much easier for me to merge > everything into the git when ready. It would be best if you made a new > branch of gsl.git, and add your changes so I can then pull them from github > or somewhere. I will try to find some time in the next few days to look over > your code. > > Thanks again, > Patrick > > > On 01/19/2016 09:43 AM, Alexis Tantet wrote: >> >> Dear GSLers, >> >> As a scientist rather than a developer, I have developed an extension >> of the sparse matrix module (CRS, I/O, manipulation, see below), which >> I have tested. These modifications conserve the structure of the >> original module and be useful for a large number of sparse matrices >> users. >> >> I'm not familiar with the contributing process here. My repository can >> be found there: >> https://github.com/atantet/gslsp >> Unfortunately, I did not know of the gsl.git repository and I forked it >> froml: >> https://github.com/drjerry/gslsp , >> which seems to be a bit older than gsl.git. >> >> How can I push/merge to gsl.git ? Should it be as an update or another >> extension? Is it necessary to adapt to the newest version of the code >> ? >> >> Best regards, >> Alexis Tantet >> >> CHANGES.md: >> >> Extension of the sparse matrix module of GSL >> >> =================================== >> >> Introduction >> ------------ >> >> Usages of sparse matrices are numerous in scientific computing. >> When numerical linear algebra problems become large, sparse >> matrices become necessary to avoid memory overload and unnecessary >> computations, at the cost of element access and matrix construction. >> As a result, most large scale linear solvers or eigen solvers perform >> on sparse matrices. >> >> Fortunately, a very useful sparse matrix module has recently been >> introduced to GSL. >> However, important features are still lacking, such has >> Compressed Row Storage (CRS) matrices, input/output functions and >> other matrix properties and manipulation functions. >> This new version attempts to address this, conserving the original >> structure of the module and conventions. >> >> Major changes >> ------------- >> >> * Add CRS format and update functions manipulating compressed matrices : >> - additional flag GSL_SPMATRIX_CRS and macro GSLSP_ISMATRIX ( >> gsl_spmatrix.h ) >> - additional members innerSize and outerSize used to iterate >> matrix elements ( gsl_spmatrix.h ) >> - rename some variables for coherence ( gsl_spmatrix.h , *.c ) >> - update all functions on compressed matrices ( *.c ) >> * Allow to sum duplicate elements when compressing ( spcompress.c ) : >> - modify gsl_spmatrix_compress >> - add gsl_spmatrix_sum_duplicate >> * CCS <-> CRS and fast transpose inplace in spswap.c : >> - add gsl_spmatrix_switch_major >> - add gsl_spmatrix_transpose >> * Add printing and scanning functions in spio.c : >> - add gsl_spmatrix_fprintf >> - add gsl_spmatrix_fscanf >> * Add manipulation functions in spmanip.c (particularly useful for >> Markov chain transition matrices) : >> - add gsl_spmatrix_get_rowsum : get vector of sum over row elements >> - add gsl_spmatrix_get_colsum : get vector of sum over column >> elements >> - add gsl_spmatrix_div_rows : divide all elements of each row >> by a vector element >> - add gsl_spmatrix_div_cols : divide all elements of each >> column by a vector element >> * Add test functions in atprop.c : >> - add gsl_spmatrix_gt_elements : greater than test for each matrix >> element >> - add gsl_spmatrix_ge_elements : greater or equal than test for >> each matrix element >> - add gsl_spmatrix_lt_elements : lower than test for each matrix >> element >> - add gsl_spmatrix_le_elements : lower or equal than test for >> each matrix element >> - add gsl_spmatrix_any : test if any non-zero element in >> matrix >> >> Other minor changes have been made, such as error tests. >> test.c has also been updated to test new features. > > -- Alexis Tantet ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Sparse matrix extension 2016-01-19 19:55 ` Alexis Tantet @ 2016-01-19 20:50 ` Patrick Alken 2016-01-20 18:31 ` Alexis Tantet 0 siblings, 1 reply; 25+ messages in thread From: Patrick Alken @ 2016-01-19 20:50 UTC (permalink / raw) To: gsl-discuss On 01/19/2016 12:55 PM, Alexis Tantet wrote: > Hi Patrick, > > Thanks for the quick reply! I wanted to be sure that this contribution > is useful before to spend time on the merging with the latest version. > I will create the gsl.git repository and work on it during the week. > > I had already had a look at the documentation but did not know about > the iterative solvers (a link between each modules would be useful). > My contribution indeed fits in the sparse matrix module + the update > of the dgemm and dgemv functions to support CRS (an update may also be > needed for the solvers). The solvers only call dgemv (as far as I remember) so they shouldn't need an update once dgemv is updated. > > I have also developed a simple C++ object allowing to use gsl_spmatrix > as a user-defined matrix in ARPACK++ (a maintained fork of ARPACK++ > can be found at https://github.com/m-reuter/arpackpp), allowing to > avoid having to use other libraries such as superLU. It could be > useful to others, maybe as an extension. Now that I think about it, > the iterative solvers could also be used to support the shift and > invert modes (see ARPACK++ documentation). What do you think (I could > work on it)? I've never used ARPACK, but if you want to make an extension to interface GSL/ARPACK its fine with me - such an extension would never be added to the main GSL code since GSL tries to be as standalone as possible. > > If you have major comments, the sooner the better, so that I can work > on them while merging. > > Thank you for your interest, > Alexis > > On Tue, Jan 19, 2016 at 6:02 PM, Patrick Alken <alken@colorado.edu> wrote: >> Hi Alexis, >> >> This looks like very good work! Adding compressed row storage has been on >> my todo list for a while. The 'gslsp' extension is unfortunately very out of >> date, and the current git contains newer code (including a GMRES iterative >> linear solver). I removed the gslsp extension from the web page a while back >> to reflect this. You can browse the latest manual to see the current sparse >> matrix capabilities (http://www.gnu.org/software/gsl/manual/gsl-ref.pdf) - >> there are 3 chapters: sparse matrices, sparse blas and sparse linear algebra >> - it looks like your contributions will fit into the sparse matrices >> chapter. >> >> Would you be able to verify that your changes are compatible with the >> current gsl.git repository? This will make it much easier for me to merge >> everything into the git when ready. It would be best if you made a new >> branch of gsl.git, and add your changes so I can then pull them from github >> or somewhere. I will try to find some time in the next few days to look over >> your code. >> >> Thanks again, >> Patrick >> >> >> On 01/19/2016 09:43 AM, Alexis Tantet wrote: >>> Dear GSLers, >>> >>> As a scientist rather than a developer, I have developed an extension >>> of the sparse matrix module (CRS, I/O, manipulation, see below), which >>> I have tested. These modifications conserve the structure of the >>> original module and be useful for a large number of sparse matrices >>> users. >>> >>> I'm not familiar with the contributing process here. My repository can >>> be found there: >>> https://github.com/atantet/gslsp >>> Unfortunately, I did not know of the gsl.git repository and I forked it >>> froml: >>> https://github.com/drjerry/gslsp , >>> which seems to be a bit older than gsl.git. >>> >>> How can I push/merge to gsl.git ? Should it be as an update or another >>> extension? Is it necessary to adapt to the newest version of the code >>> ? >>> >>> Best regards, >>> Alexis Tantet >>> >>> CHANGES.md: >>> >>> Extension of the sparse matrix module of GSL >>> >>> =================================== >>> >>> Introduction >>> ------------ >>> >>> Usages of sparse matrices are numerous in scientific computing. >>> When numerical linear algebra problems become large, sparse >>> matrices become necessary to avoid memory overload and unnecessary >>> computations, at the cost of element access and matrix construction. >>> As a result, most large scale linear solvers or eigen solvers perform >>> on sparse matrices. >>> >>> Fortunately, a very useful sparse matrix module has recently been >>> introduced to GSL. >>> However, important features are still lacking, such has >>> Compressed Row Storage (CRS) matrices, input/output functions and >>> other matrix properties and manipulation functions. >>> This new version attempts to address this, conserving the original >>> structure of the module and conventions. >>> >>> Major changes >>> ------------- >>> >>> * Add CRS format and update functions manipulating compressed matrices : >>> - additional flag GSL_SPMATRIX_CRS and macro GSLSP_ISMATRIX ( >>> gsl_spmatrix.h ) >>> - additional members innerSize and outerSize used to iterate >>> matrix elements ( gsl_spmatrix.h ) >>> - rename some variables for coherence ( gsl_spmatrix.h , *.c ) >>> - update all functions on compressed matrices ( *.c ) >>> * Allow to sum duplicate elements when compressing ( spcompress.c ) : >>> - modify gsl_spmatrix_compress >>> - add gsl_spmatrix_sum_duplicate >>> * CCS <-> CRS and fast transpose inplace in spswap.c : >>> - add gsl_spmatrix_switch_major >>> - add gsl_spmatrix_transpose >>> * Add printing and scanning functions in spio.c : >>> - add gsl_spmatrix_fprintf >>> - add gsl_spmatrix_fscanf >>> * Add manipulation functions in spmanip.c (particularly useful for >>> Markov chain transition matrices) : >>> - add gsl_spmatrix_get_rowsum : get vector of sum over row elements >>> - add gsl_spmatrix_get_colsum : get vector of sum over column >>> elements >>> - add gsl_spmatrix_div_rows : divide all elements of each row >>> by a vector element >>> - add gsl_spmatrix_div_cols : divide all elements of each >>> column by a vector element >>> * Add test functions in atprop.c : >>> - add gsl_spmatrix_gt_elements : greater than test for each matrix >>> element >>> - add gsl_spmatrix_ge_elements : greater or equal than test for >>> each matrix element >>> - add gsl_spmatrix_lt_elements : lower than test for each matrix >>> element >>> - add gsl_spmatrix_le_elements : lower or equal than test for >>> each matrix element >>> - add gsl_spmatrix_any : test if any non-zero element in >>> matrix >>> >>> Other minor changes have been made, such as error tests. >>> test.c has also been updated to test new features. >> > > ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Sparse matrix extension 2016-01-19 20:50 ` Patrick Alken @ 2016-01-20 18:31 ` Alexis Tantet 2016-02-07 0:03 ` Patrick Alken 0 siblings, 1 reply; 25+ messages in thread From: Alexis Tantet @ 2016-01-20 18:31 UTC (permalink / raw) To: Patrick Alken; +Cc: gsl-discuss Hi Patrick, I've cloned gsl.git, made my modifications to spmatrix/ and spblas/ and pushed them to master at: https://github.com/atantet/gsl This time, I've tried to modify the code as little as possible. I've usually added comments labelled "AT" with each modification. I also have a few questions about the code which I've labelled "?AT". Finally, I've added the corresponding tests to both test.c (successful on my machine). I see that duplicates are handled for the triplet format by replacing. Thus, I removed the function taking care of sorting and summing duplicates that I had added to _compress (the trick was to transpose with copy to reorder first, transpose back and remove the duplicates then, taken from Eigen C++). Instead, I have added an option to sum the duplicates (useful e.g. when counting links of a graph or transitions of a Markov chain). One may need the third option of repeating the triplets, but then avl_insert should also be modified. However, if I've understood well, no sorting of the inner indices is performed right now. This could be problematic in the future (e.g. in functions such as _equal). For _spdgemm for CRS matrices, I've used the trick (A B)^t = B^t A^t to convert the problem to CSC. I've tested it successfully but you may not like the coding style. I hope that helps, Best, Alexis On Tue, Jan 19, 2016 at 9:50 PM, Patrick Alken <alken@colorado.edu> wrote: > > > On 01/19/2016 12:55 PM, Alexis Tantet wrote: >> >> Hi Patrick, >> >> Thanks for the quick reply! I wanted to be sure that this contribution >> is useful before to spend time on the merging with the latest version. >> I will create the gsl.git repository and work on it during the week. >> >> I had already had a look at the documentation but did not know about >> the iterative solvers (a link between each modules would be useful). >> My contribution indeed fits in the sparse matrix module + the update >> of the dgemm and dgemv functions to support CRS (an update may also be >> needed for the solvers). > > > The solvers only call dgemv (as far as I remember) so they shouldn't need an > update once dgemv is updated. > >> >> I have also developed a simple C++ object allowing to use gsl_spmatrix >> as a user-defined matrix in ARPACK++ (a maintained fork of ARPACK++ >> can be found at https://github.com/m-reuter/arpackpp), allowing to >> avoid having to use other libraries such as superLU. It could be >> useful to others, maybe as an extension. Now that I think about it, >> the iterative solvers could also be used to support the shift and >> invert modes (see ARPACK++ documentation). What do you think (I could >> work on it)? > > I've never used ARPACK, but if you want to make an extension to interface > GSL/ARPACK its fine with me - such an extension would never be added to the > main GSL code since GSL tries to be as standalone as possible. > > >> >> If you have major comments, the sooner the better, so that I can work >> on them while merging. >> >> Thank you for your interest, >> Alexis >> >> On Tue, Jan 19, 2016 at 6:02 PM, Patrick Alken <alken@colorado.edu> wrote: >>> >>> Hi Alexis, >>> >>> This looks like very good work! Adding compressed row storage has been >>> on >>> my todo list for a while. The 'gslsp' extension is unfortunately very out >>> of >>> date, and the current git contains newer code (including a GMRES >>> iterative >>> linear solver). I removed the gslsp extension from the web page a while >>> back >>> to reflect this. You can browse the latest manual to see the current >>> sparse >>> matrix capabilities (http://www.gnu.org/software/gsl/manual/gsl-ref.pdf) >>> - >>> there are 3 chapters: sparse matrices, sparse blas and sparse linear >>> algebra >>> - it looks like your contributions will fit into the sparse matrices >>> chapter. >>> >>> Would you be able to verify that your changes are compatible with the >>> current gsl.git repository? This will make it much easier for me to merge >>> everything into the git when ready. It would be best if you made a new >>> branch of gsl.git, and add your changes so I can then pull them from >>> github >>> or somewhere. I will try to find some time in the next few days to look >>> over >>> your code. >>> >>> Thanks again, >>> Patrick >>> >>> >>> On 01/19/2016 09:43 AM, Alexis Tantet wrote: >>>> >>>> Dear GSLers, >>>> >>>> As a scientist rather than a developer, I have developed an extension >>>> of the sparse matrix module (CRS, I/O, manipulation, see below), which >>>> I have tested. These modifications conserve the structure of the >>>> original module and be useful for a large number of sparse matrices >>>> users. >>>> >>>> I'm not familiar with the contributing process here. My repository can >>>> be found there: >>>> https://github.com/atantet/gslsp >>>> Unfortunately, I did not know of the gsl.git repository and I forked it >>>> froml: >>>> https://github.com/drjerry/gslsp , >>>> which seems to be a bit older than gsl.git. >>>> >>>> How can I push/merge to gsl.git ? Should it be as an update or another >>>> extension? Is it necessary to adapt to the newest version of the code >>>> ? >>>> >>>> Best regards, >>>> Alexis Tantet >>>> >>>> CHANGES.md: >>>> >>>> Extension of the sparse matrix module of GSL >>>> >>>> =================================== >>>> >>>> Introduction >>>> ------------ >>>> >>>> Usages of sparse matrices are numerous in scientific computing. >>>> When numerical linear algebra problems become large, sparse >>>> matrices become necessary to avoid memory overload and unnecessary >>>> computations, at the cost of element access and matrix construction. >>>> As a result, most large scale linear solvers or eigen solvers perform >>>> on sparse matrices. >>>> >>>> Fortunately, a very useful sparse matrix module has recently been >>>> introduced to GSL. >>>> However, important features are still lacking, such has >>>> Compressed Row Storage (CRS) matrices, input/output functions and >>>> other matrix properties and manipulation functions. >>>> This new version attempts to address this, conserving the original >>>> structure of the module and conventions. >>>> >>>> Major changes >>>> ------------- >>>> >>>> * Add CRS format and update functions manipulating compressed matrices : >>>> - additional flag GSL_SPMATRIX_CRS and macro GSLSP_ISMATRIX ( >>>> gsl_spmatrix.h ) >>>> - additional members innerSize and outerSize used to iterate >>>> matrix elements ( gsl_spmatrix.h ) >>>> - rename some variables for coherence ( gsl_spmatrix.h , *.c ) >>>> - update all functions on compressed matrices ( *.c ) >>>> * Allow to sum duplicate elements when compressing ( spcompress.c ) : >>>> - modify gsl_spmatrix_compress >>>> - add gsl_spmatrix_sum_duplicate >>>> * CCS <-> CRS and fast transpose inplace in spswap.c : >>>> - add gsl_spmatrix_switch_major >>>> - add gsl_spmatrix_transpose >>>> * Add printing and scanning functions in spio.c : >>>> - add gsl_spmatrix_fprintf >>>> - add gsl_spmatrix_fscanf >>>> * Add manipulation functions in spmanip.c (particularly useful for >>>> Markov chain transition matrices) : >>>> - add gsl_spmatrix_get_rowsum : get vector of sum over row >>>> elements >>>> - add gsl_spmatrix_get_colsum : get vector of sum over column >>>> elements >>>> - add gsl_spmatrix_div_rows : divide all elements of each row >>>> by a vector element >>>> - add gsl_spmatrix_div_cols : divide all elements of each >>>> column by a vector element >>>> * Add test functions in atprop.c : >>>> - add gsl_spmatrix_gt_elements : greater than test for each >>>> matrix >>>> element >>>> - add gsl_spmatrix_ge_elements : greater or equal than test for >>>> each matrix element >>>> - add gsl_spmatrix_lt_elements : lower than test for each matrix >>>> element >>>> - add gsl_spmatrix_le_elements : lower or equal than test for >>>> each matrix element >>>> - add gsl_spmatrix_any : test if any non-zero element in >>>> matrix >>>> >>>> Other minor changes have been made, such as error tests. >>>> test.c has also been updated to test new features. >>> >>> >> >> > -- Alexis Tantet ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Sparse matrix extension 2016-01-20 18:31 ` Alexis Tantet @ 2016-02-07 0:03 ` Patrick Alken 2016-02-07 1:03 ` Alexis Tantet 0 siblings, 1 reply; 25+ messages in thread From: Patrick Alken @ 2016-02-07 0:03 UTC (permalink / raw) To: Alexis Tantet; +Cc: gsl-discuss Hello, I've been working my way through your patch. There is a new branch on the git called 'sparse' with all the current changes. You've done a lot of work on your patch so its taking me a while to get through all of it. The main functionality of CRS is in there, along with basic stuff like memcpy, transpose_memcpy, transpose. Also dgemv works (and hence also the GMRES linear solver). I did some initial work on the fprintf/fscanf for the triplet case - for the compressed matrices, I am wondering if we should stick to a standard format such as Harwell-Boeing, I'm not sure at the moment. We might be able to avoid this issue by making a binary fwrite/fread interface which just directly writes the 3 arrays to disk. Things I still need to add for the CRS storage: 1. spmatrix_add 2. spblas_dgemm 3. fprintf/fscanf 4. transpose - for now I'll keep it the way you coded it, but I think it would be better to keep the matrix in the same storage format during a transpose, ie: the user might want to pass A^T to an external solver library which requires CCS and they might not expect the transpose routine to change it to CRS. I did a quick search for in-place transpose algorithms in CCS/CRS and they do appear to exist though I haven't had time to implement it yet. I removed the inner/outer idx stuff from gsl_spmatrix - its not necessary and also I'd like to avoid modifying the gsl_spmatrix struct since any changes would break binary compatibility for the next release. Also your modification to gsl_spmatrix_set - I see what you're trying to do but I think its better to add a new function: double *gsl_spmatrix_ptr(gsl_spmatrix *m, size_t i, size_t j) which returns a pointer to the data array for element (i,j), and the user can then add whatever value they want (this is how its done with gsl_matrix and gsl_vector). I think it should be straightforward to add this function, possibly even for the compressed storage too. Anyway, just wanted to let you know things are progressing (though a little slowly). Patrick On 01/20/2016 11:31 AM, Alexis Tantet wrote: > Hi Patrick, > > I've cloned gsl.git, made my modifications to spmatrix/ and spblas/ > and pushed them to master at: > https://github.com/atantet/gsl > > This time, I've tried to modify the code as little as possible. I've > usually added comments labelled "AT" with each modification. > I also have a few questions about the code which I've labelled "?AT". > Finally, I've added the corresponding tests to both test.c (successful > on my machine). > > I see that duplicates are handled for the triplet format by replacing. > Thus, I removed the function taking care of sorting and summing > duplicates that I had added to _compress (the trick was to transpose > with copy to reorder first, transpose back and remove the duplicates > then, taken from Eigen C++). Instead, I have added an option to sum > the duplicates (useful e.g. when counting links of a graph or > transitions of a Markov chain). One may need the third option of > repeating the triplets, but then avl_insert should also be modified. > > However, if I've understood well, no sorting of the inner indices is > performed right now. This could be problematic in the future (e.g. in > functions such as _equal). > > For _spdgemm for CRS matrices, I've used the trick (A B)^t = B^t A^t > to convert the problem to CSC. I've tested it successfully but you may > not like the coding style. > > I hope that helps, > Best, > Alexis > > On Tue, Jan 19, 2016 at 9:50 PM, Patrick Alken <alken@colorado.edu> wrote: >> >> On 01/19/2016 12:55 PM, Alexis Tantet wrote: >>> Hi Patrick, >>> >>> Thanks for the quick reply! I wanted to be sure that this contribution >>> is useful before to spend time on the merging with the latest version. >>> I will create the gsl.git repository and work on it during the week. >>> >>> I had already had a look at the documentation but did not know about >>> the iterative solvers (a link between each modules would be useful). >>> My contribution indeed fits in the sparse matrix module + the update >>> of the dgemm and dgemv functions to support CRS (an update may also be >>> needed for the solvers). >> >> The solvers only call dgemv (as far as I remember) so they shouldn't need an >> update once dgemv is updated. >> >>> I have also developed a simple C++ object allowing to use gsl_spmatrix >>> as a user-defined matrix in ARPACK++ (a maintained fork of ARPACK++ >>> can be found at https://github.com/m-reuter/arpackpp), allowing to >>> avoid having to use other libraries such as superLU. It could be >>> useful to others, maybe as an extension. Now that I think about it, >>> the iterative solvers could also be used to support the shift and >>> invert modes (see ARPACK++ documentation). What do you think (I could >>> work on it)? >> I've never used ARPACK, but if you want to make an extension to interface >> GSL/ARPACK its fine with me - such an extension would never be added to the >> main GSL code since GSL tries to be as standalone as possible. >> >> >>> If you have major comments, the sooner the better, so that I can work >>> on them while merging. >>> >>> Thank you for your interest, >>> Alexis >>> >>> On Tue, Jan 19, 2016 at 6:02 PM, Patrick Alken <alken@colorado.edu> wrote: >>>> Hi Alexis, >>>> >>>> This looks like very good work! Adding compressed row storage has been >>>> on >>>> my todo list for a while. The 'gslsp' extension is unfortunately very out >>>> of >>>> date, and the current git contains newer code (including a GMRES >>>> iterative >>>> linear solver). I removed the gslsp extension from the web page a while >>>> back >>>> to reflect this. You can browse the latest manual to see the current >>>> sparse >>>> matrix capabilities (http://www.gnu.org/software/gsl/manual/gsl-ref.pdf) >>>> - >>>> there are 3 chapters: sparse matrices, sparse blas and sparse linear >>>> algebra >>>> - it looks like your contributions will fit into the sparse matrices >>>> chapter. >>>> >>>> Would you be able to verify that your changes are compatible with the >>>> current gsl.git repository? This will make it much easier for me to merge >>>> everything into the git when ready. It would be best if you made a new >>>> branch of gsl.git, and add your changes so I can then pull them from >>>> github >>>> or somewhere. I will try to find some time in the next few days to look >>>> over >>>> your code. >>>> >>>> Thanks again, >>>> Patrick >>>> >>>> >>>> On 01/19/2016 09:43 AM, Alexis Tantet wrote: >>>>> Dear GSLers, >>>>> >>>>> As a scientist rather than a developer, I have developed an extension >>>>> of the sparse matrix module (CRS, I/O, manipulation, see below), which >>>>> I have tested. These modifications conserve the structure of the >>>>> original module and be useful for a large number of sparse matrices >>>>> users. >>>>> >>>>> I'm not familiar with the contributing process here. My repository can >>>>> be found there: >>>>> https://github.com/atantet/gslsp >>>>> Unfortunately, I did not know of the gsl.git repository and I forked it >>>>> froml: >>>>> https://github.com/drjerry/gslsp , >>>>> which seems to be a bit older than gsl.git. >>>>> >>>>> How can I push/merge to gsl.git ? Should it be as an update or another >>>>> extension? Is it necessary to adapt to the newest version of the code >>>>> ? >>>>> >>>>> Best regards, >>>>> Alexis Tantet >>>>> >>>>> CHANGES.md: >>>>> >>>>> Extension of the sparse matrix module of GSL >>>>> >>>>> =================================== >>>>> >>>>> Introduction >>>>> ------------ >>>>> >>>>> Usages of sparse matrices are numerous in scientific computing. >>>>> When numerical linear algebra problems become large, sparse >>>>> matrices become necessary to avoid memory overload and unnecessary >>>>> computations, at the cost of element access and matrix construction. >>>>> As a result, most large scale linear solvers or eigen solvers perform >>>>> on sparse matrices. >>>>> >>>>> Fortunately, a very useful sparse matrix module has recently been >>>>> introduced to GSL. >>>>> However, important features are still lacking, such has >>>>> Compressed Row Storage (CRS) matrices, input/output functions and >>>>> other matrix properties and manipulation functions. >>>>> This new version attempts to address this, conserving the original >>>>> structure of the module and conventions. >>>>> >>>>> Major changes >>>>> ------------- >>>>> >>>>> * Add CRS format and update functions manipulating compressed matrices : >>>>> - additional flag GSL_SPMATRIX_CRS and macro GSLSP_ISMATRIX ( >>>>> gsl_spmatrix.h ) >>>>> - additional members innerSize and outerSize used to iterate >>>>> matrix elements ( gsl_spmatrix.h ) >>>>> - rename some variables for coherence ( gsl_spmatrix.h , *.c ) >>>>> - update all functions on compressed matrices ( *.c ) >>>>> * Allow to sum duplicate elements when compressing ( spcompress.c ) : >>>>> - modify gsl_spmatrix_compress >>>>> - add gsl_spmatrix_sum_duplicate >>>>> * CCS <-> CRS and fast transpose inplace in spswap.c : >>>>> - add gsl_spmatrix_switch_major >>>>> - add gsl_spmatrix_transpose >>>>> * Add printing and scanning functions in spio.c : >>>>> - add gsl_spmatrix_fprintf >>>>> - add gsl_spmatrix_fscanf >>>>> * Add manipulation functions in spmanip.c (particularly useful for >>>>> Markov chain transition matrices) : >>>>> - add gsl_spmatrix_get_rowsum : get vector of sum over row >>>>> elements >>>>> - add gsl_spmatrix_get_colsum : get vector of sum over column >>>>> elements >>>>> - add gsl_spmatrix_div_rows : divide all elements of each row >>>>> by a vector element >>>>> - add gsl_spmatrix_div_cols : divide all elements of each >>>>> column by a vector element >>>>> * Add test functions in atprop.c : >>>>> - add gsl_spmatrix_gt_elements : greater than test for each >>>>> matrix >>>>> element >>>>> - add gsl_spmatrix_ge_elements : greater or equal than test for >>>>> each matrix element >>>>> - add gsl_spmatrix_lt_elements : lower than test for each matrix >>>>> element >>>>> - add gsl_spmatrix_le_elements : lower or equal than test for >>>>> each matrix element >>>>> - add gsl_spmatrix_any : test if any non-zero element in >>>>> matrix >>>>> >>>>> Other minor changes have been made, such as error tests. >>>>> test.c has also been updated to test new features. >>>> >>> > > ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Sparse matrix extension 2016-02-07 0:03 ` Patrick Alken @ 2016-02-07 1:03 ` Alexis Tantet 2016-02-07 17:25 ` Patrick Alken 0 siblings, 1 reply; 25+ messages in thread From: Alexis Tantet @ 2016-02-07 1:03 UTC (permalink / raw) To: Patrick Alken; +Cc: gsl-discuss Hi Patrick, Thanks for working on it. Following, some comments: On Sun, Feb 7, 2016 at 1:02 AM, Patrick Alken <alken@colorado.edu> wrote: > > Hello, > > I've been working my way through your patch. There is a new branch on > the git called 'sparse' with all the current changes. You've done a lot > of work on your patch so its taking me a while to get through all of it. > > The main functionality of CRS is in there, along with basic stuff like > memcpy, transpose_memcpy, transpose. Also dgemv works (and hence also > the GMRES linear solver). Since the GMRES is already implemented, do you think it would be worth using the Arnoldi method to find eigenvalues and eigenvectors of sparse non-hermitian matrices? > I did some initial work on the fprintf/fscanf > for the triplet case - for the compressed matrices, I am wondering if we > should stick to a standard format such as Harwell-Boeing, I'm not sure > at the moment. We might be able to avoid this issue by making a binary > fwrite/fread interface which just directly writes the 3 arrays to disk. I agree that a standard format would be best but I am not aware of any dominant format. Is Harwell-Boeing one of them? It seems important to me to have both binary and formatted input/output (for coherence with gsl_matrix and because formatted files are more handy to share). Also, it is important to have some header indicating how many elements should be read and what is the dimension of the matrix. > > Things I still need to add for the CRS storage: > > 1. spmatrix_add > 2. spblas_dgemm > 3. fprintf/fscanf > 4. transpose - for now I'll keep it the way you coded it, but I think it > would be better to keep the matrix in the same storage format during a > transpose, ie: the user might want to pass A^T to an external solver > library which requires CCS and they might not expect the transpose > routine to change it to CRS. I did a quick search for in-place transpose > algorithms in CCS/CRS and they do appear to exist though I haven't had > time to implement it yet. Fine, but then I think it is important to add a separate function, say "transpose_switch_major", for which the transposition is performed by just changing the major (i.e. row->col or col->row) since this is the cheapest way to transpose while some libraries may work equally well with CRS or CCS. > > I removed the inner/outer idx stuff from gsl_spmatrix - its not > necessary and also I'd like to avoid modifying the gsl_spmatrix struct > since any changes would break binary compatibility for the next release. That makes perfect sense, but then you need to test each time for the major to know what is the size of the pointer? > Also your modification to gsl_spmatrix_set - I see what you're trying to > do but I think its better to add a new function: > > double *gsl_spmatrix_ptr(gsl_spmatrix *m, size_t i, size_t j) > > which returns a pointer to the data array for element (i,j), and the > user can then add whatever value they want (this is how its done with > gsl_matrix and gsl_vector). I think it should be straightforward to add > this function, possibly even for the compressed storage too. Then I would also add a compressed matrix constructor (or an option) in which gsl_spmatrix_ptr is used to sum duplicate entries instead of gsl_spmatrix_set. This possibility is offered by some other libraries such as Eigen and is for example useful to build matrices counting (e.g. transitions). > > Anyway, just wanted to let you know things are progressing (though a > little slowly). Thanks a lot! Would you like me to work on something in particular? Best, Alexis > > Patrick > > On 01/20/2016 11:31 AM, Alexis Tantet wrote: > > Hi Patrick, > > > > I've cloned gsl.git, made my modifications to spmatrix/ and spblas/ > > and pushed them to master at: > > https://github.com/atantet/gsl > > > > This time, I've tried to modify the code as little as possible. I've > > usually added comments labelled "AT" with each modification. > > I also have a few questions about the code which I've labelled "?AT". > > Finally, I've added the corresponding tests to both test.c (successful > > on my machine). > > > > I see that duplicates are handled for the triplet format by replacing. > > Thus, I removed the function taking care of sorting and summing > > duplicates that I had added to _compress (the trick was to transpose > > with copy to reorder first, transpose back and remove the duplicates > > then, taken from Eigen C++). Instead, I have added an option to sum > > the duplicates (useful e.g. when counting links of a graph or > > transitions of a Markov chain). One may need the third option of > > repeating the triplets, but then avl_insert should also be modified. > > > > However, if I've understood well, no sorting of the inner indices is > > performed right now. This could be problematic in the future (e.g. in > > functions such as _equal). > > > > For _spdgemm for CRS matrices, I've used the trick (A B)^t = B^t A^t > > to convert the problem to CSC. I've tested it successfully but you may > > not like the coding style. > > > > I hope that helps, > > Best, > > Alexis > > > > On Tue, Jan 19, 2016 at 9:50 PM, Patrick Alken <alken@colorado.edu> wrote: > >> > >> On 01/19/2016 12:55 PM, Alexis Tantet wrote: > >>> Hi Patrick, > >>> > >>> Thanks for the quick reply! I wanted to be sure that this contribution > >>> is useful before to spend time on the merging with the latest version. > >>> I will create the gsl.git repository and work on it during the week. > >>> > >>> I had already had a look at the documentation but did not know about > >>> the iterative solvers (a link between each modules would be useful). > >>> My contribution indeed fits in the sparse matrix module + the update > >>> of the dgemm and dgemv functions to support CRS (an update may also be > >>> needed for the solvers). > >> > >> The solvers only call dgemv (as far as I remember) so they shouldn't need an > >> update once dgemv is updated. > >> > >>> I have also developed a simple C++ object allowing to use gsl_spmatrix > >>> as a user-defined matrix in ARPACK++ (a maintained fork of ARPACK++ > >>> can be found at https://github.com/m-reuter/arpackpp), allowing to > >>> avoid having to use other libraries such as superLU. It could be > >>> useful to others, maybe as an extension. Now that I think about it, > >>> the iterative solvers could also be used to support the shift and > >>> invert modes (see ARPACK++ documentation). What do you think (I could > >>> work on it)? > >> I've never used ARPACK, but if you want to make an extension to interface > >> GSL/ARPACK its fine with me - such an extension would never be added to the > >> main GSL code since GSL tries to be as standalone as possible. > >> > >> > >>> If you have major comments, the sooner the better, so that I can work > >>> on them while merging. > >>> > >>> Thank you for your interest, > >>> Alexis > >>> > >>> On Tue, Jan 19, 2016 at 6:02 PM, Patrick Alken <alken@colorado.edu> wrote: > >>>> Hi Alexis, > >>>> > >>>> This looks like very good work! Adding compressed row storage has been > >>>> on > >>>> my todo list for a while. The 'gslsp' extension is unfortunately very out > >>>> of > >>>> date, and the current git contains newer code (including a GMRES > >>>> iterative > >>>> linear solver). I removed the gslsp extension from the web page a while > >>>> back > >>>> to reflect this. You can browse the latest manual to see the current > >>>> sparse > >>>> matrix capabilities (http://www.gnu.org/software/gsl/manual/gsl-ref.pdf) > >>>> - > >>>> there are 3 chapters: sparse matrices, sparse blas and sparse linear > >>>> algebra > >>>> - it looks like your contributions will fit into the sparse matrices > >>>> chapter. > >>>> > >>>> Would you be able to verify that your changes are compatible with the > >>>> current gsl.git repository? This will make it much easier for me to merge > >>>> everything into the git when ready. It would be best if you made a new > >>>> branch of gsl.git, and add your changes so I can then pull them from > >>>> github > >>>> or somewhere. I will try to find some time in the next few days to look > >>>> over > >>>> your code. > >>>> > >>>> Thanks again, > >>>> Patrick > >>>> > >>>> > >>>> On 01/19/2016 09:43 AM, Alexis Tantet wrote: > >>>>> Dear GSLers, > >>>>> > >>>>> As a scientist rather than a developer, I have developed an extension > >>>>> of the sparse matrix module (CRS, I/O, manipulation, see below), which > >>>>> I have tested. These modifications conserve the structure of the > >>>>> original module and be useful for a large number of sparse matrices > >>>>> users. > >>>>> > >>>>> I'm not familiar with the contributing process here. My repository can > >>>>> be found there: > >>>>> https://github.com/atantet/gslsp > >>>>> Unfortunately, I did not know of the gsl.git repository and I forked it > >>>>> froml: > >>>>> https://github.com/drjerry/gslsp , > >>>>> which seems to be a bit older than gsl.git. > >>>>> > >>>>> How can I push/merge to gsl.git ? Should it be as an update or another > >>>>> extension? Is it necessary to adapt to the newest version of the code > >>>>> ? > >>>>> > >>>>> Best regards, > >>>>> Alexis Tantet > >>>>> > >>>>> CHANGES.md: > >>>>> > >>>>> Extension of the sparse matrix module of GSL > >>>>> > >>>>> =================================== > >>>>> > >>>>> Introduction > >>>>> ------------ > >>>>> > >>>>> Usages of sparse matrices are numerous in scientific computing. > >>>>> When numerical linear algebra problems become large, sparse > >>>>> matrices become necessary to avoid memory overload and unnecessary > >>>>> computations, at the cost of element access and matrix construction. > >>>>> As a result, most large scale linear solvers or eigen solvers perform > >>>>> on sparse matrices. > >>>>> > >>>>> Fortunately, a very useful sparse matrix module has recently been > >>>>> introduced to GSL. > >>>>> However, important features are still lacking, such has > >>>>> Compressed Row Storage (CRS) matrices, input/output functions and > >>>>> other matrix properties and manipulation functions. > >>>>> This new version attempts to address this, conserving the original > >>>>> structure of the module and conventions. > >>>>> > >>>>> Major changes > >>>>> ------------- > >>>>> > >>>>> * Add CRS format and update functions manipulating compressed matrices : > >>>>> - additional flag GSL_SPMATRIX_CRS and macro GSLSP_ISMATRIX ( > >>>>> gsl_spmatrix.h ) > >>>>> - additional members innerSize and outerSize used to iterate > >>>>> matrix elements ( gsl_spmatrix.h ) > >>>>> - rename some variables for coherence ( gsl_spmatrix.h , *.c ) > >>>>> - update all functions on compressed matrices ( *.c ) > >>>>> * Allow to sum duplicate elements when compressing ( spcompress.c ) : > >>>>> - modify gsl_spmatrix_compress > >>>>> - add gsl_spmatrix_sum_duplicate > >>>>> * CCS <-> CRS and fast transpose inplace in spswap.c : > >>>>> - add gsl_spmatrix_switch_major > >>>>> - add gsl_spmatrix_transpose > >>>>> * Add printing and scanning functions in spio.c : > >>>>> - add gsl_spmatrix_fprintf > >>>>> - add gsl_spmatrix_fscanf > >>>>> * Add manipulation functions in spmanip.c (particularly useful for > >>>>> Markov chain transition matrices) : > >>>>> - add gsl_spmatrix_get_rowsum : get vector of sum over row > >>>>> elements > >>>>> - add gsl_spmatrix_get_colsum : get vector of sum over column > >>>>> elements > >>>>> - add gsl_spmatrix_div_rows : divide all elements of each row > >>>>> by a vector element > >>>>> - add gsl_spmatrix_div_cols : divide all elements of each > >>>>> column by a vector element > >>>>> * Add test functions in atprop.c : > >>>>> - add gsl_spmatrix_gt_elements : greater than test for each > >>>>> matrix > >>>>> element > >>>>> - add gsl_spmatrix_ge_elements : greater or equal than test for > >>>>> each matrix element > >>>>> - add gsl_spmatrix_lt_elements : lower than test for each matrix > >>>>> element > >>>>> - add gsl_spmatrix_le_elements : lower or equal than test for > >>>>> each matrix element > >>>>> - add gsl_spmatrix_any : test if any non-zero element in > >>>>> matrix > >>>>> > >>>>> Other minor changes have been made, such as error tests. > >>>>> test.c has also been updated to test new features. > >>>> > >>> > > > > > -- Alexis Tantet ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Sparse matrix extension 2016-02-07 1:03 ` Alexis Tantet @ 2016-02-07 17:25 ` Patrick Alken 2016-02-07 19:32 ` Alexis Tantet 0 siblings, 1 reply; 25+ messages in thread From: Patrick Alken @ 2016-02-07 17:25 UTC (permalink / raw) To: Alexis Tantet; +Cc: gsl-discuss On 02/06/2016 06:03 PM, Alexis Tantet wrote: > Hi Patrick, > > Thanks for working on it. Following, some comments: > > On Sun, Feb 7, 2016 at 1:02 AM, Patrick Alken <alken@colorado.edu> wrote: >> Hello, >> >> I've been working my way through your patch. There is a new branch on >> the git called 'sparse' with all the current changes. You've done a lot >> of work on your patch so its taking me a while to get through all of it. >> >> The main functionality of CRS is in there, along with basic stuff like >> memcpy, transpose_memcpy, transpose. Also dgemv works (and hence also >> the GMRES linear solver). > Since the GMRES is already implemented, do you think it would be worth > using the Arnoldi method to find eigenvalues and eigenvectors of > sparse non-hermitian matrices? If you think you can make a nice implementation, go for it. We could add it to the gsl_eigen interface. > >> I did some initial work on the fprintf/fscanf >> for the triplet case - for the compressed matrices, I am wondering if we >> should stick to a standard format such as Harwell-Boeing, I'm not sure >> at the moment. We might be able to avoid this issue by making a binary >> fwrite/fread interface which just directly writes the 3 arrays to disk. > I agree that a standard format would be best but I am not aware of any > dominant format. > Is Harwell-Boeing one of them? > It seems important to me to have both binary and formatted input/output > (for coherence with gsl_matrix and because formatted files are more > handy to share). > Also, it is important to have some header indicating how many elements > should be read > and what is the dimension of the matrix. I'm not aware of a standard format either, I'll try and look into this further to come up with a good solution. > >> Things I still need to add for the CRS storage: >> >> 1. spmatrix_add >> 2. spblas_dgemm >> 3. fprintf/fscanf >> 4. transpose - for now I'll keep it the way you coded it, but I think it >> would be better to keep the matrix in the same storage format during a >> transpose, ie: the user might want to pass A^T to an external solver >> library which requires CCS and they might not expect the transpose >> routine to change it to CRS. I did a quick search for in-place transpose >> algorithms in CCS/CRS and they do appear to exist though I haven't had >> time to implement it yet. > Fine, but then I think it is important to add a separate function, say > "transpose_switch_major", > for which the transposition is performed by just changing the major > (i.e. row->col or col->row) > since this is the cheapest way to transpose while some libraries may > work equally well with CRS or CCS. Ok I like this idea too since it is very efficient. > >> I removed the inner/outer idx stuff from gsl_spmatrix - its not >> necessary and also I'd like to avoid modifying the gsl_spmatrix struct >> since any changes would break binary compatibility for the next release. > That makes perfect sense, but then you need to test each time for the > major to know what is the size of the pointer? I am just making separate routines for CCS and CRS (ie: gsl_spmatrix_ccs and gsl_spmatrix_crs to do triplet -> compressed). I think its simpler this way. > >> Also your modification to gsl_spmatrix_set - I see what you're trying to >> do but I think its better to add a new function: >> >> double *gsl_spmatrix_ptr(gsl_spmatrix *m, size_t i, size_t j) >> >> which returns a pointer to the data array for element (i,j), and the >> user can then add whatever value they want (this is how its done with >> gsl_matrix and gsl_vector). I think it should be straightforward to add >> this function, possibly even for the compressed storage too. > Then I would also add a compressed matrix constructor (or an option) in which > gsl_spmatrix_ptr is used to sum duplicate entries instead of gsl_spmatrix_set. > This possibility is offered by some other libraries such as Eigen > and is for example useful to build matrices counting (e.g. transitions). I'm not sure what you mean. I've added a new function gsl_spmatrix_ptr to the git, which as far as I can tell does exactly what your sum_duplicate flag does. It searches the matrix for an (i,j) element, and if found returns a pointer. If not found a null pointer is returned. This makes it easy for the user to modify A(i,j) after it has been added to the matrix. Are you thinking of something else? Can you point me to the Eigen routine? > >> Anyway, just wanted to let you know things are progressing (though a >> little slowly). > Thanks a lot! Would you like me to work on something in particular? > > Best, > Alexis > >> Patrick >> >> On 01/20/2016 11:31 AM, Alexis Tantet wrote: >>> Hi Patrick, >>> >>> I've cloned gsl.git, made my modifications to spmatrix/ and spblas/ >>> and pushed them to master at: >>> https://github.com/atantet/gsl >>> >>> This time, I've tried to modify the code as little as possible. I've >>> usually added comments labelled "AT" with each modification. >>> I also have a few questions about the code which I've labelled "?AT". >>> Finally, I've added the corresponding tests to both test.c (successful >>> on my machine). >>> >>> I see that duplicates are handled for the triplet format by replacing. >>> Thus, I removed the function taking care of sorting and summing >>> duplicates that I had added to _compress (the trick was to transpose >>> with copy to reorder first, transpose back and remove the duplicates >>> then, taken from Eigen C++). Instead, I have added an option to sum >>> the duplicates (useful e.g. when counting links of a graph or >>> transitions of a Markov chain). One may need the third option of >>> repeating the triplets, but then avl_insert should also be modified. >>> >>> However, if I've understood well, no sorting of the inner indices is >>> performed right now. This could be problematic in the future (e.g. in >>> functions such as _equal). >>> >>> For _spdgemm for CRS matrices, I've used the trick (A B)^t = B^t A^t >>> to convert the problem to CSC. I've tested it successfully but you may >>> not like the coding style. >>> >>> I hope that helps, >>> Best, >>> Alexis >>> >>> On Tue, Jan 19, 2016 at 9:50 PM, Patrick Alken <alken@colorado.edu> wrote: >>>> On 01/19/2016 12:55 PM, Alexis Tantet wrote: >>>>> Hi Patrick, >>>>> >>>>> Thanks for the quick reply! I wanted to be sure that this contribution >>>>> is useful before to spend time on the merging with the latest version. >>>>> I will create the gsl.git repository and work on it during the week. >>>>> >>>>> I had already had a look at the documentation but did not know about >>>>> the iterative solvers (a link between each modules would be useful). >>>>> My contribution indeed fits in the sparse matrix module + the update >>>>> of the dgemm and dgemv functions to support CRS (an update may also be >>>>> needed for the solvers). >>>> The solvers only call dgemv (as far as I remember) so they shouldn't need an >>>> update once dgemv is updated. >>>> >>>>> I have also developed a simple C++ object allowing to use gsl_spmatrix >>>>> as a user-defined matrix in ARPACK++ (a maintained fork of ARPACK++ >>>>> can be found at https://github.com/m-reuter/arpackpp), allowing to >>>>> avoid having to use other libraries such as superLU. It could be >>>>> useful to others, maybe as an extension. Now that I think about it, >>>>> the iterative solvers could also be used to support the shift and >>>>> invert modes (see ARPACK++ documentation). What do you think (I could >>>>> work on it)? >>>> I've never used ARPACK, but if you want to make an extension to interface >>>> GSL/ARPACK its fine with me - such an extension would never be added to the >>>> main GSL code since GSL tries to be as standalone as possible. >>>> >>>> >>>>> If you have major comments, the sooner the better, so that I can work >>>>> on them while merging. >>>>> >>>>> Thank you for your interest, >>>>> Alexis >>>>> >>>>> On Tue, Jan 19, 2016 at 6:02 PM, Patrick Alken <alken@colorado.edu> wrote: >>>>>> Hi Alexis, >>>>>> >>>>>> This looks like very good work! Adding compressed row storage has been >>>>>> on >>>>>> my todo list for a while. The 'gslsp' extension is unfortunately very out >>>>>> of >>>>>> date, and the current git contains newer code (including a GMRES >>>>>> iterative >>>>>> linear solver). I removed the gslsp extension from the web page a while >>>>>> back >>>>>> to reflect this. You can browse the latest manual to see the current >>>>>> sparse >>>>>> matrix capabilities (http://www.gnu.org/software/gsl/manual/gsl-ref.pdf) >>>>>> - >>>>>> there are 3 chapters: sparse matrices, sparse blas and sparse linear >>>>>> algebra >>>>>> - it looks like your contributions will fit into the sparse matrices >>>>>> chapter. >>>>>> >>>>>> Would you be able to verify that your changes are compatible with the >>>>>> current gsl.git repository? This will make it much easier for me to merge >>>>>> everything into the git when ready. It would be best if you made a new >>>>>> branch of gsl.git, and add your changes so I can then pull them from >>>>>> github >>>>>> or somewhere. I will try to find some time in the next few days to look >>>>>> over >>>>>> your code. >>>>>> >>>>>> Thanks again, >>>>>> Patrick >>>>>> >>>>>> >>>>>> On 01/19/2016 09:43 AM, Alexis Tantet wrote: >>>>>>> Dear GSLers, >>>>>>> >>>>>>> As a scientist rather than a developer, I have developed an extension >>>>>>> of the sparse matrix module (CRS, I/O, manipulation, see below), which >>>>>>> I have tested. These modifications conserve the structure of the >>>>>>> original module and be useful for a large number of sparse matrices >>>>>>> users. >>>>>>> >>>>>>> I'm not familiar with the contributing process here. My repository can >>>>>>> be found there: >>>>>>> https://github.com/atantet/gslsp >>>>>>> Unfortunately, I did not know of the gsl.git repository and I forked it >>>>>>> froml: >>>>>>> https://github.com/drjerry/gslsp , >>>>>>> which seems to be a bit older than gsl.git. >>>>>>> >>>>>>> How can I push/merge to gsl.git ? Should it be as an update or another >>>>>>> extension? Is it necessary to adapt to the newest version of the code >>>>>>> ? >>>>>>> >>>>>>> Best regards, >>>>>>> Alexis Tantet >>>>>>> >>>>>>> CHANGES.md: >>>>>>> >>>>>>> Extension of the sparse matrix module of GSL >>>>>>> >>>>>>> =================================== >>>>>>> >>>>>>> Introduction >>>>>>> ------------ >>>>>>> >>>>>>> Usages of sparse matrices are numerous in scientific computing. >>>>>>> When numerical linear algebra problems become large, sparse >>>>>>> matrices become necessary to avoid memory overload and unnecessary >>>>>>> computations, at the cost of element access and matrix construction. >>>>>>> As a result, most large scale linear solvers or eigen solvers perform >>>>>>> on sparse matrices. >>>>>>> >>>>>>> Fortunately, a very useful sparse matrix module has recently been >>>>>>> introduced to GSL. >>>>>>> However, important features are still lacking, such has >>>>>>> Compressed Row Storage (CRS) matrices, input/output functions and >>>>>>> other matrix properties and manipulation functions. >>>>>>> This new version attempts to address this, conserving the original >>>>>>> structure of the module and conventions. >>>>>>> >>>>>>> Major changes >>>>>>> ------------- >>>>>>> >>>>>>> * Add CRS format and update functions manipulating compressed matrices : >>>>>>> - additional flag GSL_SPMATRIX_CRS and macro GSLSP_ISMATRIX ( >>>>>>> gsl_spmatrix.h ) >>>>>>> - additional members innerSize and outerSize used to iterate >>>>>>> matrix elements ( gsl_spmatrix.h ) >>>>>>> - rename some variables for coherence ( gsl_spmatrix.h , *.c ) >>>>>>> - update all functions on compressed matrices ( *.c ) >>>>>>> * Allow to sum duplicate elements when compressing ( spcompress.c ) : >>>>>>> - modify gsl_spmatrix_compress >>>>>>> - add gsl_spmatrix_sum_duplicate >>>>>>> * CCS <-> CRS and fast transpose inplace in spswap.c : >>>>>>> - add gsl_spmatrix_switch_major >>>>>>> - add gsl_spmatrix_transpose >>>>>>> * Add printing and scanning functions in spio.c : >>>>>>> - add gsl_spmatrix_fprintf >>>>>>> - add gsl_spmatrix_fscanf >>>>>>> * Add manipulation functions in spmanip.c (particularly useful for >>>>>>> Markov chain transition matrices) : >>>>>>> - add gsl_spmatrix_get_rowsum : get vector of sum over row >>>>>>> elements >>>>>>> - add gsl_spmatrix_get_colsum : get vector of sum over column >>>>>>> elements >>>>>>> - add gsl_spmatrix_div_rows : divide all elements of each row >>>>>>> by a vector element >>>>>>> - add gsl_spmatrix_div_cols : divide all elements of each >>>>>>> column by a vector element >>>>>>> * Add test functions in atprop.c : >>>>>>> - add gsl_spmatrix_gt_elements : greater than test for each >>>>>>> matrix >>>>>>> element >>>>>>> - add gsl_spmatrix_ge_elements : greater or equal than test for >>>>>>> each matrix element >>>>>>> - add gsl_spmatrix_lt_elements : lower than test for each matrix >>>>>>> element >>>>>>> - add gsl_spmatrix_le_elements : lower or equal than test for >>>>>>> each matrix element >>>>>>> - add gsl_spmatrix_any : test if any non-zero element in >>>>>>> matrix >>>>>>> >>>>>>> Other minor changes have been made, such as error tests. >>>>>>> test.c has also been updated to test new features. >>> > > ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Sparse matrix extension 2016-02-07 17:25 ` Patrick Alken @ 2016-02-07 19:32 ` Alexis Tantet 2016-02-07 20:14 ` Patrick Alken 0 siblings, 1 reply; 25+ messages in thread From: Alexis Tantet @ 2016-02-07 19:32 UTC (permalink / raw) To: Patrick Alken; +Cc: gsl-discuss Hi Patrick, On Sun, Feb 7, 2016 at 6:25 PM, Patrick Alken <alken@colorado.edu> wrote: > On 02/06/2016 06:03 PM, Alexis Tantet wrote: >> Hi Patrick, >> >> Thanks for working on it. Following, some comments: >> >> On Sun, Feb 7, 2016 at 1:02 AM, Patrick Alken <alken@colorado.edu> wrote: >>> Hello, >>> >>> I've been working my way through your patch. There is a new branch on >>> the git called 'sparse' with all the current changes. You've done a lot >>> of work on your patch so its taking me a while to get through all of it. >>> >>> The main functionality of CRS is in there, along with basic stuff like >>> memcpy, transpose_memcpy, transpose. Also dgemv works (and hence also >>> the GMRES linear solver). >> Since the GMRES is already implemented, do you think it would be worth >> using the Arnoldi method to find eigenvalues and eigenvectors of >> sparse non-hermitian matrices? > > If you think you can make a nice implementation, go for it. We could add > it to the gsl_eigen interface. > The basic elements of the Arnoldi iteration and the eigenvalue solver for dense non-hermitian matrices are there. The question is then which restarting method to use, e.g. the implicite restart of ARPACK. I'll see what I can do. >> >>> I did some initial work on the fprintf/fscanf >>> for the triplet case - for the compressed matrices, I am wondering if we >>> should stick to a standard format such as Harwell-Boeing, I'm not sure >>> at the moment. We might be able to avoid this issue by making a binary >>> fwrite/fread interface which just directly writes the 3 arrays to disk. >> I agree that a standard format would be best but I am not aware of any >> dominant format. >> Is Harwell-Boeing one of them? >> It seems important to me to have both binary and formatted input/output >> (for coherence with gsl_matrix and because formatted files are more >> handy to share). >> Also, it is important to have some header indicating how many elements >> should be read >> and what is the dimension of the matrix. > > I'm not aware of a standard format either, I'll try and look into this > further to come up with a good solution. > >> >>> Things I still need to add for the CRS storage: >>> >>> 1. spmatrix_add >>> 2. spblas_dgemm >>> 3. fprintf/fscanf >>> 4. transpose - for now I'll keep it the way you coded it, but I think it >>> would be better to keep the matrix in the same storage format during a >>> transpose, ie: the user might want to pass A^T to an external solver >>> library which requires CCS and they might not expect the transpose >>> routine to change it to CRS. I did a quick search for in-place transpose >>> algorithms in CCS/CRS and they do appear to exist though I haven't had >>> time to implement it yet. >> Fine, but then I think it is important to add a separate function, say >> "transpose_switch_major", >> for which the transposition is performed by just changing the major >> (i.e. row->col or col->row) >> since this is the cheapest way to transpose while some libraries may >> work equally well with CRS or CCS. > > Ok I like this idea too since it is very efficient. > >> >>> I removed the inner/outer idx stuff from gsl_spmatrix - its not >>> necessary and also I'd like to avoid modifying the gsl_spmatrix struct >>> since any changes would break binary compatibility for the next release. >> That makes perfect sense, but then you need to test each time for the >> major to know what is the size of the pointer? > > I am just making separate routines for CCS and CRS (ie: gsl_spmatrix_ccs > and gsl_spmatrix_crs to do triplet -> compressed). I think its simpler > this way. > >> >>> Also your modification to gsl_spmatrix_set - I see what you're trying to >>> do but I think its better to add a new function: >>> >>> double *gsl_spmatrix_ptr(gsl_spmatrix *m, size_t i, size_t j) >>> >>> which returns a pointer to the data array for element (i,j), and the >>> user can then add whatever value they want (this is how its done with >>> gsl_matrix and gsl_vector). I think it should be straightforward to add >>> this function, possibly even for the compressed storage too. >> Then I would also add a compressed matrix constructor (or an option) in which >> gsl_spmatrix_ptr is used to sum duplicate entries instead of gsl_spmatrix_set. >> This possibility is offered by some other libraries such as Eigen >> and is for example useful to build matrices counting (e.g. transitions). > > I'm not sure what you mean. I've added a new function gsl_spmatrix_ptr > to the git, which as far as I can tell does exactly what your > sum_duplicate flag does. It searches the matrix for an (i,j) element, > and if found returns a pointer. If not found a null pointer is returned. > This makes it easy for the user to modify A(i,j) after it has been added > to the matrix. Are you thinking of something else? Can you point me to > the Eigen routine? > What I meant is to have the equivalent of gsl_spmatrix_compress, with the difference that gsl_spmatrix_ptr is used instead of gsl_spmatrix_set, so has to build the compressed matrix from triplets, summing the duplicates, instead of replacing them. This is what is done here : The http://eigen.tuxfamily.org/dox/classEigen_1_1SparseMatrix.html#a5bcf3187e372ff7cea1e8f61152ae49b Best, Alexis >> >>> Anyway, just wanted to let you know things are progressing (though a >>> little slowly). >> Thanks a lot! Would you like me to work on something in particular? >> >> Best, >> Alexis >> >>> Patrick >>> >>> On 01/20/2016 11:31 AM, Alexis Tantet wrote: >>>> Hi Patrick, >>>> >>>> I've cloned gsl.git, made my modifications to spmatrix/ and spblas/ >>>> and pushed them to master at: >>>> https://github.com/atantet/gsl >>>> >>>> This time, I've tried to modify the code as little as possible. I've >>>> usually added comments labelled "AT" with each modification. >>>> I also have a few questions about the code which I've labelled "?AT". >>>> Finally, I've added the corresponding tests to both test.c (successful >>>> on my machine). >>>> >>>> I see that duplicates are handled for the triplet format by replacing. >>>> Thus, I removed the function taking care of sorting and summing >>>> duplicates that I had added to _compress (the trick was to transpose >>>> with copy to reorder first, transpose back and remove the duplicates >>>> then, taken from Eigen C++). Instead, I have added an option to sum >>>> the duplicates (useful e.g. when counting links of a graph or >>>> transitions of a Markov chain). One may need the third option of >>>> repeating the triplets, but then avl_insert should also be modified. >>>> >>>> However, if I've understood well, no sorting of the inner indices is >>>> performed right now. This could be problematic in the future (e.g. in >>>> functions such as _equal). >>>> >>>> For _spdgemm for CRS matrices, I've used the trick (A B)^t = B^t A^t >>>> to convert the problem to CSC. I've tested it successfully but you may >>>> not like the coding style. >>>> >>>> I hope that helps, >>>> Best, >>>> Alexis >>>> >>>> On Tue, Jan 19, 2016 at 9:50 PM, Patrick Alken <alken@colorado.edu> wrote: >>>>> On 01/19/2016 12:55 PM, Alexis Tantet wrote: >>>>>> Hi Patrick, >>>>>> >>>>>> Thanks for the quick reply! I wanted to be sure that this contribution >>>>>> is useful before to spend time on the merging with the latest version. >>>>>> I will create the gsl.git repository and work on it during the week. >>>>>> >>>>>> I had already had a look at the documentation but did not know about >>>>>> the iterative solvers (a link between each modules would be useful). >>>>>> My contribution indeed fits in the sparse matrix module + the update >>>>>> of the dgemm and dgemv functions to support CRS (an update may also be >>>>>> needed for the solvers). >>>>> The solvers only call dgemv (as far as I remember) so they shouldn't need an >>>>> update once dgemv is updated. >>>>> >>>>>> I have also developed a simple C++ object allowing to use gsl_spmatrix >>>>>> as a user-defined matrix in ARPACK++ (a maintained fork of ARPACK++ >>>>>> can be found at https://github.com/m-reuter/arpackpp), allowing to >>>>>> avoid having to use other libraries such as superLU. It could be >>>>>> useful to others, maybe as an extension. Now that I think about it, >>>>>> the iterative solvers could also be used to support the shift and >>>>>> invert modes (see ARPACK++ documentation). What do you think (I could >>>>>> work on it)? >>>>> I've never used ARPACK, but if you want to make an extension to interface >>>>> GSL/ARPACK its fine with me - such an extension would never be added to the >>>>> main GSL code since GSL tries to be as standalone as possible. >>>>> >>>>> >>>>>> If you have major comments, the sooner the better, so that I can work >>>>>> on them while merging. >>>>>> >>>>>> Thank you for your interest, >>>>>> Alexis >>>>>> >>>>>> On Tue, Jan 19, 2016 at 6:02 PM, Patrick Alken <alken@colorado.edu> wrote: >>>>>>> Hi Alexis, >>>>>>> >>>>>>> This looks like very good work! Adding compressed row storage has been >>>>>>> on >>>>>>> my todo list for a while. The 'gslsp' extension is unfortunately very out >>>>>>> of >>>>>>> date, and the current git contains newer code (including a GMRES >>>>>>> iterative >>>>>>> linear solver). I removed the gslsp extension from the web page a while >>>>>>> back >>>>>>> to reflect this. You can browse the latest manual to see the current >>>>>>> sparse >>>>>>> matrix capabilities (http://www.gnu.org/software/gsl/manual/gsl-ref.pdf) >>>>>>> - >>>>>>> there are 3 chapters: sparse matrices, sparse blas and sparse linear >>>>>>> algebra >>>>>>> - it looks like your contributions will fit into the sparse matrices >>>>>>> chapter. >>>>>>> >>>>>>> Would you be able to verify that your changes are compatible with the >>>>>>> current gsl.git repository? This will make it much easier for me to merge >>>>>>> everything into the git when ready. It would be best if you made a new >>>>>>> branch of gsl.git, and add your changes so I can then pull them from >>>>>>> github >>>>>>> or somewhere. I will try to find some time in the next few days to look >>>>>>> over >>>>>>> your code. >>>>>>> >>>>>>> Thanks again, >>>>>>> Patrick >>>>>>> >>>>>>> >>>>>>> On 01/19/2016 09:43 AM, Alexis Tantet wrote: >>>>>>>> Dear GSLers, >>>>>>>> >>>>>>>> As a scientist rather than a developer, I have developed an extension >>>>>>>> of the sparse matrix module (CRS, I/O, manipulation, see below), which >>>>>>>> I have tested. These modifications conserve the structure of the >>>>>>>> original module and be useful for a large number of sparse matrices >>>>>>>> users. >>>>>>>> >>>>>>>> I'm not familiar with the contributing process here. My repository can >>>>>>>> be found there: >>>>>>>> https://github.com/atantet/gslsp >>>>>>>> Unfortunately, I did not know of the gsl.git repository and I forked it >>>>>>>> froml: >>>>>>>> https://github.com/drjerry/gslsp , >>>>>>>> which seems to be a bit older than gsl.git. >>>>>>>> >>>>>>>> How can I push/merge to gsl.git ? Should it be as an update or another >>>>>>>> extension? Is it necessary to adapt to the newest version of the code >>>>>>>> ? >>>>>>>> >>>>>>>> Best regards, >>>>>>>> Alexis Tantet >>>>>>>> >>>>>>>> CHANGES.md: >>>>>>>> >>>>>>>> Extension of the sparse matrix module of GSL >>>>>>>> >>>>>>>> =================================== >>>>>>>> >>>>>>>> Introduction >>>>>>>> ------------ >>>>>>>> >>>>>>>> Usages of sparse matrices are numerous in scientific computing. >>>>>>>> When numerical linear algebra problems become large, sparse >>>>>>>> matrices become necessary to avoid memory overload and unnecessary >>>>>>>> computations, at the cost of element access and matrix construction. >>>>>>>> As a result, most large scale linear solvers or eigen solvers perform >>>>>>>> on sparse matrices. >>>>>>>> >>>>>>>> Fortunately, a very useful sparse matrix module has recently been >>>>>>>> introduced to GSL. >>>>>>>> However, important features are still lacking, such has >>>>>>>> Compressed Row Storage (CRS) matrices, input/output functions and >>>>>>>> other matrix properties and manipulation functions. >>>>>>>> This new version attempts to address this, conserving the original >>>>>>>> structure of the module and conventions. >>>>>>>> >>>>>>>> Major changes >>>>>>>> ------------- >>>>>>>> >>>>>>>> * Add CRS format and update functions manipulating compressed matrices : >>>>>>>> - additional flag GSL_SPMATRIX_CRS and macro GSLSP_ISMATRIX ( >>>>>>>> gsl_spmatrix.h ) >>>>>>>> - additional members innerSize and outerSize used to iterate >>>>>>>> matrix elements ( gsl_spmatrix.h ) >>>>>>>> - rename some variables for coherence ( gsl_spmatrix.h , *.c ) >>>>>>>> - update all functions on compressed matrices ( *.c ) >>>>>>>> * Allow to sum duplicate elements when compressing ( spcompress.c ) : >>>>>>>> - modify gsl_spmatrix_compress >>>>>>>> - add gsl_spmatrix_sum_duplicate >>>>>>>> * CCS <-> CRS and fast transpose inplace in spswap.c : >>>>>>>> - add gsl_spmatrix_switch_major >>>>>>>> - add gsl_spmatrix_transpose >>>>>>>> * Add printing and scanning functions in spio.c : >>>>>>>> - add gsl_spmatrix_fprintf >>>>>>>> - add gsl_spmatrix_fscanf >>>>>>>> * Add manipulation functions in spmanip.c (particularly useful for >>>>>>>> Markov chain transition matrices) : >>>>>>>> - add gsl_spmatrix_get_rowsum : get vector of sum over row >>>>>>>> elements >>>>>>>> - add gsl_spmatrix_get_colsum : get vector of sum over column >>>>>>>> elements >>>>>>>> - add gsl_spmatrix_div_rows : divide all elements of each row >>>>>>>> by a vector element >>>>>>>> - add gsl_spmatrix_div_cols : divide all elements of each >>>>>>>> column by a vector element >>>>>>>> * Add test functions in atprop.c : >>>>>>>> - add gsl_spmatrix_gt_elements : greater than test for each >>>>>>>> matrix >>>>>>>> element >>>>>>>> - add gsl_spmatrix_ge_elements : greater or equal than test for >>>>>>>> each matrix element >>>>>>>> - add gsl_spmatrix_lt_elements : lower than test for each matrix >>>>>>>> element >>>>>>>> - add gsl_spmatrix_le_elements : lower or equal than test for >>>>>>>> each matrix element >>>>>>>> - add gsl_spmatrix_any : test if any non-zero element in >>>>>>>> matrix >>>>>>>> >>>>>>>> Other minor changes have been made, such as error tests. >>>>>>>> test.c has also been updated to test new features. >>>> >> >> > -- Alexis Tantet ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Sparse matrix extension 2016-02-07 19:32 ` Alexis Tantet @ 2016-02-07 20:14 ` Patrick Alken 2016-02-07 20:31 ` Alexis Tantet 0 siblings, 1 reply; 25+ messages in thread From: Patrick Alken @ 2016-02-07 20:14 UTC (permalink / raw) To: Alexis Tantet; +Cc: gsl-discuss Hi Alexis, >> I'm not sure what you mean. I've added a new function gsl_spmatrix_ptr >> to the git, which as far as I can tell does exactly what your >> sum_duplicate flag does. It searches the matrix for an (i,j) element, >> and if found returns a pointer. If not found a null pointer is returned. >> This makes it easy for the user to modify A(i,j) after it has been added >> to the matrix. Are you thinking of something else? Can you point me to >> the Eigen routine? >> > What I meant is to have the equivalent of gsl_spmatrix_compress, > with the difference that gsl_spmatrix_ptr is used instead of gsl_spmatrix_set, > so has to build the compressed matrix from triplets, summing the > duplicates, instead of replacing them. > This is what is done here : > The http://eigen.tuxfamily.org/dox/classEigen_1_1SparseMatrix.html#a5bcf3187e372ff7cea1e8f61152ae49b > > Best, > Alexis I'm not sure why a user would ever need to do this. The whole point of the binary tree structure in the triplet storage is to efficiently find duplicate entries, so that if a user tries to call gsl_spmatrix_set on an element which is already been previously set, it can find that element with a binary search (rather than linearly searching the arrays) and change the value of that element. Therefore, the way the triplet storage is designed, there is will never be a duplicate element in the triplet arrays. All of the (i[n],j[n]) will be unique for each n <= nz. Am I missing something? Patrick ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Sparse matrix extension 2016-02-07 20:14 ` Patrick Alken @ 2016-02-07 20:31 ` Alexis Tantet 2016-02-07 21:34 ` Patrick Alken 0 siblings, 1 reply; 25+ messages in thread From: Alexis Tantet @ 2016-02-07 20:31 UTC (permalink / raw) To: Patrick Alken; +Cc: gsl-discuss I'm not sure I got your last point. I have the following situation in mind: Start to construct a transition matrix in triplet format, adding one element after another. In this particular example, each element is one count of a transition from (state, box, etc.) i to j, so I add elements (i, j, 1) to the triplet object, with possibly duplicates. What happen to these duplicates in the binary tree? Eventually, when I compress to CRS or CCS, I would like the duplicates to be summed up, so that element (i, j) counts transitions from i to j (and no duplicates exist after compression). Is this more clear? On Sun, Feb 7, 2016 at 9:14 PM, Patrick Alken <alken@colorado.edu> wrote: > Hi Alexis, > >>> I'm not sure what you mean. I've added a new function gsl_spmatrix_ptr >>> to the git, which as far as I can tell does exactly what your >>> sum_duplicate flag does. It searches the matrix for an (i,j) element, >>> and if found returns a pointer. If not found a null pointer is returned. >>> This makes it easy for the user to modify A(i,j) after it has been added >>> to the matrix. Are you thinking of something else? Can you point me to >>> the Eigen routine? >>> >> What I meant is to have the equivalent of gsl_spmatrix_compress, >> with the difference that gsl_spmatrix_ptr is used instead of gsl_spmatrix_set, >> so has to build the compressed matrix from triplets, summing the >> duplicates, instead of replacing them. >> This is what is done here : >> The http://eigen.tuxfamily.org/dox/classEigen_1_1SparseMatrix.html#a5bcf3187e372ff7cea1e8f61152ae49b >> >> Best, >> Alexis > > I'm not sure why a user would ever need to do this. The whole point of > the binary tree structure in the triplet storage is to efficiently find > duplicate entries, so that if a user tries to call gsl_spmatrix_set on > an element which is already been previously set, it can find that > element with a binary search (rather than linearly searching the arrays) > and change the value of that element. > > Therefore, the way the triplet storage is designed, there is will never > be a duplicate element in the triplet arrays. All of the (i[n],j[n]) > will be unique for each n <= nz. > > Am I missing something? > > Patrick -- Alexis Tantet ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Sparse matrix extension 2016-02-07 20:31 ` Alexis Tantet @ 2016-02-07 21:34 ` Patrick Alken 2016-02-08 0:59 ` Alexis Tantet 0 siblings, 1 reply; 25+ messages in thread From: Patrick Alken @ 2016-02-07 21:34 UTC (permalink / raw) To: Alexis Tantet; +Cc: gsl-discuss By design, gsl_spmatrix_set won't allow you to do this. If you add element (i, j, x) and then later to try add element (i, j, y), gsl_spmatrix_set will detect that there exists an element in the (i, j) spot and it will simply change x to y - the value of x will be overwritten by y. This is the same behavior as gsl_matrix_set. So no duplicates are allowed by design. If you have such an application where you want to keep track of duplicates, you could do the following: double *ptr = gsl_spmatrix_ptr(m, i, j); if (ptr) *ptr += x; /* sum duplicate values */ else gsl_spmatrix_set(m, i, j, x); /* initalize to x */ On 02/07/2016 01:31 PM, Alexis Tantet wrote: > I'm not sure I got your last point. I have the following situation in mind: > > Start to construct a transition matrix in triplet format, adding one > element after another. > In this particular example, each element is one count of a transition > from (state, box, etc.) i to j, > so I add elements (i, j, 1) to the triplet object, with possibly duplicates. > What happen to these duplicates in the binary tree? > > Eventually, when I compress to CRS or CCS, I would like the duplicates > to be summed up, so that element (i, j) counts transitions from i to j > (and no duplicates exist after compression). > > Is this more clear? > > On Sun, Feb 7, 2016 at 9:14 PM, Patrick Alken <alken@colorado.edu> wrote: >> Hi Alexis, >> >>>> I'm not sure what you mean. I've added a new function gsl_spmatrix_ptr >>>> to the git, which as far as I can tell does exactly what your >>>> sum_duplicate flag does. It searches the matrix for an (i,j) element, >>>> and if found returns a pointer. If not found a null pointer is returned. >>>> This makes it easy for the user to modify A(i,j) after it has been added >>>> to the matrix. Are you thinking of something else? Can you point me to >>>> the Eigen routine? >>>> >>> What I meant is to have the equivalent of gsl_spmatrix_compress, >>> with the difference that gsl_spmatrix_ptr is used instead of gsl_spmatrix_set, >>> so has to build the compressed matrix from triplets, summing the >>> duplicates, instead of replacing them. >>> This is what is done here : >>> The http://eigen.tuxfamily.org/dox/classEigen_1_1SparseMatrix.html#a5bcf3187e372ff7cea1e8f61152ae49b >>> >>> Best, >>> Alexis >> I'm not sure why a user would ever need to do this. The whole point of >> the binary tree structure in the triplet storage is to efficiently find >> duplicate entries, so that if a user tries to call gsl_spmatrix_set on >> an element which is already been previously set, it can find that >> element with a binary search (rather than linearly searching the arrays) >> and change the value of that element. >> >> Therefore, the way the triplet storage is designed, there is will never >> be a duplicate element in the triplet arrays. All of the (i[n],j[n]) >> will be unique for each n <= nz. >> >> Am I missing something? >> >> Patrick > > ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Sparse matrix extension 2016-02-07 21:34 ` Patrick Alken @ 2016-02-08 0:59 ` Alexis Tantet 2016-02-10 13:16 ` Alexis Tantet 0 siblings, 1 reply; 25+ messages in thread From: Alexis Tantet @ 2016-02-08 0:59 UTC (permalink / raw) To: Patrick Alken; +Cc: gsl-discuss Ok, my mistake, now I see where I got confused. I had in mind to add all the elements first to the triplets and only while converting to compressed sum up the duplicates. While, indeed, if there's a way you can sum up the duplicates directly while adding them to the triplet matrix (thanks to _ptr), this is more handy and efficient. Thanks for the clarification, Alexis On Sun, Feb 7, 2016 at 10:34 PM, Patrick Alken <alken@colorado.edu> wrote: > By design, gsl_spmatrix_set won't allow you to do this. > > If you add element (i, j, x) and then later to try add element (i, j, > y), gsl_spmatrix_set will detect that there exists an element in the (i, > j) spot and it will simply change x to y - the value of x will be > overwritten by y. This is the same behavior as gsl_matrix_set. > > So no duplicates are allowed by design. If you have such an application > where you want to keep track of duplicates, you could do the following: > > double *ptr = gsl_spmatrix_ptr(m, i, j); > if (ptr) > *ptr += x; /* sum duplicate values */ > else > gsl_spmatrix_set(m, i, j, x); /* initalize to x */ > > On 02/07/2016 01:31 PM, Alexis Tantet wrote: >> I'm not sure I got your last point. I have the following situation in mind: >> >> Start to construct a transition matrix in triplet format, adding one >> element after another. >> In this particular example, each element is one count of a transition >> from (state, box, etc.) i to j, >> so I add elements (i, j, 1) to the triplet object, with possibly duplicates. >> What happen to these duplicates in the binary tree? >> >> Eventually, when I compress to CRS or CCS, I would like the duplicates >> to be summed up, so that element (i, j) counts transitions from i to j >> (and no duplicates exist after compression). >> >> Is this more clear? >> >> On Sun, Feb 7, 2016 at 9:14 PM, Patrick Alken <alken@colorado.edu> wrote: >>> Hi Alexis, >>> >>>>> I'm not sure what you mean. I've added a new function gsl_spmatrix_ptr >>>>> to the git, which as far as I can tell does exactly what your >>>>> sum_duplicate flag does. It searches the matrix for an (i,j) element, >>>>> and if found returns a pointer. If not found a null pointer is returned. >>>>> This makes it easy for the user to modify A(i,j) after it has been added >>>>> to the matrix. Are you thinking of something else? Can you point me to >>>>> the Eigen routine? >>>>> >>>> What I meant is to have the equivalent of gsl_spmatrix_compress, >>>> with the difference that gsl_spmatrix_ptr is used instead of gsl_spmatrix_set, >>>> so has to build the compressed matrix from triplets, summing the >>>> duplicates, instead of replacing them. >>>> This is what is done here : >>>> The http://eigen.tuxfamily.org/dox/classEigen_1_1SparseMatrix.html#a5bcf3187e372ff7cea1e8f61152ae49b >>>> >>>> Best, >>>> Alexis >>> I'm not sure why a user would ever need to do this. The whole point of >>> the binary tree structure in the triplet storage is to efficiently find >>> duplicate entries, so that if a user tries to call gsl_spmatrix_set on >>> an element which is already been previously set, it can find that >>> element with a binary search (rather than linearly searching the arrays) >>> and change the value of that element. >>> >>> Therefore, the way the triplet storage is designed, there is will never >>> be a duplicate element in the triplet arrays. All of the (i[n],j[n]) >>> will be unique for each n <= nz. >>> >>> Am I missing something? >>> >>> Patrick >> >> > -- Alexis Tantet ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Sparse matrix extension 2016-02-08 0:59 ` Alexis Tantet @ 2016-02-10 13:16 ` Alexis Tantet 2016-02-10 13:48 ` Alexis Tantet 2016-02-10 15:56 ` Patrick Alken 0 siblings, 2 replies; 25+ messages in thread From: Alexis Tantet @ 2016-02-10 13:16 UTC (permalink / raw) To: Patrick Alken; +Cc: gsl-discuss Hi Patrick, Regarding the file format for sparse matrices, the one I have coded actually happen to be the coordinate format implemented by Matrix Market (the platform to share test data such as sparse matrices), with the addition of a matrix type header: http://math.nist.gov/MatrixMarket/formats.html It is also written that "Harwell-Boeing" is the most commonly used sparse matrix format, but that: "Unfortunately the HB specification is somewhat complex difficult to parse from languages other than Fortran, biased in favor of compressed column representation and not easily extensible. Some of these factors may have impeded the more widespread use of the HB format" It seems to me that complying to the Matrix Market coordinate format would be the right choice, in terms of ease of implementation, compliance to other packages and user-friendliness. I could update the print/scan functions accordingly (mostly handling the header). What do you think? Best, Alexis On Mon, Feb 8, 2016 at 1:59 AM, Alexis Tantet <alexis.tantet@gmail.com> wrote: > Ok, my mistake, now I see where I got confused. > I had in mind to add all the elements first to the triplets and only > while converting to compressed sum up the duplicates. > While, indeed, if there's a way you can sum up the duplicates directly > while adding them to the triplet matrix (thanks to _ptr), this is more > handy and efficient. > > Thanks for the clarification, > Alexis > > On Sun, Feb 7, 2016 at 10:34 PM, Patrick Alken <alken@colorado.edu> wrote: >> By design, gsl_spmatrix_set won't allow you to do this. >> >> If you add element (i, j, x) and then later to try add element (i, j, >> y), gsl_spmatrix_set will detect that there exists an element in the (i, >> j) spot and it will simply change x to y - the value of x will be >> overwritten by y. This is the same behavior as gsl_matrix_set. >> >> So no duplicates are allowed by design. If you have such an application >> where you want to keep track of duplicates, you could do the following: >> >> double *ptr = gsl_spmatrix_ptr(m, i, j); >> if (ptr) >> *ptr += x; /* sum duplicate values */ >> else >> gsl_spmatrix_set(m, i, j, x); /* initalize to x */ >> >> On 02/07/2016 01:31 PM, Alexis Tantet wrote: >>> I'm not sure I got your last point. I have the following situation in mind: >>> >>> Start to construct a transition matrix in triplet format, adding one >>> element after another. >>> In this particular example, each element is one count of a transition >>> from (state, box, etc.) i to j, >>> so I add elements (i, j, 1) to the triplet object, with possibly duplicates. >>> What happen to these duplicates in the binary tree? >>> >>> Eventually, when I compress to CRS or CCS, I would like the duplicates >>> to be summed up, so that element (i, j) counts transitions from i to j >>> (and no duplicates exist after compression). >>> >>> Is this more clear? >>> >>> On Sun, Feb 7, 2016 at 9:14 PM, Patrick Alken <alken@colorado.edu> wrote: >>>> Hi Alexis, >>>> >>>>>> I'm not sure what you mean. I've added a new function gsl_spmatrix_ptr >>>>>> to the git, which as far as I can tell does exactly what your >>>>>> sum_duplicate flag does. It searches the matrix for an (i,j) element, >>>>>> and if found returns a pointer. If not found a null pointer is returned. >>>>>> This makes it easy for the user to modify A(i,j) after it has been added >>>>>> to the matrix. Are you thinking of something else? Can you point me to >>>>>> the Eigen routine? >>>>>> >>>>> What I meant is to have the equivalent of gsl_spmatrix_compress, >>>>> with the difference that gsl_spmatrix_ptr is used instead of gsl_spmatrix_set, >>>>> so has to build the compressed matrix from triplets, summing the >>>>> duplicates, instead of replacing them. >>>>> This is what is done here : >>>>> The http://eigen.tuxfamily.org/dox/classEigen_1_1SparseMatrix.html#a5bcf3187e372ff7cea1e8f61152ae49b >>>>> >>>>> Best, >>>>> Alexis >>>> I'm not sure why a user would ever need to do this. The whole point of >>>> the binary tree structure in the triplet storage is to efficiently find >>>> duplicate entries, so that if a user tries to call gsl_spmatrix_set on >>>> an element which is already been previously set, it can find that >>>> element with a binary search (rather than linearly searching the arrays) >>>> and change the value of that element. >>>> >>>> Therefore, the way the triplet storage is designed, there is will never >>>> be a duplicate element in the triplet arrays. All of the (i[n],j[n]) >>>> will be unique for each n <= nz. >>>> >>>> Am I missing something? >>>> >>>> Patrick >>> >>> >> > > > > -- > Alexis Tantet -- Alexis Tantet ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Sparse matrix extension 2016-02-10 13:16 ` Alexis Tantet @ 2016-02-10 13:48 ` Alexis Tantet 2016-02-10 15:56 ` Patrick Alken 1 sibling, 0 replies; 25+ messages in thread From: Alexis Tantet @ 2016-02-10 13:48 UTC (permalink / raw) To: Patrick Alken; +Cc: gsl-discuss I actually just had to add the one line header to _fprintf and _fscanf did not need to be modified (sum_duplicate should eventually be removed though). I have pushed the change to my clone on branch dev. Also, in spmanip.c, there are some function returning a pointer to a gsl_vector. It may be better to let them return an exit status instead and put the vector as an output argument. Alexis On Wed, Feb 10, 2016 at 2:16 PM, Alexis Tantet <alexis.tantet@gmail.com> wrote: > Hi Patrick, > > Regarding the file format for sparse matrices, the one I have coded > actually happen to be the coordinate format implemented by Matrix > Market (the platform to share test data such as sparse matrices), with > the addition of a matrix type header: > http://math.nist.gov/MatrixMarket/formats.html > > It is also written that "Harwell-Boeing" is the most commonly used > sparse matrix format, but that: > "Unfortunately the HB specification is somewhat complex difficult to > parse from languages other than Fortran, biased in favor of compressed > column representation and not easily extensible. Some of these factors > may have impeded the more widespread use of the HB format" > > It seems to me that complying to the Matrix Market coordinate format > would be the right choice, in terms of ease of implementation, > compliance to other packages and user-friendliness. I could update the > print/scan functions accordingly (mostly handling the header). What do > you think? > > Best, > Alexis > > > > > On Mon, Feb 8, 2016 at 1:59 AM, Alexis Tantet <alexis.tantet@gmail.com> wrote: >> Ok, my mistake, now I see where I got confused. >> I had in mind to add all the elements first to the triplets and only >> while converting to compressed sum up the duplicates. >> While, indeed, if there's a way you can sum up the duplicates directly >> while adding them to the triplet matrix (thanks to _ptr), this is more >> handy and efficient. >> >> Thanks for the clarification, >> Alexis >> >> On Sun, Feb 7, 2016 at 10:34 PM, Patrick Alken <alken@colorado.edu> wrote: >>> By design, gsl_spmatrix_set won't allow you to do this. >>> >>> If you add element (i, j, x) and then later to try add element (i, j, >>> y), gsl_spmatrix_set will detect that there exists an element in the (i, >>> j) spot and it will simply change x to y - the value of x will be >>> overwritten by y. This is the same behavior as gsl_matrix_set. >>> >>> So no duplicates are allowed by design. If you have such an application >>> where you want to keep track of duplicates, you could do the following: >>> >>> double *ptr = gsl_spmatrix_ptr(m, i, j); >>> if (ptr) >>> *ptr += x; /* sum duplicate values */ >>> else >>> gsl_spmatrix_set(m, i, j, x); /* initalize to x */ >>> >>> On 02/07/2016 01:31 PM, Alexis Tantet wrote: >>>> I'm not sure I got your last point. I have the following situation in mind: >>>> >>>> Start to construct a transition matrix in triplet format, adding one >>>> element after another. >>>> In this particular example, each element is one count of a transition >>>> from (state, box, etc.) i to j, >>>> so I add elements (i, j, 1) to the triplet object, with possibly duplicates. >>>> What happen to these duplicates in the binary tree? >>>> >>>> Eventually, when I compress to CRS or CCS, I would like the duplicates >>>> to be summed up, so that element (i, j) counts transitions from i to j >>>> (and no duplicates exist after compression). >>>> >>>> Is this more clear? >>>> >>>> On Sun, Feb 7, 2016 at 9:14 PM, Patrick Alken <alken@colorado.edu> wrote: >>>>> Hi Alexis, >>>>> >>>>>>> I'm not sure what you mean. I've added a new function gsl_spmatrix_ptr >>>>>>> to the git, which as far as I can tell does exactly what your >>>>>>> sum_duplicate flag does. It searches the matrix for an (i,j) element, >>>>>>> and if found returns a pointer. If not found a null pointer is returned. >>>>>>> This makes it easy for the user to modify A(i,j) after it has been added >>>>>>> to the matrix. Are you thinking of something else? Can you point me to >>>>>>> the Eigen routine? >>>>>>> >>>>>> What I meant is to have the equivalent of gsl_spmatrix_compress, >>>>>> with the difference that gsl_spmatrix_ptr is used instead of gsl_spmatrix_set, >>>>>> so has to build the compressed matrix from triplets, summing the >>>>>> duplicates, instead of replacing them. >>>>>> This is what is done here : >>>>>> The http://eigen.tuxfamily.org/dox/classEigen_1_1SparseMatrix.html#a5bcf3187e372ff7cea1e8f61152ae49b >>>>>> >>>>>> Best, >>>>>> Alexis >>>>> I'm not sure why a user would ever need to do this. The whole point of >>>>> the binary tree structure in the triplet storage is to efficiently find >>>>> duplicate entries, so that if a user tries to call gsl_spmatrix_set on >>>>> an element which is already been previously set, it can find that >>>>> element with a binary search (rather than linearly searching the arrays) >>>>> and change the value of that element. >>>>> >>>>> Therefore, the way the triplet storage is designed, there is will never >>>>> be a duplicate element in the triplet arrays. All of the (i[n],j[n]) >>>>> will be unique for each n <= nz. >>>>> >>>>> Am I missing something? >>>>> >>>>> Patrick >>>> >>>> >>> >> >> >> >> -- >> Alexis Tantet > > > > -- > Alexis Tantet -- Alexis Tantet ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Sparse matrix extension 2016-02-10 13:16 ` Alexis Tantet 2016-02-10 13:48 ` Alexis Tantet @ 2016-02-10 15:56 ` Patrick Alken 2016-02-12 10:43 ` Alexis Tantet 1 sibling, 1 reply; 25+ messages in thread From: Patrick Alken @ 2016-02-10 15:56 UTC (permalink / raw) To: Alexis Tantet; +Cc: gsl-discuss I'm in favor of simplicity and easy-parsing, so matrix market sounds good to me. I'll take a look at your latest code in the next few days. Patrick On 02/10/2016 06:16 AM, Alexis Tantet wrote: > Hi Patrick, > > Regarding the file format for sparse matrices, the one I have coded > actually happen to be the coordinate format implemented by Matrix > Market (the platform to share test data such as sparse matrices), with > the addition of a matrix type header: > http://math.nist.gov/MatrixMarket/formats.html > > It is also written that "Harwell-Boeing" is the most commonly used > sparse matrix format, but that: > "Unfortunately the HB specification is somewhat complex difficult to > parse from languages other than Fortran, biased in favor of compressed > column representation and not easily extensible. Some of these factors > may have impeded the more widespread use of the HB format" > > It seems to me that complying to the Matrix Market coordinate format > would be the right choice, in terms of ease of implementation, > compliance to other packages and user-friendliness. I could update the > print/scan functions accordingly (mostly handling the header). What do > you think? > > Best, > Alexis > > > > > On Mon, Feb 8, 2016 at 1:59 AM, Alexis Tantet <alexis.tantet@gmail.com> wrote: >> Ok, my mistake, now I see where I got confused. >> I had in mind to add all the elements first to the triplets and only >> while converting to compressed sum up the duplicates. >> While, indeed, if there's a way you can sum up the duplicates directly >> while adding them to the triplet matrix (thanks to _ptr), this is more >> handy and efficient. >> >> Thanks for the clarification, >> Alexis >> >> On Sun, Feb 7, 2016 at 10:34 PM, Patrick Alken <alken@colorado.edu> wrote: >>> By design, gsl_spmatrix_set won't allow you to do this. >>> >>> If you add element (i, j, x) and then later to try add element (i, j, >>> y), gsl_spmatrix_set will detect that there exists an element in the (i, >>> j) spot and it will simply change x to y - the value of x will be >>> overwritten by y. This is the same behavior as gsl_matrix_set. >>> >>> So no duplicates are allowed by design. If you have such an application >>> where you want to keep track of duplicates, you could do the following: >>> >>> double *ptr = gsl_spmatrix_ptr(m, i, j); >>> if (ptr) >>> *ptr += x; /* sum duplicate values */ >>> else >>> gsl_spmatrix_set(m, i, j, x); /* initalize to x */ >>> >>> On 02/07/2016 01:31 PM, Alexis Tantet wrote: >>>> I'm not sure I got your last point. I have the following situation in mind: >>>> >>>> Start to construct a transition matrix in triplet format, adding one >>>> element after another. >>>> In this particular example, each element is one count of a transition >>>> from (state, box, etc.) i to j, >>>> so I add elements (i, j, 1) to the triplet object, with possibly duplicates. >>>> What happen to these duplicates in the binary tree? >>>> >>>> Eventually, when I compress to CRS or CCS, I would like the duplicates >>>> to be summed up, so that element (i, j) counts transitions from i to j >>>> (and no duplicates exist after compression). >>>> >>>> Is this more clear? >>>> >>>> On Sun, Feb 7, 2016 at 9:14 PM, Patrick Alken <alken@colorado.edu> wrote: >>>>> Hi Alexis, >>>>> >>>>>>> I'm not sure what you mean. I've added a new function gsl_spmatrix_ptr >>>>>>> to the git, which as far as I can tell does exactly what your >>>>>>> sum_duplicate flag does. It searches the matrix for an (i,j) element, >>>>>>> and if found returns a pointer. If not found a null pointer is returned. >>>>>>> This makes it easy for the user to modify A(i,j) after it has been added >>>>>>> to the matrix. Are you thinking of something else? Can you point me to >>>>>>> the Eigen routine? >>>>>>> >>>>>> What I meant is to have the equivalent of gsl_spmatrix_compress, >>>>>> with the difference that gsl_spmatrix_ptr is used instead of gsl_spmatrix_set, >>>>>> so has to build the compressed matrix from triplets, summing the >>>>>> duplicates, instead of replacing them. >>>>>> This is what is done here : >>>>>> The http://eigen.tuxfamily.org/dox/classEigen_1_1SparseMatrix.html#a5bcf3187e372ff7cea1e8f61152ae49b >>>>>> >>>>>> Best, >>>>>> Alexis >>>>> I'm not sure why a user would ever need to do this. The whole point of >>>>> the binary tree structure in the triplet storage is to efficiently find >>>>> duplicate entries, so that if a user tries to call gsl_spmatrix_set on >>>>> an element which is already been previously set, it can find that >>>>> element with a binary search (rather than linearly searching the arrays) >>>>> and change the value of that element. >>>>> >>>>> Therefore, the way the triplet storage is designed, there is will never >>>>> be a duplicate element in the triplet arrays. All of the (i[n],j[n]) >>>>> will be unique for each n <= nz. >>>>> >>>>> Am I missing something? >>>>> >>>>> Patrick >>>> >> >> >> -- >> Alexis Tantet > > ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Sparse matrix extension 2016-02-10 15:56 ` Patrick Alken @ 2016-02-12 10:43 ` Alexis Tantet 2016-02-13 19:42 ` Patrick Alken 0 siblings, 1 reply; 25+ messages in thread From: Alexis Tantet @ 2016-02-12 10:43 UTC (permalink / raw) To: Patrick Alken; +Cc: gsl-discuss I corrected _fscanf. There was an error when reading the comment header. On Wed, Feb 10, 2016 at 4:55 PM, Patrick Alken <alken@colorado.edu> wrote: > I'm in favor of simplicity and easy-parsing, so matrix market sounds good to > me. I'll take a look at your latest code in the next few days. > > Patrick > > > On 02/10/2016 06:16 AM, Alexis Tantet wrote: >> >> Hi Patrick, >> >> Regarding the file format for sparse matrices, the one I have coded >> actually happen to be the coordinate format implemented by Matrix >> Market (the platform to share test data such as sparse matrices), with >> the addition of a matrix type header: >> http://math.nist.gov/MatrixMarket/formats.html >> >> It is also written that "Harwell-Boeing" is the most commonly used >> sparse matrix format, but that: >> "Unfortunately the HB specification is somewhat complex difficult to >> parse from languages other than Fortran, biased in favor of compressed >> column representation and not easily extensible. Some of these factors >> may have impeded the more widespread use of the HB format" >> >> It seems to me that complying to the Matrix Market coordinate format >> would be the right choice, in terms of ease of implementation, >> compliance to other packages and user-friendliness. I could update the >> print/scan functions accordingly (mostly handling the header). What do >> you think? >> >> Best, >> Alexis >> >> >> >> >> On Mon, Feb 8, 2016 at 1:59 AM, Alexis Tantet <alexis.tantet@gmail.com> >> wrote: >>> >>> Ok, my mistake, now I see where I got confused. >>> I had in mind to add all the elements first to the triplets and only >>> while converting to compressed sum up the duplicates. >>> While, indeed, if there's a way you can sum up the duplicates directly >>> while adding them to the triplet matrix (thanks to _ptr), this is more >>> handy and efficient. >>> >>> Thanks for the clarification, >>> Alexis >>> >>> On Sun, Feb 7, 2016 at 10:34 PM, Patrick Alken <alken@colorado.edu> >>> wrote: >>>> >>>> By design, gsl_spmatrix_set won't allow you to do this. >>>> >>>> If you add element (i, j, x) and then later to try add element (i, j, >>>> y), gsl_spmatrix_set will detect that there exists an element in the (i, >>>> j) spot and it will simply change x to y - the value of x will be >>>> overwritten by y. This is the same behavior as gsl_matrix_set. >>>> >>>> So no duplicates are allowed by design. If you have such an application >>>> where you want to keep track of duplicates, you could do the following: >>>> >>>> double *ptr = gsl_spmatrix_ptr(m, i, j); >>>> if (ptr) >>>> *ptr += x; /* sum duplicate values */ >>>> else >>>> gsl_spmatrix_set(m, i, j, x); /* initalize to x */ >>>> >>>> On 02/07/2016 01:31 PM, Alexis Tantet wrote: >>>>> >>>>> I'm not sure I got your last point. I have the following situation in >>>>> mind: >>>>> >>>>> Start to construct a transition matrix in triplet format, adding one >>>>> element after another. >>>>> In this particular example, each element is one count of a transition >>>>> from (state, box, etc.) i to j, >>>>> so I add elements (i, j, 1) to the triplet object, with possibly >>>>> duplicates. >>>>> What happen to these duplicates in the binary tree? >>>>> >>>>> Eventually, when I compress to CRS or CCS, I would like the duplicates >>>>> to be summed up, so that element (i, j) counts transitions from i to j >>>>> (and no duplicates exist after compression). >>>>> >>>>> Is this more clear? >>>>> >>>>> On Sun, Feb 7, 2016 at 9:14 PM, Patrick Alken <alken@colorado.edu> >>>>> wrote: >>>>>> >>>>>> Hi Alexis, >>>>>> >>>>>>>> I'm not sure what you mean. I've added a new function >>>>>>>> gsl_spmatrix_ptr >>>>>>>> to the git, which as far as I can tell does exactly what your >>>>>>>> sum_duplicate flag does. It searches the matrix for an (i,j) >>>>>>>> element, >>>>>>>> and if found returns a pointer. If not found a null pointer is >>>>>>>> returned. >>>>>>>> This makes it easy for the user to modify A(i,j) after it has been >>>>>>>> added >>>>>>>> to the matrix. Are you thinking of something else? Can you point me >>>>>>>> to >>>>>>>> the Eigen routine? >>>>>>>> >>>>>>> What I meant is to have the equivalent of gsl_spmatrix_compress, >>>>>>> with the difference that gsl_spmatrix_ptr is used instead of >>>>>>> gsl_spmatrix_set, >>>>>>> so has to build the compressed matrix from triplets, summing the >>>>>>> duplicates, instead of replacing them. >>>>>>> This is what is done here : >>>>>>> The >>>>>>> http://eigen.tuxfamily.org/dox/classEigen_1_1SparseMatrix.html#a5bcf3187e372ff7cea1e8f61152ae49b >>>>>>> >>>>>>> Best, >>>>>>> Alexis >>>>>> >>>>>> I'm not sure why a user would ever need to do this. The whole point of >>>>>> the binary tree structure in the triplet storage is to efficiently >>>>>> find >>>>>> duplicate entries, so that if a user tries to call gsl_spmatrix_set on >>>>>> an element which is already been previously set, it can find that >>>>>> element with a binary search (rather than linearly searching the >>>>>> arrays) >>>>>> and change the value of that element. >>>>>> >>>>>> Therefore, the way the triplet storage is designed, there is will >>>>>> never >>>>>> be a duplicate element in the triplet arrays. All of the (i[n],j[n]) >>>>>> will be unique for each n <= nz. >>>>>> >>>>>> Am I missing something? >>>>>> >>>>>> Patrick >>>>> >>>>> >>> >>> >>> -- >>> Alexis Tantet >> >> >> > -- Alexis Tantet ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Sparse matrix extension 2016-02-12 10:43 ` Alexis Tantet @ 2016-02-13 19:42 ` Patrick Alken 2016-02-14 11:06 ` Alexis Tantet 0 siblings, 1 reply; 25+ messages in thread From: Patrick Alken @ 2016-02-13 19:42 UTC (permalink / raw) To: gsl-discuss Ok I've added the MatrixMarket stuff to fprintf/fscanf. I also merged all the new sparse matrix stuff into the master branch on git. I've added almost your whole patch, except for all the stuff in spmanip.c and also the gt_elements() routines in spprop.c. I'm not sure these routines belong in a general purpose matrix library....although I can possibly see a need for the row/col multiplication stuff, since we can't easily make vector views of rows/columns of sparse matrices.. I'll need to think some more about this. Also I haven't yet modified dgemm, because I'm hesitant to use the dynamic allocation method in your patch. Also I found a sparse blas document on netlib which tries to outline a standard and I realize our sparse blas routines don't adhere to this standard...so I may try to redesign everything to follow that document. Anyway thanks a lot for all your work on this. Patrick On 02/12/2016 03:42 AM, Alexis Tantet wrote: > I corrected _fscanf. There was an error when reading the comment header. > > On Wed, Feb 10, 2016 at 4:55 PM, Patrick Alken <alken@colorado.edu> wrote: >> I'm in favor of simplicity and easy-parsing, so matrix market sounds good to >> me. I'll take a look at your latest code in the next few days. >> >> Patrick >> >> >> On 02/10/2016 06:16 AM, Alexis Tantet wrote: >>> Hi Patrick, >>> >>> Regarding the file format for sparse matrices, the one I have coded >>> actually happen to be the coordinate format implemented by Matrix >>> Market (the platform to share test data such as sparse matrices), with >>> the addition of a matrix type header: >>> http://math.nist.gov/MatrixMarket/formats.html >>> >>> It is also written that "Harwell-Boeing" is the most commonly used >>> sparse matrix format, but that: >>> "Unfortunately the HB specification is somewhat complex difficult to >>> parse from languages other than Fortran, biased in favor of compressed >>> column representation and not easily extensible. Some of these factors >>> may have impeded the more widespread use of the HB format" >>> >>> It seems to me that complying to the Matrix Market coordinate format >>> would be the right choice, in terms of ease of implementation, >>> compliance to other packages and user-friendliness. I could update the >>> print/scan functions accordingly (mostly handling the header). What do >>> you think? >>> >>> Best, >>> Alexis >>> >>> >>> >>> >>> On Mon, Feb 8, 2016 at 1:59 AM, Alexis Tantet <alexis.tantet@gmail.com> >>> wrote: >>>> Ok, my mistake, now I see where I got confused. >>>> I had in mind to add all the elements first to the triplets and only >>>> while converting to compressed sum up the duplicates. >>>> While, indeed, if there's a way you can sum up the duplicates directly >>>> while adding them to the triplet matrix (thanks to _ptr), this is more >>>> handy and efficient. >>>> >>>> Thanks for the clarification, >>>> Alexis >>>> >>>> On Sun, Feb 7, 2016 at 10:34 PM, Patrick Alken <alken@colorado.edu> >>>> wrote: >>>>> By design, gsl_spmatrix_set won't allow you to do this. >>>>> >>>>> If you add element (i, j, x) and then later to try add element (i, j, >>>>> y), gsl_spmatrix_set will detect that there exists an element in the (i, >>>>> j) spot and it will simply change x to y - the value of x will be >>>>> overwritten by y. This is the same behavior as gsl_matrix_set. >>>>> >>>>> So no duplicates are allowed by design. If you have such an application >>>>> where you want to keep track of duplicates, you could do the following: >>>>> >>>>> double *ptr = gsl_spmatrix_ptr(m, i, j); >>>>> if (ptr) >>>>> *ptr += x; /* sum duplicate values */ >>>>> else >>>>> gsl_spmatrix_set(m, i, j, x); /* initalize to x */ >>>>> >>>>> On 02/07/2016 01:31 PM, Alexis Tantet wrote: >>>>>> I'm not sure I got your last point. I have the following situation in >>>>>> mind: >>>>>> >>>>>> Start to construct a transition matrix in triplet format, adding one >>>>>> element after another. >>>>>> In this particular example, each element is one count of a transition >>>>>> from (state, box, etc.) i to j, >>>>>> so I add elements (i, j, 1) to the triplet object, with possibly >>>>>> duplicates. >>>>>> What happen to these duplicates in the binary tree? >>>>>> >>>>>> Eventually, when I compress to CRS or CCS, I would like the duplicates >>>>>> to be summed up, so that element (i, j) counts transitions from i to j >>>>>> (and no duplicates exist after compression). >>>>>> >>>>>> Is this more clear? >>>>>> >>>>>> On Sun, Feb 7, 2016 at 9:14 PM, Patrick Alken <alken@colorado.edu> >>>>>> wrote: >>>>>>> Hi Alexis, >>>>>>> >>>>>>>>> I'm not sure what you mean. I've added a new function >>>>>>>>> gsl_spmatrix_ptr >>>>>>>>> to the git, which as far as I can tell does exactly what your >>>>>>>>> sum_duplicate flag does. It searches the matrix for an (i,j) >>>>>>>>> element, >>>>>>>>> and if found returns a pointer. If not found a null pointer is >>>>>>>>> returned. >>>>>>>>> This makes it easy for the user to modify A(i,j) after it has been >>>>>>>>> added >>>>>>>>> to the matrix. Are you thinking of something else? Can you point me >>>>>>>>> to >>>>>>>>> the Eigen routine? >>>>>>>>> >>>>>>>> What I meant is to have the equivalent of gsl_spmatrix_compress, >>>>>>>> with the difference that gsl_spmatrix_ptr is used instead of >>>>>>>> gsl_spmatrix_set, >>>>>>>> so has to build the compressed matrix from triplets, summing the >>>>>>>> duplicates, instead of replacing them. >>>>>>>> This is what is done here : >>>>>>>> The >>>>>>>> http://eigen.tuxfamily.org/dox/classEigen_1_1SparseMatrix.html#a5bcf3187e372ff7cea1e8f61152ae49b >>>>>>>> >>>>>>>> Best, >>>>>>>> Alexis >>>>>>> I'm not sure why a user would ever need to do this. The whole point of >>>>>>> the binary tree structure in the triplet storage is to efficiently >>>>>>> find >>>>>>> duplicate entries, so that if a user tries to call gsl_spmatrix_set on >>>>>>> an element which is already been previously set, it can find that >>>>>>> element with a binary search (rather than linearly searching the >>>>>>> arrays) >>>>>>> and change the value of that element. >>>>>>> >>>>>>> Therefore, the way the triplet storage is designed, there is will >>>>>>> never >>>>>>> be a duplicate element in the triplet arrays. All of the (i[n],j[n]) >>>>>>> will be unique for each n <= nz. >>>>>>> >>>>>>> Am I missing something? >>>>>>> >>>>>>> Patrick >>>>>> >>>> >>>> -- >>>> Alexis Tantet >>> >>> > > ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Sparse matrix extension 2016-02-13 19:42 ` Patrick Alken @ 2016-02-14 11:06 ` Alexis Tantet 2016-02-14 18:11 ` Patrick Alken 0 siblings, 1 reply; 25+ messages in thread From: Alexis Tantet @ 2016-02-14 11:06 UTC (permalink / raw) To: Patrick Alken; +Cc: gsl-discuss Great! I've just figured that the printf/scanf need a modification. I had the indices starting from 0 as in C, while the MatrixMarket standard starts from 1. It is just a matter of adding one to the indices in printf and removing one for scanf. Could you take care of it? On Sat, Feb 13, 2016 at 8:41 PM, Patrick Alken <alken@colorado.edu> wrote: > Ok I've added the MatrixMarket stuff to fprintf/fscanf. I also merged > all the new sparse matrix stuff into the master branch on git. > > I've added almost your whole patch, except for all the stuff in > spmanip.c and also the gt_elements() routines in spprop.c. I'm not sure > these routines belong in a general purpose matrix library....although I > can possibly see a need for the row/col multiplication stuff, since we > can't easily make vector views of rows/columns of sparse matrices.. > > I'll need to think some more about this. > > Also I haven't yet modified dgemm, because I'm hesitant to use the > dynamic allocation method in your patch. Also I found a sparse blas > document on netlib which tries to outline a standard and I realize our > sparse blas routines don't adhere to this standard...so I may try to > redesign everything to follow that document. > > Anyway thanks a lot for all your work on this. > > Patrick > > On 02/12/2016 03:42 AM, Alexis Tantet wrote: >> I corrected _fscanf. There was an error when reading the comment header. >> >> On Wed, Feb 10, 2016 at 4:55 PM, Patrick Alken <alken@colorado.edu> wrote: >>> I'm in favor of simplicity and easy-parsing, so matrix market sounds good to >>> me. I'll take a look at your latest code in the next few days. >>> >>> Patrick >>> >>> >>> On 02/10/2016 06:16 AM, Alexis Tantet wrote: >>>> Hi Patrick, >>>> >>>> Regarding the file format for sparse matrices, the one I have coded >>>> actually happen to be the coordinate format implemented by Matrix >>>> Market (the platform to share test data such as sparse matrices), with >>>> the addition of a matrix type header: >>>> http://math.nist.gov/MatrixMarket/formats.html >>>> >>>> It is also written that "Harwell-Boeing" is the most commonly used >>>> sparse matrix format, but that: >>>> "Unfortunately the HB specification is somewhat complex difficult to >>>> parse from languages other than Fortran, biased in favor of compressed >>>> column representation and not easily extensible. Some of these factors >>>> may have impeded the more widespread use of the HB format" >>>> >>>> It seems to me that complying to the Matrix Market coordinate format >>>> would be the right choice, in terms of ease of implementation, >>>> compliance to other packages and user-friendliness. I could update the >>>> print/scan functions accordingly (mostly handling the header). What do >>>> you think? >>>> >>>> Best, >>>> Alexis >>>> >>>> >>>> >>>> >>>> On Mon, Feb 8, 2016 at 1:59 AM, Alexis Tantet <alexis.tantet@gmail.com> >>>> wrote: >>>>> Ok, my mistake, now I see where I got confused. >>>>> I had in mind to add all the elements first to the triplets and only >>>>> while converting to compressed sum up the duplicates. >>>>> While, indeed, if there's a way you can sum up the duplicates directly >>>>> while adding them to the triplet matrix (thanks to _ptr), this is more >>>>> handy and efficient. >>>>> >>>>> Thanks for the clarification, >>>>> Alexis >>>>> >>>>> On Sun, Feb 7, 2016 at 10:34 PM, Patrick Alken <alken@colorado.edu> >>>>> wrote: >>>>>> By design, gsl_spmatrix_set won't allow you to do this. >>>>>> >>>>>> If you add element (i, j, x) and then later to try add element (i, j, >>>>>> y), gsl_spmatrix_set will detect that there exists an element in the (i, >>>>>> j) spot and it will simply change x to y - the value of x will be >>>>>> overwritten by y. This is the same behavior as gsl_matrix_set. >>>>>> >>>>>> So no duplicates are allowed by design. If you have such an application >>>>>> where you want to keep track of duplicates, you could do the following: >>>>>> >>>>>> double *ptr = gsl_spmatrix_ptr(m, i, j); >>>>>> if (ptr) >>>>>> *ptr += x; /* sum duplicate values */ >>>>>> else >>>>>> gsl_spmatrix_set(m, i, j, x); /* initalize to x */ >>>>>> >>>>>> On 02/07/2016 01:31 PM, Alexis Tantet wrote: >>>>>>> I'm not sure I got your last point. I have the following situation in >>>>>>> mind: >>>>>>> >>>>>>> Start to construct a transition matrix in triplet format, adding one >>>>>>> element after another. >>>>>>> In this particular example, each element is one count of a transition >>>>>>> from (state, box, etc.) i to j, >>>>>>> so I add elements (i, j, 1) to the triplet object, with possibly >>>>>>> duplicates. >>>>>>> What happen to these duplicates in the binary tree? >>>>>>> >>>>>>> Eventually, when I compress to CRS or CCS, I would like the duplicates >>>>>>> to be summed up, so that element (i, j) counts transitions from i to j >>>>>>> (and no duplicates exist after compression). >>>>>>> >>>>>>> Is this more clear? >>>>>>> >>>>>>> On Sun, Feb 7, 2016 at 9:14 PM, Patrick Alken <alken@colorado.edu> >>>>>>> wrote: >>>>>>>> Hi Alexis, >>>>>>>> >>>>>>>>>> I'm not sure what you mean. I've added a new function >>>>>>>>>> gsl_spmatrix_ptr >>>>>>>>>> to the git, which as far as I can tell does exactly what your >>>>>>>>>> sum_duplicate flag does. It searches the matrix for an (i,j) >>>>>>>>>> element, >>>>>>>>>> and if found returns a pointer. If not found a null pointer is >>>>>>>>>> returned. >>>>>>>>>> This makes it easy for the user to modify A(i,j) after it has been >>>>>>>>>> added >>>>>>>>>> to the matrix. Are you thinking of something else? Can you point me >>>>>>>>>> to >>>>>>>>>> the Eigen routine? >>>>>>>>>> >>>>>>>>> What I meant is to have the equivalent of gsl_spmatrix_compress, >>>>>>>>> with the difference that gsl_spmatrix_ptr is used instead of >>>>>>>>> gsl_spmatrix_set, >>>>>>>>> so has to build the compressed matrix from triplets, summing the >>>>>>>>> duplicates, instead of replacing them. >>>>>>>>> This is what is done here : >>>>>>>>> The >>>>>>>>> http://eigen.tuxfamily.org/dox/classEigen_1_1SparseMatrix.html#a5bcf3187e372ff7cea1e8f61152ae49b >>>>>>>>> >>>>>>>>> Best, >>>>>>>>> Alexis >>>>>>>> I'm not sure why a user would ever need to do this. The whole point of >>>>>>>> the binary tree structure in the triplet storage is to efficiently >>>>>>>> find >>>>>>>> duplicate entries, so that if a user tries to call gsl_spmatrix_set on >>>>>>>> an element which is already been previously set, it can find that >>>>>>>> element with a binary search (rather than linearly searching the >>>>>>>> arrays) >>>>>>>> and change the value of that element. >>>>>>>> >>>>>>>> Therefore, the way the triplet storage is designed, there is will >>>>>>>> never >>>>>>>> be a duplicate element in the triplet arrays. All of the (i[n],j[n]) >>>>>>>> will be unique for each n <= nz. >>>>>>>> >>>>>>>> Am I missing something? >>>>>>>> >>>>>>>> Patrick >>>>>>> >>>>> >>>>> -- >>>>> Alexis Tantet >>>> >>>> >> >> > -- Alexis Tantet ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Sparse matrix extension 2016-02-14 11:06 ` Alexis Tantet @ 2016-02-14 18:11 ` Patrick Alken 2016-02-14 18:25 ` Brian Gladman 0 siblings, 1 reply; 25+ messages in thread From: Patrick Alken @ 2016-02-14 18:11 UTC (permalink / raw) To: Alexis Tantet; +Cc: gsl-discuss Ok fixed On 02/14/2016 04:06 AM, Alexis Tantet wrote: > Great! > > I've just figured that the printf/scanf need a modification. I had the > indices starting from 0 as in C, while the MatrixMarket standard > starts from 1. It is just a matter of adding one to the indices in > printf and removing one for scanf. Could you take care of it? > > On Sat, Feb 13, 2016 at 8:41 PM, Patrick Alken <alken@colorado.edu> wrote: >> Ok I've added the MatrixMarket stuff to fprintf/fscanf. I also merged >> all the new sparse matrix stuff into the master branch on git. >> >> I've added almost your whole patch, except for all the stuff in >> spmanip.c and also the gt_elements() routines in spprop.c. I'm not sure >> these routines belong in a general purpose matrix library....although I >> can possibly see a need for the row/col multiplication stuff, since we >> can't easily make vector views of rows/columns of sparse matrices.. >> >> I'll need to think some more about this. >> >> Also I haven't yet modified dgemm, because I'm hesitant to use the >> dynamic allocation method in your patch. Also I found a sparse blas >> document on netlib which tries to outline a standard and I realize our >> sparse blas routines don't adhere to this standard...so I may try to >> redesign everything to follow that document. >> >> Anyway thanks a lot for all your work on this. >> >> Patrick >> >> On 02/12/2016 03:42 AM, Alexis Tantet wrote: >>> I corrected _fscanf. There was an error when reading the comment header. >>> >>> On Wed, Feb 10, 2016 at 4:55 PM, Patrick Alken <alken@colorado.edu> wrote: >>>> I'm in favor of simplicity and easy-parsing, so matrix market sounds good to >>>> me. I'll take a look at your latest code in the next few days. >>>> >>>> Patrick >>>> >>>> >>>> On 02/10/2016 06:16 AM, Alexis Tantet wrote: >>>>> Hi Patrick, >>>>> >>>>> Regarding the file format for sparse matrices, the one I have coded >>>>> actually happen to be the coordinate format implemented by Matrix >>>>> Market (the platform to share test data such as sparse matrices), with >>>>> the addition of a matrix type header: >>>>> http://math.nist.gov/MatrixMarket/formats.html >>>>> >>>>> It is also written that "Harwell-Boeing" is the most commonly used >>>>> sparse matrix format, but that: >>>>> "Unfortunately the HB specification is somewhat complex difficult to >>>>> parse from languages other than Fortran, biased in favor of compressed >>>>> column representation and not easily extensible. Some of these factors >>>>> may have impeded the more widespread use of the HB format" >>>>> >>>>> It seems to me that complying to the Matrix Market coordinate format >>>>> would be the right choice, in terms of ease of implementation, >>>>> compliance to other packages and user-friendliness. I could update the >>>>> print/scan functions accordingly (mostly handling the header). What do >>>>> you think? >>>>> >>>>> Best, >>>>> Alexis >>>>> >>>>> >>>>> >>>>> >>>>> On Mon, Feb 8, 2016 at 1:59 AM, Alexis Tantet <alexis.tantet@gmail.com> >>>>> wrote: >>>>>> Ok, my mistake, now I see where I got confused. >>>>>> I had in mind to add all the elements first to the triplets and only >>>>>> while converting to compressed sum up the duplicates. >>>>>> While, indeed, if there's a way you can sum up the duplicates directly >>>>>> while adding them to the triplet matrix (thanks to _ptr), this is more >>>>>> handy and efficient. >>>>>> >>>>>> Thanks for the clarification, >>>>>> Alexis >>>>>> >>>>>> On Sun, Feb 7, 2016 at 10:34 PM, Patrick Alken <alken@colorado.edu> >>>>>> wrote: >>>>>>> By design, gsl_spmatrix_set won't allow you to do this. >>>>>>> >>>>>>> If you add element (i, j, x) and then later to try add element (i, j, >>>>>>> y), gsl_spmatrix_set will detect that there exists an element in the (i, >>>>>>> j) spot and it will simply change x to y - the value of x will be >>>>>>> overwritten by y. This is the same behavior as gsl_matrix_set. >>>>>>> >>>>>>> So no duplicates are allowed by design. If you have such an application >>>>>>> where you want to keep track of duplicates, you could do the following: >>>>>>> >>>>>>> double *ptr = gsl_spmatrix_ptr(m, i, j); >>>>>>> if (ptr) >>>>>>> *ptr += x; /* sum duplicate values */ >>>>>>> else >>>>>>> gsl_spmatrix_set(m, i, j, x); /* initalize to x */ >>>>>>> >>>>>>> On 02/07/2016 01:31 PM, Alexis Tantet wrote: >>>>>>>> I'm not sure I got your last point. I have the following situation in >>>>>>>> mind: >>>>>>>> >>>>>>>> Start to construct a transition matrix in triplet format, adding one >>>>>>>> element after another. >>>>>>>> In this particular example, each element is one count of a transition >>>>>>>> from (state, box, etc.) i to j, >>>>>>>> so I add elements (i, j, 1) to the triplet object, with possibly >>>>>>>> duplicates. >>>>>>>> What happen to these duplicates in the binary tree? >>>>>>>> >>>>>>>> Eventually, when I compress to CRS or CCS, I would like the duplicates >>>>>>>> to be summed up, so that element (i, j) counts transitions from i to j >>>>>>>> (and no duplicates exist after compression). >>>>>>>> >>>>>>>> Is this more clear? >>>>>>>> >>>>>>>> On Sun, Feb 7, 2016 at 9:14 PM, Patrick Alken <alken@colorado.edu> >>>>>>>> wrote: >>>>>>>>> Hi Alexis, >>>>>>>>> >>>>>>>>>>> I'm not sure what you mean. I've added a new function >>>>>>>>>>> gsl_spmatrix_ptr >>>>>>>>>>> to the git, which as far as I can tell does exactly what your >>>>>>>>>>> sum_duplicate flag does. It searches the matrix for an (i,j) >>>>>>>>>>> element, >>>>>>>>>>> and if found returns a pointer. If not found a null pointer is >>>>>>>>>>> returned. >>>>>>>>>>> This makes it easy for the user to modify A(i,j) after it has been >>>>>>>>>>> added >>>>>>>>>>> to the matrix. Are you thinking of something else? Can you point me >>>>>>>>>>> to >>>>>>>>>>> the Eigen routine? >>>>>>>>>>> >>>>>>>>>> What I meant is to have the equivalent of gsl_spmatrix_compress, >>>>>>>>>> with the difference that gsl_spmatrix_ptr is used instead of >>>>>>>>>> gsl_spmatrix_set, >>>>>>>>>> so has to build the compressed matrix from triplets, summing the >>>>>>>>>> duplicates, instead of replacing them. >>>>>>>>>> This is what is done here : >>>>>>>>>> The >>>>>>>>>> http://eigen.tuxfamily.org/dox/classEigen_1_1SparseMatrix.html#a5bcf3187e372ff7cea1e8f61152ae49b >>>>>>>>>> >>>>>>>>>> Best, >>>>>>>>>> Alexis >>>>>>>>> I'm not sure why a user would ever need to do this. The whole point of >>>>>>>>> the binary tree structure in the triplet storage is to efficiently >>>>>>>>> find >>>>>>>>> duplicate entries, so that if a user tries to call gsl_spmatrix_set on >>>>>>>>> an element which is already been previously set, it can find that >>>>>>>>> element with a binary search (rather than linearly searching the >>>>>>>>> arrays) >>>>>>>>> and change the value of that element. >>>>>>>>> >>>>>>>>> Therefore, the way the triplet storage is designed, there is will >>>>>>>>> never >>>>>>>>> be a duplicate element in the triplet arrays. All of the (i[n],j[n]) >>>>>>>>> will be unique for each n <= nz. >>>>>>>>> >>>>>>>>> Am I missing something? >>>>>>>>> >>>>>>>>> Patrick >>>>>> -- >>>>>> Alexis Tantet >>>>> >>> > > ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Sparse matrix extension 2016-02-14 18:11 ` Patrick Alken @ 2016-02-14 18:25 ` Brian Gladman 2016-02-15 5:10 ` Patrick Alken 0 siblings, 1 reply; 25+ messages in thread From: Brian Gladman @ 2016-02-14 18:25 UTC (permalink / raw) To: gsl-discuss On 14/02/2016 18:10, Patrick Alken wrote: > Ok fixed I am seeing the following failure with the latest repository version of GSL using Visual Studio 2015: 47> gsl: ..\..\spmatrix\spio.c:325: ERROR: fread failed on row indices 47> Default GSL error handler invoked. Brian ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Sparse matrix extension 2016-02-14 18:25 ` Brian Gladman @ 2016-02-15 5:10 ` Patrick Alken 2016-02-15 11:09 ` Brian Gladman 0 siblings, 1 reply; 25+ messages in thread From: Patrick Alken @ 2016-02-15 5:10 UTC (permalink / raw) To: gsl-discuss I think the problem should be fixed now, can you test the latest git and let me know? Thanks, Patrick On 02/14/2016 11:25 AM, Brian Gladman wrote: > On 14/02/2016 18:10, Patrick Alken wrote: >> Ok fixed > I am seeing the following failure with the latest repository version of > GSL using Visual Studio 2015: > > 47> gsl: ..\..\spmatrix\spio.c:325: ERROR: fread failed on row indices > 47> Default GSL error handler invoked. > > Brian > ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Sparse matrix extension 2016-02-15 5:10 ` Patrick Alken @ 2016-02-15 11:09 ` Brian Gladman [not found] ` <CAMWWPT0J9ENRZjJHLO=cxot4DGdSLer+n2HkBVnhFnO0oiVV8g@mail.gmail.com> 0 siblings, 1 reply; 25+ messages in thread From: Brian Gladman @ 2016-02-15 11:09 UTC (permalink / raw) To: gsl-discuss On 15/02/2016 05:10, Patrick Alken wrote: > I think the problem should be fixed now, can you test the latest git and > let me know? Thanks Patrick, Yes, all tests now pass. Brian ^ permalink raw reply [flat|nested] 25+ messages in thread
[parent not found: <CAMWWPT0J9ENRZjJHLO=cxot4DGdSLer+n2HkBVnhFnO0oiVV8g@mail.gmail.com>]
* Re: Sparse matrix extension [not found] ` <CAMWWPT0J9ENRZjJHLO=cxot4DGdSLer+n2HkBVnhFnO0oiVV8g@mail.gmail.com> @ 2016-02-15 13:56 ` Alexis Tantet 2016-02-15 17:17 ` Patrick Alken 0 siblings, 1 reply; 25+ messages in thread From: Alexis Tantet @ 2016-02-15 13:56 UTC (permalink / raw) To: Brian Gladman; +Cc: gsl-discuss Thanks Patrick. I've had a quick check at your modifications and all the tests pass. Some very minor details: - I find _transpose2 an inexpressive name. What about something like _transpose_shallow or _transpose_swap. - Regarding _set, wouldn't it be more secure to raise an error when the matrix dimensions have to be increased? Do you know of applications in which the matrix dimensions are not known beforehand? All the best, Alexis > On Mon, Feb 15, 2016 at 12:09 PM, Brian Gladman <brg@gladman.plus.com> wrote: >> >> On 15/02/2016 05:10, Patrick Alken wrote: >> > I think the problem should be fixed now, can you test the latest git and >> > let me know? >> >> Thanks Patrick, >> >> Yes, all tests now pass. >> >> Brian >> > > > > -- > Alexis Tantet ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Sparse matrix extension 2016-02-15 13:56 ` Alexis Tantet @ 2016-02-15 17:17 ` Patrick Alken 0 siblings, 0 replies; 25+ messages in thread From: Patrick Alken @ 2016-02-15 17:17 UTC (permalink / raw) To: gsl-discuss On 02/15/2016 06:56 AM, Alexis Tantet wrote: > Thanks Patrick. > I've had a quick check at your modifications and all the tests pass. > > Some very minor details: > - I find _transpose2 an inexpressive name. What about something like > _transpose_shallow or _transpose_swap. Well, many routines the linear algebra section append a "2" when the routine does the same thing but using a different method/algorithm, so I was trying to be consistent with the naming, and I tried to document this routine thoroughly in the manual. I will think some more about whether it would make sense to make a longer descriptive name. > - Regarding _set, wouldn't it be more secure to raise an error when > the matrix dimensions have to be increased? Do you know of > applications in which the matrix dimensions are not known beforehand? This was originally done since the code was based on the CSparse library, which did the same thing. You're probably right it makes sense to do more strict checking...although this code has been through a few releases so people's programs might break if we change the behavior now. > > All the best, > Alexis > >> On Mon, Feb 15, 2016 at 12:09 PM, Brian Gladman <brg@gladman.plus.com> wrote: >>> On 15/02/2016 05:10, Patrick Alken wrote: >>>> I think the problem should be fixed now, can you test the latest git and >>>> let me know? >>> Thanks Patrick, >>> >>> Yes, all tests now pass. >>> >>> Brian >>> >> >> >> -- >> Alexis Tantet ^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2016-02-15 17:17 UTC | newest] Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <CAMWWPT3uJj4Vrn7ut6+F18gY===zd6+1r1UJhz0hcCj--zwtdg@mail.gmail.com> 2016-01-19 16:43 ` Sparse matrix extension Alexis Tantet 2016-01-19 17:02 ` Patrick Alken 2016-01-19 19:55 ` Alexis Tantet 2016-01-19 20:50 ` Patrick Alken 2016-01-20 18:31 ` Alexis Tantet 2016-02-07 0:03 ` Patrick Alken 2016-02-07 1:03 ` Alexis Tantet 2016-02-07 17:25 ` Patrick Alken 2016-02-07 19:32 ` Alexis Tantet 2016-02-07 20:14 ` Patrick Alken 2016-02-07 20:31 ` Alexis Tantet 2016-02-07 21:34 ` Patrick Alken 2016-02-08 0:59 ` Alexis Tantet 2016-02-10 13:16 ` Alexis Tantet 2016-02-10 13:48 ` Alexis Tantet 2016-02-10 15:56 ` Patrick Alken 2016-02-12 10:43 ` Alexis Tantet 2016-02-13 19:42 ` Patrick Alken 2016-02-14 11:06 ` Alexis Tantet 2016-02-14 18:11 ` Patrick Alken 2016-02-14 18:25 ` Brian Gladman 2016-02-15 5:10 ` Patrick Alken 2016-02-15 11:09 ` Brian Gladman [not found] ` <CAMWWPT0J9ENRZjJHLO=cxot4DGdSLer+n2HkBVnhFnO0oiVV8g@mail.gmail.com> 2016-02-15 13:56 ` Alexis Tantet 2016-02-15 17:17 ` Patrick Alken
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).