*Recursive linear algebra algorithms[not found] <5b929a7c-30d6-1e20-9ff8-eaa6d54b12f4@colorado.edu>@ 2019-05-30 15:06 ` Patrick Alken2019-06-08 20:45 ` Patrick Alken 0 siblings, 1 reply; 3+ messages in thread From: Patrick Alken @ 2019-05-30 15:06 UTC (permalink / raw) To: gsl-discuss Hi all, I have recently learned of a project called ReLAPACK (https://github.com/HPAC/ReLAPACK, paper here: https://arxiv.org/abs/1602.06763) which implements a number of LAPACK algorithms (such as LU, Cholesky, Sylvester equations) using recursive methods which can use Level 3 BLAS calls. The paper shows that most of these algorithms out-perform the block Level 3 algorithms in LAPACK. The main advantage is that LAPACK block algorithms require fixing the block size ahead of time, which may not be optimal for a given architecture, while the recursive methods don't require a block size parameter. The recursive methods do however require a "base case" - i.e. at what size matrix should it switch to the Level 2 BLAS based algorithms. ReLAPACK fixes this currently at N=24. Anyway, the recursive Cholesky variant is quite straightforward to implement, and I have already coded it for GSL (both the decomposition and inversion). I did some tests for N=10,000 with ATLAS BLAS and found that it runs faster than DPOTRF from LAPACK. This fast Cholesky decomposition will improve the performance also for the generalized symmetric definite eigensolvers, and the least squares modules (linear and nonlinear). So GSL now has a competitive Cholesky solver, which I think should make many GSL users happy :) Work is currently underway to implement the recursive pivoted LU decomposition in GSL. Unfortunately the ReLAPACK authors state that the QR algorithm is not amenable to recursive methods, so the block QR seems to still be the best choice. It would be nice to implement this for GSL, in case any volunteers are looking for a project ;) Patrick ^ permalink raw reply [flat|nested] 3+ messages in thread

*Re: Recursive linear algebra algorithms2019-05-30 15:06 ` Recursive linear algebra algorithms Patrick Alken@ 2019-06-08 20:45 ` Patrick Alken2019-06-21 22:14 ` Patrick Alken 0 siblings, 1 reply; 3+ messages in thread From: Patrick Alken @ 2019-06-08 20:45 UTC (permalink / raw) To: gsl-discuss The LU decomposition in GSL (both real and complex) is now based on a recursive Level 3 BLAS algorithm. The performance improvement is quite dramatic when using an optimized multi-threaded BLAS library like ATLAS. I'd be interested in hearing feedback from anyone who uses Cholesky/LU factorizations in their work. GSL may out-perform LAPACK in these areas now, and the recursive algorithms are surprisingly simple to implement and fit quite nicely with GSL's codebase. Enjoy, Patrick On 5/30/19 9:06 AM, Patrick Alken wrote: > Hi all, > > > I have recently learned of a project called ReLAPACK > (https://github.com/HPAC/ReLAPACK, paper here: > https://arxiv.org/abs/1602.06763) which implements a number of LAPACK > algorithms (such as LU, Cholesky, Sylvester equations) using recursive > methods which can use Level 3 BLAS calls. The paper shows that most of > these algorithms out-perform the block Level 3 algorithms in LAPACK. The > main advantage is that LAPACK block algorithms require fixing the block > size ahead of time, which may not be optimal for a given architecture, > while the recursive methods don't require a block size parameter. > > The recursive methods do however require a "base case" - i.e. at what > size matrix should it switch to the Level 2 BLAS based algorithms. > ReLAPACK fixes this currently at N=24. > > Anyway, the recursive Cholesky variant is quite straightforward to > implement, and I have already coded it for GSL (both the decomposition > and inversion). I did some tests for N=10,000 with ATLAS BLAS and found > that it runs faster than DPOTRF from LAPACK. This fast Cholesky > decomposition will improve the performance also for the generalized > symmetric definite eigensolvers, and the least squares modules (linear > and nonlinear). > > So GSL now has a competitive Cholesky solver, which I think should make > many GSL users happy :) > > Work is currently underway to implement the recursive pivoted LU > decomposition in GSL. > > Unfortunately the ReLAPACK authors state that the QR algorithm is not > amenable to recursive methods, so the block QR seems to still be the > best choice. It would be nice to implement this for GSL, in case any > volunteers are looking for a project ;) > > Patrick > > ^ permalink raw reply [flat|nested] 3+ messages in thread

*2019-06-08 20:45 ` Patrick AlkenRe: Recursive linear algebra algorithms@ 2019-06-21 22:14 ` Patrick Alken0 siblings, 0 replies; 3+ messages in thread From: Patrick Alken @ 2019-06-21 22:14 UTC (permalink / raw) To: gsl-discuss It seems that the ReLAPACK authors were a bit pessimistic about the possibility of a Level 3 BLAS recursive QR algorithm, since back in 2000, Elmroth and Gustavson published some work on exactly this. Their algorithm tends to work best for "tall skinny" matrices with M >> N, since the number of flops grows as O(N^3). One of the LAPACK authors is currently doing research on this algorithm, and has kindly provided GSL with GPL'd code to implement the Elmroth/Gustavson algorithm, including the decomposition, Q^T b calculation, and calculation of the full Q. All of these operations significantly outperform the current Level 2 implementation for various matrix sizes I have tried. Everything is already on the git. Due to the nature of the new algorithm, I needed to create a new interface, which I have suffixed with _r (i.e. gsl_linalg_QR_decomp_r). So GSL now has "state of the art" algorithms for Cholesky, LU and QR. These are exciting times :) There are other parts of the library which use QR decompositions which will likely benefit from this new algorithm, though I have not yet converted them over. Over the years, there have been discussions about whether we should create GSL interfaces for LAPACK routines. I think these new developments have reduced the need for that, though of course there are many other areas where LAPACK is far superior (i.e. SVD, eigensystems, COD). Enjoy, Patrick On 6/8/19 2:45 PM, Patrick Alken wrote: > The LU decomposition in GSL (both real and complex) is now based on a > recursive Level 3 BLAS algorithm. The performance improvement is quite > dramatic when using an optimized multi-threaded BLAS library like ATLAS. > I'd be interested in hearing feedback from anyone who uses Cholesky/LU > factorizations in their work. GSL may out-perform LAPACK in these areas > now, and the recursive algorithms are surprisingly simple to implement > and fit quite nicely with GSL's codebase. > > Enjoy, > Patrick > > On 5/30/19 9:06 AM, Patrick Alken wrote: >> Hi all, >> >> >> I have recently learned of a project called ReLAPACK >> (https://github.com/HPAC/ReLAPACK, paper here: >> https://arxiv.org/abs/1602.06763) which implements a number of LAPACK >> algorithms (such as LU, Cholesky, Sylvester equations) using recursive >> methods which can use Level 3 BLAS calls. The paper shows that most of >> these algorithms out-perform the block Level 3 algorithms in LAPACK. The >> main advantage is that LAPACK block algorithms require fixing the block >> size ahead of time, which may not be optimal for a given architecture, >> while the recursive methods don't require a block size parameter. >> >> The recursive methods do however require a "base case" - i.e. at what >> size matrix should it switch to the Level 2 BLAS based algorithms. >> ReLAPACK fixes this currently at N=24. >> >> Anyway, the recursive Cholesky variant is quite straightforward to >> implement, and I have already coded it for GSL (both the decomposition >> and inversion). I did some tests for N=10,000 with ATLAS BLAS and found >> that it runs faster than DPOTRF from LAPACK. This fast Cholesky >> decomposition will improve the performance also for the generalized >> symmetric definite eigensolvers, and the least squares modules (linear >> and nonlinear). >> >> So GSL now has a competitive Cholesky solver, which I think should make >> many GSL users happy :) >> >> Work is currently underway to implement the recursive pivoted LU >> decomposition in GSL. >> >> Unfortunately the ReLAPACK authors state that the QR algorithm is not >> amenable to recursive methods, so the block QR seems to still be the >> best choice. It would be nice to implement this for GSL, in case any >> volunteers are looking for a project ;) >> >> Patrick >> >> ^ permalink raw reply [flat|nested] 3+ messages in thread

end of thread, other threads:[~2019-06-21 22:14 UTC | newest]Thread overview:3+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <5b929a7c-30d6-1e20-9ff8-eaa6d54b12f4@colorado.edu> 2019-05-30 15:06 ` Recursive linear algebra algorithms Patrick Alken 2019-06-08 20:45 ` Patrick Alken 2019-06-21 22:14 ` Patrick Alken

This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).