public inbox for gsl-discuss@sourceware.org
 help / color / mirror / Atom feed
* Recursive linear algebra algorithms
       [not found] <5b929a7c-30d6-1e20-9ff8-eaa6d54b12f4@colorado.edu>
@ 2019-05-30 15:06 ` Patrick Alken
  2019-06-08 20:45   ` Patrick Alken
  0 siblings, 1 reply; 3+ messages in thread
From: Patrick Alken @ 2019-05-30 15:06 UTC (permalink / raw)
  To: gsl-discuss

Hi all,


   I have recently learned of a project called ReLAPACK 
(https://github.com/HPAC/ReLAPACK, paper here: 
https://arxiv.org/abs/1602.06763) which implements a number of LAPACK 
algorithms (such as LU, Cholesky, Sylvester equations) using recursive 
methods which can use Level 3 BLAS calls. The paper shows that most of 
these algorithms out-perform the block Level 3 algorithms in LAPACK. The 
main advantage is that LAPACK block algorithms require fixing the block 
size ahead of time, which may not be optimal for a given architecture, 
while the recursive methods don't require a block size parameter.

The recursive methods do however require a "base case" - i.e. at what 
size matrix should it switch to the Level 2 BLAS based algorithms. 
ReLAPACK fixes this currently at N=24.

Anyway, the recursive Cholesky variant is quite straightforward to 
implement, and I have already coded it for GSL (both the decomposition 
and inversion). I did some tests for N=10,000 with ATLAS BLAS and found 
that it runs faster than DPOTRF from LAPACK. This fast Cholesky 
decomposition will improve the performance also for the generalized 
symmetric definite eigensolvers, and the least squares modules (linear 
and nonlinear).

So GSL now has a competitive Cholesky solver, which I think should make 
many GSL users happy :)

Work is currently underway to implement the recursive pivoted LU 
decomposition in GSL.

Unfortunately the ReLAPACK authors state that the QR algorithm is not 
amenable to recursive methods, so the block QR seems to still be the 
best choice. It would be nice to implement this for GSL, in case any 
volunteers are looking for a project ;)

Patrick



^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Recursive linear algebra algorithms
  2019-05-30 15:06 ` Recursive linear algebra algorithms Patrick Alken
@ 2019-06-08 20:45   ` Patrick Alken
  2019-06-21 22:14     ` Patrick Alken
  0 siblings, 1 reply; 3+ messages in thread
From: Patrick Alken @ 2019-06-08 20:45 UTC (permalink / raw)
  To: gsl-discuss

The LU decomposition in GSL (both real and complex) is now based on a
recursive Level 3 BLAS algorithm. The performance improvement is quite
dramatic when using an optimized multi-threaded BLAS library like ATLAS.
I'd be interested in hearing feedback from anyone who uses Cholesky/LU
factorizations in their work. GSL may out-perform LAPACK in these areas
now, and the recursive algorithms are surprisingly simple to implement
and fit quite nicely with GSL's codebase.

Enjoy,
Patrick

On 5/30/19 9:06 AM, Patrick Alken wrote:
> Hi all,
>
>
>    I have recently learned of a project called ReLAPACK 
> (https://github.com/HPAC/ReLAPACK, paper here: 
> https://arxiv.org/abs/1602.06763) which implements a number of LAPACK 
> algorithms (such as LU, Cholesky, Sylvester equations) using recursive 
> methods which can use Level 3 BLAS calls. The paper shows that most of 
> these algorithms out-perform the block Level 3 algorithms in LAPACK. The 
> main advantage is that LAPACK block algorithms require fixing the block 
> size ahead of time, which may not be optimal for a given architecture, 
> while the recursive methods don't require a block size parameter.
>
> The recursive methods do however require a "base case" - i.e. at what 
> size matrix should it switch to the Level 2 BLAS based algorithms. 
> ReLAPACK fixes this currently at N=24.
>
> Anyway, the recursive Cholesky variant is quite straightforward to 
> implement, and I have already coded it for GSL (both the decomposition 
> and inversion). I did some tests for N=10,000 with ATLAS BLAS and found 
> that it runs faster than DPOTRF from LAPACK. This fast Cholesky 
> decomposition will improve the performance also for the generalized 
> symmetric definite eigensolvers, and the least squares modules (linear 
> and nonlinear).
>
> So GSL now has a competitive Cholesky solver, which I think should make 
> many GSL users happy :)
>
> Work is currently underway to implement the recursive pivoted LU 
> decomposition in GSL.
>
> Unfortunately the ReLAPACK authors state that the QR algorithm is not 
> amenable to recursive methods, so the block QR seems to still be the 
> best choice. It would be nice to implement this for GSL, in case any 
> volunteers are looking for a project ;)
>
> Patrick
>
>


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Recursive linear algebra algorithms
  2019-06-08 20:45   ` Patrick Alken
@ 2019-06-21 22:14     ` Patrick Alken
  0 siblings, 0 replies; 3+ messages in thread
From: Patrick Alken @ 2019-06-21 22:14 UTC (permalink / raw)
  To: gsl-discuss

It seems that the ReLAPACK authors were a bit pessimistic about the
possibility of a Level 3 BLAS recursive QR algorithm, since back in
2000, Elmroth and Gustavson published some work on exactly this. Their
algorithm tends to work best for "tall skinny" matrices with M >> N,
since the number of flops grows as O(N^3).

One of the LAPACK authors is currently doing research on this algorithm,
and has kindly provided GSL with GPL'd code to implement the
Elmroth/Gustavson algorithm, including the decomposition, Q^T b
calculation, and calculation of the full Q. All of these operations
significantly outperform the current Level 2 implementation for various
matrix sizes I have tried. Everything is already on the git. Due to the
nature of the new algorithm, I needed to create a new interface, which I
have suffixed with _r (i.e. gsl_linalg_QR_decomp_r).

So GSL now has "state of the art" algorithms for Cholesky, LU and QR.
These are exciting times :)

There are other parts of the library which use QR decompositions which
will likely benefit from this new algorithm, though I have not yet
converted them over.

Over the years, there have been discussions about whether we should
create GSL interfaces for LAPACK routines. I think these new
developments have reduced the need for that, though of course there are
many other areas where LAPACK is far superior (i.e. SVD, eigensystems,
COD).

Enjoy,
Patrick

On 6/8/19 2:45 PM, Patrick Alken wrote:
> The LU decomposition in GSL (both real and complex) is now based on a
> recursive Level 3 BLAS algorithm. The performance improvement is quite
> dramatic when using an optimized multi-threaded BLAS library like ATLAS.
> I'd be interested in hearing feedback from anyone who uses Cholesky/LU
> factorizations in their work. GSL may out-perform LAPACK in these areas
> now, and the recursive algorithms are surprisingly simple to implement
> and fit quite nicely with GSL's codebase.
>
> Enjoy,
> Patrick
>
> On 5/30/19 9:06 AM, Patrick Alken wrote:
>> Hi all,
>>
>>
>>    I have recently learned of a project called ReLAPACK 
>> (https://github.com/HPAC/ReLAPACK, paper here: 
>> https://arxiv.org/abs/1602.06763) which implements a number of LAPACK 
>> algorithms (such as LU, Cholesky, Sylvester equations) using recursive 
>> methods which can use Level 3 BLAS calls. The paper shows that most of 
>> these algorithms out-perform the block Level 3 algorithms in LAPACK. The 
>> main advantage is that LAPACK block algorithms require fixing the block 
>> size ahead of time, which may not be optimal for a given architecture, 
>> while the recursive methods don't require a block size parameter.
>>
>> The recursive methods do however require a "base case" - i.e. at what 
>> size matrix should it switch to the Level 2 BLAS based algorithms. 
>> ReLAPACK fixes this currently at N=24.
>>
>> Anyway, the recursive Cholesky variant is quite straightforward to 
>> implement, and I have already coded it for GSL (both the decomposition 
>> and inversion). I did some tests for N=10,000 with ATLAS BLAS and found 
>> that it runs faster than DPOTRF from LAPACK. This fast Cholesky 
>> decomposition will improve the performance also for the generalized 
>> symmetric definite eigensolvers, and the least squares modules (linear 
>> and nonlinear).
>>
>> So GSL now has a competitive Cholesky solver, which I think should make 
>> many GSL users happy :)
>>
>> Work is currently underway to implement the recursive pivoted LU 
>> decomposition in GSL.
>>
>> Unfortunately the ReLAPACK authors state that the QR algorithm is not 
>> amenable to recursive methods, so the block QR seems to still be the 
>> best choice. It would be nice to implement this for GSL, in case any 
>> volunteers are looking for a project ;)
>>
>> Patrick
>>
>>


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2019-06-21 22:14 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <5b929a7c-30d6-1e20-9ff8-eaa6d54b12f4@colorado.edu>
2019-05-30 15:06 ` Recursive linear algebra algorithms Patrick Alken
2019-06-08 20:45   ` Patrick Alken
2019-06-21 22:14     ` Patrick Alken

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).