public inbox for gsl-discuss@sourceware.org
* Re: sparce matrices
@ 2001-12-19 13:20 Edwin Robert Tisdale
  2001-12-19 13:20 ` Peter Schmitteckert
  0 siblings, 1 reply; 6+ messages in thread
From: Edwin Robert Tisdale @ 2001-12-19 13:20 UTC (permalink / raw)
  To: gsl-discuss

Matthew J. Doller wrote:

> I am about to begin a project at my school using gsl
> and my advising professor seems to be very interested in gsl.
> one question he did have was,
> "Are there any constructs to store sparces matrices?"
> It might be beneficial to have such a way to store them
> especially when dealing with matrices that are 10e8 x 10e8 in size.

First, let me point out that a matrix -- even a sparse matrix --
of that size has no physical relevance.
Even a matrix of size 10,000 by 10,000 is on shaky numerical ground.
More typically, large sparse matrices are on the order of 1,000 by
1,000.
If they are much larger than that,
the user is probably doing something very, very wrong.

There are two distinct considerations in sparse matrix libraries.
The primary consideration is to save time by avoiding algorithms
that move zeros in and out of memory.
The secondary consideration is to save memory
by storing only the nonzero entries of the sparse matrix.
Modern computer architectures have fast floating-point pipelines
and large, fast, cheap memory, so neither of these considerations
is worth pursuing unless the matrix is very large and very sparse
(i.e., less than 10% of the entries are nonzero).
It is especially important to keep this in mind
when considering triangular and symmetric matrix representations.

There doesn't appear to be any suitable abstraction for sparse matrices.
Sparse matrix libraries are all about the specific data representation.
Take a look at Survey of Sparse Matrix Storage Formats

http://www.netlib.org/linalg/old_html_templates/subsection2.8.3.1.html
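To make the survey concrete, here is a minimal sketch in C of one such
format, compressed sparse row (CSR); the names are illustrative only and
not part of any library API:

```c
/* Compressed sparse row (CSR) storage, sketched for the 4x4 matrix
 *
 *     10  0  0 -2
 *      3  9  0  0
 *      0  7  8  7
 *      0  0  5  1
 *
 * Three arrays hold it: the nonzero values in row order, their column
 * indices, and the offset at which each row's entries start. */
#include <stddef.h>

static const double csr_val[]     = {10.0, -2.0, 3.0, 9.0, 7.0, 8.0, 7.0, 5.0, 1.0};
static const size_t csr_col_ind[] = {0, 3, 0, 1, 1, 2, 3, 2, 3};
static const size_t csr_row_ptr[] = {0, 2, 4, 7, 9};   /* length nrows + 1 */

/* Look up entry (i, j); entries that are not stored are zero. */
double csr_get(size_t i, size_t j)
{
    for (size_t k = csr_row_ptr[i]; k < csr_row_ptr[i + 1]; k++)
        if (csr_col_ind[k] == j)
            return csr_val[k];
    return 0.0;
}
```

Only 9 of the 16 entries are stored; the price is that random access to an
entry costs a scan of its row.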

Low level vector and matrix arithmetic libraries like the GSL and
the Vector, Signal and Image Processing Library (VSIPL)

	http://www.vsipl.org/

don't support sparse matrix data types directly, but the VSIPL API
specifies a gather operation, which can be used to collect
the nonzero elements of a sparse matrix into a dense vector,
and a scatter operation, which can be used to insert
the elements of a dense vector into a sparse matrix.
Otherwise, the only other truly useful operation
is sparse matrix - dense vector multiplication,
which can be used to solve a system of equations iteratively --
see IML++ (Iterative Methods Library) v. 1.2a

	http://math.nist.gov/iml++/

for example.
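That one kernel -- sparse matrix times dense vector -- can be sketched in a
few lines of C, assuming CSR storage as above (illustrative names, not a
GSL or VSIPL interface):

```c
/* y = A*x for A in compressed sparse row form: values, column indices,
 * and per-row offsets.  Only the stored (nonzero) entries are touched,
 * which is why iterative solvers get away with sparse storage. */
#include <stddef.h>

void csr_matvec(size_t nrows,
                const double *val, const size_t *col_ind, const size_t *row_ptr,
                const double *x, double *y)
{
    for (size_t i = 0; i < nrows; i++) {
        double sum = 0.0;
        for (size_t k = row_ptr[i]; k < row_ptr[i + 1]; k++)
            sum += val[k] * x[col_ind[k]];
        y[i] = sum;
    }
}
```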

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: sparce matrices
  2001-12-19 13:20 sparce matrices Edwin Robert Tisdale
@ 2001-12-19 13:20 ` Peter Schmitteckert
  0 siblings, 0 replies; 6+ messages in thread
From: Peter Schmitteckert @ 2001-12-19 13:20 UTC (permalink / raw)
  To: gsl-discuss

Salut,

On Wednesday 03 October 2001 20:31, Edwin Robert Tisdale wrote:
> Matthew J. Doller wrote:
> > I am about to begin a project at my school using gsl
> > and my advising professor seems to be very interested in gsl.
> > one question he did have was,
> > "Are there any constructs to store sparces matrices?"
> > It might be beneficial to have such a way to store them
> > especially when dealing with matrices that are 10e8 x 10e8 in size.
>
> First, let me point out that a matrix -- even a sparse matrix --
> of that size has no physical relevance.
> Even a matrix of size 10,000 by 10,000 is on shaky numerical ground.
> More typically, large sparse matrices are on the order of 1,000 by
> 1,000.
> If they are much larger than that,
> the user is probably doing something very, very wrong.

Please let me note that I diagonalized 10e7 x 10e7 matrices as
a sub-problem of a larger calculation (a density matrix renormalization
group calculation) on a high-end workstation about 5 years ago, with the
aim of getting more than 5 digits of accuracy in the lowest eigenstate of a
10e30 x 10e30 symmetric matrix, and it worked very well. If you're interested,
just search for my name on the cond-mat preprint server.

Best wishes,
Peter

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: sparce matrices
  2001-12-19 13:20 Matthew J. Doller
@ 2001-12-19 13:20 ` Brian Gough
  0 siblings, 0 replies; 6+ messages in thread
From: Brian Gough @ 2001-12-19 13:20 UTC (permalink / raw)
  To: Matthew J. Doller; +Cc: gsl-discuss

Matthew J. Doller writes:
 > I am about to begin a project at my school using gsl, and my advising
 > professor seems to be very interested in gsl.
 > one question he did have was are there any constructs to store sparces
 > matrices, i.e. matrices that a primarily zero.

Not in GSL.  Sparse matrix code tends to be problem-specific,
depending on the structure of the matrix and what you want to do with
it. I think there are about ten different storage formats that people
use, depending on the problem.  So you would probably be best to code
up the appropriate one for your application to get maximum compression
and efficiency.
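If you do code up your own, the simplest starting point is coordinate
(triplet) storage: one (row, column, value) record per stored entry. A
minimal sketch in C (the names are illustrative, not a GSL interface):

```c
/* Coordinate (a.k.a. triplet) storage: the most flexible sparse format,
 * easy to build incrementally, though slower to operate on than CSR. */
#include <stddef.h>

struct triplet { size_t row, col; double val; };

/* Example operation: sum of all stored entries in column j. */
double coo_col_sum(const struct triplet *t, size_t nnz, size_t j)
{
    double s = 0.0;
    for (size_t k = 0; k < nnz; k++)
        if (t[k].col == j)
            s += t[k].val;
    return s;
}
```

A typical workflow is to assemble in triplet form and then convert to a
compressed format once the structure is fixed.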

regards
Brian Gough

^ permalink raw reply	[flat|nested] 6+ messages in thread

* sparce matrices
@ 2001-12-19 13:20 Matthew J. Doller
  2001-12-19 13:20 ` Brian Gough
  0 siblings, 1 reply; 6+ messages in thread
From: Matthew J. Doller @ 2001-12-19 13:20 UTC (permalink / raw)
  To: gsl-discuss

I am about to begin a project at my school using gsl, and my advising
professor seems to be very interested in gsl.
one question he did have was are there any constructs to store sparces
matrices, i.e. matrices that a primarily zero.
it might be beneficial to have such a way to store them, especially when
dealing with matrices that are 10e8 x 10e8 in size
thanks
matt


-- 
Matthew J. Doller
mdoller@wpi.edu

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: sparce matrices
@ 2001-12-19 13:20 Mikael Adlers
  0 siblings, 0 replies; 6+ messages in thread
From: Mikael Adlers @ 2001-12-19 13:20 UTC (permalink / raw)
  To: gsl-discuss


Hi all,
I have to add some remarks to this discussion. First of all, I would say that
a matrix of size 1e8 x 1e8 seems quite big to me. Even if only the diagonal
of this matrix is stored, it would require about 3 Gbytes of memory. However,
on a large cluster of computers, and if the matrix has a lot of structure
(band matrix, block triangular, or other nice structure), who knows.

I do not agree that a matrix of order 10,000 "is on shaky numerical
ground". A sparse matrix of this size is easily solvable by a sparse solver
(even a dense solver can handle a matrix of this size). There is no shaky
numerical behaviour for matrices of this size, and if the matrix is sparse
it is even possible to sharpen the usual error estimates, which are quite
pessimistic in most cases (there is a growth factor in the error estimate
of 2^(n-1); however, numerical analysts agree that this growth is very
unlikely, though attainable, and the growth factor is in most practical
cases around 10). I have worked with a sparse factorization algorithm, the
QR factorization (computing the least squares solution to a system), and my
test problems were about 30,000 x 10,000 with 130,000 nonzero elements; the
factorization took a couple of seconds. I considered my test problems small
ones. In practice the density of a sparse matrix decreases as the dimension
increases (as with meshes: the number of neighbours does not grow as the
mesh grows).

The really large-scale FEM problems can generate sparse matrices of order
1e6 - 1e7, and the solution of their systems is possible on a supercomputer
or a cluster of ordinary PCs using iterative methods with sufficient
accuracy. I know of an example with a 3D grid of the space shuttle
consisting of 16,000,000 grid points, each point generating at least one
equation.

It is an oversimplification to say that a goal of sparse computation is to
minimize the movement of zeros to and from memory. The goal is rather to
structure the computation to avoid operations on zero elements and to
minimize the creation of new nonzero elements (fill-in, which happens in
all direct methods, i.e. methods based on factorization).
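The creation of new nonzero elements can be seen on a tiny example: an
"arrow" matrix with a dense first row and column and a diagonal elsewhere.
The deliberately naive in-place LU below (no pivoting, a sketch rather than
production code) fills in every zero:

```c
/* Demonstrate fill-in: LU factorization without pivoting, done in place,
 * on a small fixed-size matrix.  On an "arrow" matrix pointing the wrong
 * way, every zero below the arrow becomes nonzero in the factors. */
#include <stddef.h>

#define N 4

size_t count_nonzeros(const double a[N][N])
{
    size_t nnz = 0;
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            if (a[i][j] != 0.0)
                nnz++;
    return nnz;
}

/* Overwrite a with its L (below diagonal) and U (on and above) factors,
 * then return the nonzero count of the combined factors. */
size_t lu_fill_in(double a[N][N])
{
    for (size_t k = 0; k < N; k++)
        for (size_t i = k + 1; i < N; i++) {
            a[i][k] /= a[k][k];
            for (size_t j = k + 1; j < N; j++)
                a[i][j] -= a[i][k] * a[k][j];
        }
    return count_nonzeros(a);
}
```

On the arrow matrix {{4,1,1,1},{1,4,0,0},{1,0,4,0},{1,0,0,4}} the input has
10 nonzeros and the factors have 16: the sparsity is destroyed. Reordering
(here, simply reversing the arrow) would avoid the fill-in entirely, which
is exactly what the graph analysis in sparse direct solvers is for.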

The reason why matrix arithmetic libraries like gsl or LAPACK (I have no
knowledge of VSIPL) do not support any sparse matrix data type is the
following. To be able to exploit the sparsity of the problem, it is
important that all intermediate results are also sparse. For example, when
solving a linear system with LU decomposition, both L and U have to be
sparse, otherwise all is for nothing. To achieve this, a graph representing
the structure is created. Then graph algorithms are applied to analyse the
graph and try to minimize any destruction of sparsity.

However, in iterative libraries like IML++, only matrix - vector products
are needed, and therefore it is possible to use them with sparse data
structures.

The different storage formats that exist are there to maximize the
performance of the specific operation that you would like to apply to the
matrix. When solving a tridiagonal system you would like to have the matrix
stored in compressed row-wise format, while when computing the QR
factorization it is more useful to have it column-wise.

If you would like to solve a problem that you cannot load into memory, you
have to use iterative methods like IML++; otherwise I would recommend the
SuperLU package ( http://www.nersc.gov/~xiaoye/SuperLU/ ).


Regards,
/Mikael Adlers


------------------------------------------------------------------ 
 Mikael Adlers, Ph.D.          email: mikael@mathcore.com 
 MathCore AB                   phone: +4613 32 85 07 
 Wallenbergsgata 4             fax:         21 27 01
 SE-583 35 Linköping, Sweden   http://www.mathcore.com



> -----Original Message-----
> From: Edwin Robert Tisdale [ mailto:E.Robert.Tisdale@jpl.nasa.gov ] 
> Sent: den 3 oktober 2001 20:31
> To: gsl-discuss@sources.redhat.com
> Subject: Re: sparce matrices
> 
> 
> Matthew J. Doller wrote:
> 
> > I am about to begin a project at my school using gsl
> > and my advising professor seems to be very interested in gsl.
> > one question he did have was,
> > "Are there any constructs to store sparces matrices?"
> > It might be beneficial to have such a way to store them
> > especially when dealing with matrices that are 10e8 x 10e8 in size.
> 
> First, let me point out that a matrix -- even a sparse matrix --
> of that size has no physical relevance.
> Even a matrix of size 10,000 by 10,000 is on shaky numerical ground.
> More typically, large sparse matrices are on the order of 1,000 by
> 1,000.
> If they are much larger than that,
> the user is probably doing something very, very wrong.
> 
> There are two distinct considerations in sparse matrix libraries.
> The primary consideration is to save time by avoiding algorithms
> that move zeros in and out of memory.
> The secondary consideration is to save memory
> by storing only the non zero entries of the sparse matrix.
> Modern computer architectures have fast floating-point pipelines
> and large, fast and cheap memory so neither of these considerations
> is worth pursuing unless the matrix is very large and very sparse
> (i.e., less than 10% of the entries are non zero).
> It is especially important to keep this in mind
> when considering triangular and symmetric matrix representations.
> 
> There doesn't appear to be any suitable abstraction for 
> sparse matrices.
> Sparse matrix libraries are all about the specific data 
> representation.
> Take a look at Survey of Sparse Matrix Storage Formats
> 
> http://www.netlib.org/linalg/old_html_templates/subsection2.8.3.1.html
>
> Low level vector and matrix arithmetic libraries like the GSL and
> the Vector, Signal and Image Processing Library (VSIPL)
>
> 	http://www.vsipl.org/
>
> don't support sparse matrix data types directly but the VSIPL API
> specifies a gather operation which can be used to collect
> the non zero elements of a sparse matrix into a dense vector
> and a scatter operation which can be used to insert
> the elements of a dense vector into a sparse matrix.
> Otherwise, the only other truly useful operation
> is sparse matrix - dense vector multiplication
> which can be used to solve a system of equations iteratively --
> see IML++ (Iterative Methods Library) v. 1.2a
>
> 	http://math.nist.gov/iml++/
>
> for example.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: sparce matrices
@ 2001-12-19 13:20 Edwin Robert Tisdale
  0 siblings, 0 replies; 6+ messages in thread
From: Edwin Robert Tisdale @ 2001-12-19 13:20 UTC (permalink / raw)
  To: gsl-discuss

Mikael Adlers wrote:

> I do not agree that a matrix of the order 10.000
> "is on shaky numerical ground".
> A sparse matrix of this size is easily solvable for a sparse solver
> (even a dense solver can handle a matrix of this size).
> There is no shaky numerical behaviour for matrices of this size
> and if it is sparse, it is even possible to sharpen
> the usual error estimates that are quite pessimistic in most cases.
> (There is a growth factor in the error estimate that are 2^(n-1).)

Should we assume that n is the number of rows (or columns) in the matrix?
What do you mean by growth factor?
Are you considering intrinsic and method errors as well as round-off?

> However, numerical analysts agree that this growth is very unlikely
> and the growth factor is in most practical cases around 10.

Apparently,
you are contrasting the error bound with the expected error here.

> I have worked with a sparse factorization algorithm, 
> the QR factorization (computed the least squares solution to a system)
> and my test problems were about 30.000 x 10.000 with 130.000 nonzero
> elements, and the factorization took a couple of seconds.
> I considered my test problems as small ones.

Were you using single or double precision arithmetic?
What was the error bound?  Expected error?

> In practice the density of a sparse matrix decreases
> as the dimension increases (like meshes,
> the number of neighbours do not increase as the mesh increase).

It appears that you are generalizing from a very special case.

> The really large-scale FEM problems can generate sparse matrices
> of order 1e6 - 1e7, and the solution of their systems is possible
> on a supercomputer or cluster of ordinary PCs using iterative methods
> with sufficient accuracy. I know of an example with a 3D grid
> of the space shuttle consisting of 16,000,000 grid points,
> each point generating at least one equation.

This is a common practice which I find very disturbing.
Good PDE solvers don't copy the elements from
a finite element/difference grid into a sparse matrix structure
just to apply a sparse system solver.  The elements are updated in situ.

> It is an oversimplification to say that a goal of sparse computation
> is to minimize the movement of zeros to and from memory.
> The goal is rather to structure the computation to avoid operations
> on zero elements and to minimize the creation of new nonzero elements
> (which happens in all direct methods, i.e. methods based on factorization).

Once upon a time, floating-point operations were expensive,
but modern floating-point pipelines are fast, cheap and plentiful.
Today, it doesn't take any more time to multiply by zero
than it does to check for a zero, and a branch on zero
would most likely just stall the floating-point pipeline.
The problem is the time required to fetch and store all of those zeros.
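The point can be seen in miniature below: both loops compute the same dot
product, the second merely adds a data-dependent branch, and the zeros have
to be fetched from memory either way (an illustration only, not a
benchmark):

```c
/* Two ways to form a dot product over data that contains zeros. */
#include <stddef.h>

double dot_plain(const double *x, const double *y, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += x[i] * y[i];          /* multiplies by zero without checking */
    return s;
}

double dot_skip_zeros(const double *x, const double *y, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        if (x[i] != 0.0)           /* the test itself can stall the pipeline */
            s += x[i] * y[i];
    return s;
}
```

The results are identical; any saving has to come from not loading the
zeros at all, which is what sparse storage formats provide.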

> The reason why matrix arithmetic libraries like gsl or LAPACK
> (I have no knowledge about VSIPL)
> do not support any sparse matrix data type is the following.
> To be able to use the sparsity of the problem,
> it is important that all intermediate results also are sparse.
> For example, when solving a linear system with LU decomposition,
> both L and U must be sparse.  Otherwise, all is for nothing.
> To do this, a graph representing the structure is created.
> Then some graph algorithms are added to analyse the graph
> and try to minimize any destruction of sparsity.

In general, neither L nor U will be sparse.

> However, in iterative libraries like IML++,
> only matrix - vector products are needed and therefore,
> it is possible to use it with sparse data structures.

Iterative system solvers are the only practical method
for sparse matrices in general.

> The different storage formats that exist maximize the performance
> of the specific operation that you would like to do to the matrix.
> When solving a tridiagonal matrix,
> you would like to have it stored compressed row wise format
> and, when computing the QR factorization,
> it is more useful to have it column wise. 

Neither your computer nor the C programming language
has any notion of rows or columns.
The notion of rows and columns exists only in the programmer's mind.
A QR factorization of an array of numbers
that you perceive to be stored column-wise is simply
an LQ factorization of the same array of numbers
that you perceive to be stored row-wise.
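The observation can be made concrete: the same six numbers, read through
two different index maps, are a 2x3 row-major matrix and its transpose
viewed as a 3x2 column-major matrix (the helper functions are illustrative):

```c
/* One buffer, two views.  Nothing in memory changes; only the indexing
 * convention the programmer applies to it does. */
#include <stddef.h>

static const double buf[6] = {1, 2, 3, 4, 5, 6};

/* Entry (i, j) of a 2x3 matrix stored row-major in buf. */
double row_major_2x3(size_t i, size_t j) { return buf[i * 3 + j]; }

/* Entry (i, j) of a 3x2 matrix stored column-major in buf. */
double col_major_3x2(size_t i, size_t j) { return buf[j * 3 + i]; }
```

For every valid (i, j), row_major_2x3(i, j) equals col_major_3x2(j, i):
the two views are transposes of each other over identical bytes.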

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2001-12-19 13:20 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2001-12-19 13:20 sparce matrices Edwin Robert Tisdale
2001-12-19 13:20 ` Peter Schmitteckert
  -- strict thread matches above, loose matches on Subject: below --
2001-12-19 13:20 Edwin Robert Tisdale
2001-12-19 13:20 Matthew J. Doller
2001-12-19 13:20 ` Brian Gough
2001-12-19 13:20 Mikael Adlers

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).