From: Alan Aspuru-Guzik
To: Christos Siopis
cc: gsl-discuss@sources.redhat.com
Date: Wed, 23 Oct 2002 13:43:00 -0000
Subject: Re: About coordinated efforts on scientific software.

Christos,

Greetings again. In reply to the concerns in your long message:

I believe that the whole community should gather around (and support) the
Common Component Architecture (CCA) and Babel, which interoperate, to
address the issues you talk about. If the community agreed on a
representation for vectors, matrices, operators, statistics functions,
etc., current libraries could be wrapped with things like Babel.

Take a look at this nice tutorial:
http://www.llnl.gov/CASC/components/docs/2001-pict-intro.pdf

And the documentation here:
http://www.llnl.gov/CASC/components/overview.html

The languages they support are: Fortran 77, Fortran 90, C, C++, Java,
MATLAB, and Python. That means that if we had a group of people creating
Babel interfaces to GSL, it could automatically be used from all these
different languages at the cost of a C++ call (at least that is what the
Babel people say their wrappers cost in CPU time).

The Common Component Architecture, which can use Babel components, even
has a GUI builder where you can assemble prebuilt components (like an
optimizer, a Monte Carlo integrator, etc.) to create applications in
literally seconds. Babel is LGPL, and people in the CCA effort are
componentizing a lot of stuff. Jaideep Ray showed a very complex example of
a solver for the diffusion equation.

For example, they have componentized things like:

a mesh refiner:
http://www.cca-forum.org/cca-sc01/sandbox/jaray/samr/GrACEComponent/docs/README.html

a mesh statistics tool:
http://www.cca-forum.org/cca-sc01/sandbox/jaray/samr/StatsComponent/docs/README.html

an optimizer:
http://www.cca-forum.org/cca-sc01/sandbox/norris/nonlinear/taosolver/README.html

I think that these guys are very advanced in this project, and if the whole
community joined in, this could become our own umbrella project.

I am a graduate student trying to finish my work, so I decided not to
componentize anything right now, but later I might work on it.

Anyway, a simple effort of componentizing what we already have might get us
farther than we think.

Alan

On Tue, 22 Oct 2002, Christos Siopis wrote:

> On Mon, 21 Oct 2002, Manoj Warrier wrote:
>
> > I guess (hope rather) that GSL will eventually cover the numerical library
> > part of point (1). For plotting and graphics we again have a similar
> > situation as in the "mathematic packages" ... Check out
> > (http://scilinux.sf.net/graphvis.html) for a list of free packages.
>
> I think the problem is not so much with lack of libraries as it is with
> lack of an "integrated" environment where one can start with raw data,
> pass them through various mathematical transformations, and finally plot
> some result, all from inside the same "package"/environment that
> encourages trial-and-error, what-if experiments, and rapid prototyping.
>
> The first thought that one might have for achieving this would be to
> somehow wrap a number of relevant libraries and use them from inside a
> scripting language like Python. I can see at least three kinds of problems
> with this:
>
> - First, most of the existing libraries are too low-level for direct use
> from an interactive scripting environment. Things like memory allocation
> (needed e.g. by GSL) or opening a window for plotting (needed e.g. by
> PGPLOT) are *show-stoppers* in an interactive environment. Some heroic
> people are going through the pain of actually creating usable interfaces,
> such as the PyGSL folks. This is fine, except for two things: how do the
> wrapper functions interoperate with functions from other wrapped libraries
> (see next item below), and how do we ensure we do not enter into a
> versioning hell, where the wrapper uses some version A of the library, but
> the library has now moved on to version B; add some RPM versioning issues
> if you use RedHat's stuff and multiply all this by the number of libraries
> which you want to wrap and enjoy the mess...
>
> - Second, there is the issue of consistency of the user interface. For
> instance, a NumPy (numeric python) user is used to the ufuncs, "universal"
> functions with a return type that depends on the input type. So if a NumPy
> user wanted to compute the mean of an array, he or she would expect that a
> function call like mean(arrayx) would return a long int or a float,
> depending on whether arrayx is an array of longs or of floats. But doing
> this through GSL/PyGSL, the user would have to use pygsl.statistics.mean
> for a float or pygsl.statistics.long.mean for a long int, i.e. the user is
> asked to think in terms of C, a strongly typed language. This is both
> annoying and prone to hard-to-find errors. A related issue is the overlap
> between wrappers of different libraries (e.g., NumPy already has a couple
> of mean/average functions from other libraries!).
> And there is also the issue of performance, as NumPy objects are
> converted back and forth to different formats (some wrappers do a better
> job at this than others).
>
> - Third, there is the question of "putting this all together". Wrappers are
> good for wrapping a small number of small libraries. As you add more and
> more, there are all sorts of issues related to the distribution of the
> "final" package, the quality and homogeneity of the documentation, and so
> on. If there were no other solution at hand, maybe this would all be
> acceptable. But with commercial packages offering a "one-stop" solution
> (despite a number of other disadvantages) I think the open-source science
> community has to do better than that.
>
> SciPy ( www.scipy.org ) is a package that tries to solve some of these
> problems, but I think it is a little too early to tell how good the outcome
> will be, and I cannot help wondering how many more times the open source
> numerical community will have to code and debug e.g. an FFT or a
> statistical package, and whether this is the best use of our resources...
>
> Christos
>

-- 
Alan Aspuru-Guzik              Dios mueve al jugador, y éste, la pieza.
(510)642-5911 UC Berkeley      ¿Qué Dios detrás de Dios la trama empieza
(925)422-8739 LLNL             de polvo y tiempo y sueño y agonías? -Borges
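
The second point in the quoted message (one ufunc-style mean(arrayx) versus
separate pygsl.statistics.mean and pygsl.statistics.long.mean calls) can be
illustrated with a small sketch. It uses only the Python standard library.
The names mean_double, mean_long and mean are hypothetical stand-ins made
up for this illustration; they are not the PyGSL or NumPy functions, and
returning an integer for an array of longs is an assumption chosen only to
mirror the expectation described in the message.

```python
from array import array


# Hypothetical stand-ins for C-style typed routines ("one function per
# element type"). They are placeholders, not the real PyGSL functions.
def mean_double(values):
    """Mean of an array of C doubles, returned as a Python float."""
    return sum(values) / len(values)


def mean_long(values):
    """Mean of an array of C longs, floored to an int (an assumption)."""
    return sum(values) // len(values)


# Map array typecodes to the typed routine that handles them.
_MEAN_BY_TYPECODE = {"d": mean_double, "l": mean_long}


def mean(values):
    """Single entry point: choose the typed routine from the element type."""
    try:
        typed_mean = _MEAN_BY_TYPECODE[values.typecode]
    except KeyError:
        raise TypeError(f"unsupported typecode: {values.typecode!r}") from None
    if len(values) == 0:
        raise ValueError("mean of an empty array is undefined")
    return typed_mean(values)


floats = array("d", [1.0, 2.0, 4.0])
longs = array("l", [1, 2, 4])

print(type(mean(floats)).__name__)  # result type follows the input type
print(type(mean(longs)).__name__)
```

With a dispatcher of this kind the caller writes mean(x) in both cases and
the element type picks the routine, which is the behaviour the message says
a NumPy user expects and the typed wrapper functions do not offer.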