From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Ronis
Date: Mon, 10 Dec 2001 13:34:00 -0000
To: gsl-discuss@sources.redhat.com
Subject: Speed Issues
Message-ID: <15391.34013.675987.972662@ronispc.chem.mcgill.ca>
Reply-To: ronis@onsager.chem.mcgill.ca
Mailing-List: contact gsl-discuss-help@sources.redhat.com; run by ezmlm
X-Mailer: VM 7.00 under Emacs 21.1.1
X-SW-Source: 2001-q4/txt/msg00143.txt.bz2

I've compiled gsl-1.0 on an i686-Linux-gnu box and on a dual Athlon box, each with its own local build of the ATLAS BLAS routines. In the one application we've tried, I notice about a 30% slowdown (on either box) compared to the same code compiled with IMSL routines.

All the libraries and code were compiled with gcc-2.95.3, and my application had GSL_RANGE_CHECK_OFF and HAVE_INLINE defined (the speed difference is not that large even if they weren't).

Specifically, the application has to solve for the roots of about 600 coupled nonlinear equations, and we've been using the following code:

    const gsl_multiroot_fsolver_type *T;
    gsl_multiroot_fsolver *sss;

    int status;
    size_t iii, iter = 0;
    double x_init[3*Ntarget];

    const size_t nnn = 3*Ntarget;
    struct rparams ppp = {1.0, 10.0};
    gsl_multiroot_function f = {&rosenbrock_f, nnn, &ppp};

    /* initial guess: all zeros */
    for(i = 0; i < 3*Ntarget; i++)
        x_init[i] = 0.0;

    gsl_vector *x = gsl_vector_alloc (nnn);

    for(i = 0; i < 3*Ntarget; i++)
        gsl_vector_set (x, i, x_init[i]);

    T = gsl_multiroot_fsolver_hybrids;
    sss = gsl_multiroot_fsolver_alloc (T, nnn);

    /* time only the solver setup and iteration */
    start = clock();
    gsl_multiroot_fsolver_set (sss, &f, x);

    do
      {
        iter++;
        status = gsl_multiroot_fsolver_iterate (sss);

        if (status)   /* check if solver is stuck */
          break;

        status = gsl_multiroot_test_delta (sss->dx, sss->x, 0.0, 1.0e-6);
      }
    while (status == GSL_CONTINUE && iter < 1000);

    elapsed_time += (double)(clock()-start)/CLOCKS_PER_SEC;

I compile with the following flags:

    -O3 -march=i686 -ffast-math -funroll-loops -fomit-frame-pointer
    -fforce-mem -fforce-addr -malign-jumps=3 -malign-loops=3
    -malign-functions=3 -mpreferred-stack-boundary=3

and link with the ATLAS BLAS routines.

I've also tried the IMSL routine ZSPOW (written in Fortran, from an early version of the IMSL library). As I mentioned at the outset, the GSL version is about 30% slower, although the two give identical roots.

Any suggestions? I've played around with eliminating some of the additional indirection associated with having general code for arbitrary strides (e.g., by manipulating the data members of the gsl_vector directly, assuming stride=1), but this only speeds things up slightly.

David

P.S. It doesn't seem to be in the documentation, but is there any convention as to what the initial stride of a gsl_vector is?
When can I assume that it's 1 and will remain so?
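
A minimal sketch of the stride question, assuming a hypothetical struct `strided_vec` that only mimics the size/stride/data layout idea of a gsl_vector (it is not the GSL type, and the function names are invented for illustration). It shows the general stride-aware access next to a direct-access path that is taken only after checking that the stride really is 1, rather than assuming it:

```c
#include <stddef.h>

/* Hypothetical stand-in for a strided vector; NOT the GSL type. */
typedef struct {
    size_t size;    /* number of logical elements */
    size_t stride;  /* distance, in doubles, between consecutive elements */
    double *data;   /* address of the first logical element */
} strided_vec;

/* General element access: correct for any stride. */
static double sv_get(const strided_vec *v, size_t i)
{
    return v->data[i * v->stride];
}

/* Sum through the general, stride-aware accessor. */
double sv_sum_general(const strided_vec *v)
{
    double s = 0.0;
    size_t i;
    for (i = 0; i < v->size; i++)
        s += sv_get(v, i);
    return s;
}

/* Sum with a contiguous fast path, used only when stride == 1;
   any other stride falls back to the general routine. */
double sv_sum_checked(const strided_vec *v)
{
    double s = 0.0;
    size_t i;
    if (v->stride != 1)
        return sv_sum_general(v);
    for (i = 0; i < v->size; i++)
        s += v->data[i];
    return s;
}
```

Guarding the direct access with an explicit stride test keeps the shortcut safe if the vector ever turns out to be a strided view of a larger block, at the cost of one comparison per call.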