public inbox for gsl-discuss@sourceware.org
From: Rhys Ulerich <rhys.ulerich@gmail.com>
To: Maxime Boissonneault <maxime.boissonneault@calculquebec.ca>
Cc: gsl-discuss@sourceware.org
Subject: Re: Adding OpenMP support for some of the GSL functions
Date: Thu, 13 Dec 2012 15:53:00 -0000
Message-ID: <CAKDqugQRt+q-V+Sv=AxmREwZAq1jFrGzU+xFmZi1YRQDSnXmfA@mail.gmail.com>
In-Reply-To: <50C9D68A.1050903@calculquebec.ca>

> I am doing that too, but any gain we can get is an important one, and it
> turns out that by parallelizing rkf45_apply, my simulation runs 30% faster
> on 8 cores.

That's a parallel efficiency of about 18% ( = 1 time unit / (8 cores *
0.70 time units)).  This feels like you're getting a small
memory/cache bandwidth increase for the rkf45_apply level-1-BLAS-like
operations by using multiple cores, but the cores are otherwise not
being used effectively.  I say this because a state vector 1e6 doubles
long (about 8 MB) will not generally fit in cache.  Adding more cores
increases the amount of cache available.
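
As a minimal sketch of the arithmetic above, the efficiency figure can be reproduced with two small helpers. The function names are hypothetical and chosen only for illustration. The numbers assume, as the message does, that "30% faster" means the 8-core run takes 0.70 of the serial wall time.

```c
#include <assert.h>

/* Speedup: serial wall time divided by parallel wall time.
   Hypothetical helper; the name is illustrative only. */
double speedup(double t_serial, double t_parallel)
{
    assert(t_serial > 0.0 && t_parallel > 0.0);
    return t_serial / t_parallel;
}

/* Parallel efficiency: speedup divided by the number of cores used.
   A value of 1.0 would mean every core is fully useful. */
double parallel_efficiency(double t_serial, double t_parallel, int cores)
{
    assert(cores > 0);
    return speedup(t_serial, t_parallel) / cores;
}
```

Calling parallel_efficiency(1.0, 0.70, 8) gives roughly 0.18, i.e. each of the 8 cores contributes under a fifth of what ideal scaling would deliver.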

> I will have a deeper look at vectorization of GSL, but in my understanding,
> vectorizing can only be done with simple operations, while algorithms like
> RKF45 involve about 10 operations per loop iterations.

The compilers are generally very good.  Intel's icc 11.1 has to be
told that the last four loops you annotated are vectorizable.  GCC
nails it out of the box.

With GCC 4.4.3, something like
    CFLAGS="-g -O2 -march=native -mtune=native -ftree-vectorizer-verbose=2 -ftree-vectorize" ../gsl/configure && make
shows every one of those 6 loops vectorizing.  You can check this by
configuring with those options, running make and waiting for the build
to finish, and then cd-ing into ode-initval2 and running
    rm rkf45*o && make
and observing all those beautiful
    LOOP VECTORIZED
messages.  Better yet, with those options, 'make check' passes for me
on the 'ode-initval2' subdirectory.
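
For a sense of the kind of loop under discussion, here is a hedged sketch of a Runge-Kutta style stage update with several floating-point operations per iteration. It is not the GSL source: the function name, argument list, and coefficients b1..b3 are assumptions made for illustration. It is written in C99 (restrict, loop-scoped index), so an older GCC may need -std=c99. Loops of this shape, with unit stride and no aliasing between input and output, are the sort a compiler's auto-vectorizer can typically handle, though whether a given compiler actually vectorizes it should be checked from its own report.

```c
#include <stddef.h>

/* Hypothetical stage update in the style of an explicit Runge-Kutta step:
     ytmp[i] = y[i] + h * (b1*k1[i] + b2*k2[i] + b3*k3[i])
   The restrict qualifiers promise the compiler that the arrays do not
   overlap, which removes one common obstacle to vectorization. */
void stage_update(size_t n, double h,
                  const double *restrict y,
                  const double *restrict k1,
                  const double *restrict k2,
                  const double *restrict k3,
                  double b1, double b2, double b3,
                  double *restrict ytmp)
{
    for (size_t i = 0; i < n; ++i) {
        /* Several multiplies and adds per element, all independent
           across iterations. */
        ytmp[i] = y[i] + h * (b1 * k1[i] + b2 * k2[i] + b3 * k3[i]);
    }
}
```

Compiling a file containing this function with the CFLAGS shown earlier in this message is one way to see whether the vectorizer reports the loop.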

Try ripping out your OpenMP pragmas in GSL, building stock GSL with
vectorization as I suggested, and then comparing how fast your code
runs with GSL vectorized on 1 core against GSL's unvectorized
rkf45_apply parallelized over 8 cores.  I suspect the two will be
comparable.

- Rhys

