From: "Dario Bahena Tapia" <dario.mx@gmail.com>
To: "Brian Budge" <brian.budge@gmail.com>
Cc: gcc-help@gcc.gnu.org
Subject: Re: Why worse performace in euclidean distance with SSE2?
Date: Tue, 08 Apr 2008 02:15:00 -0000
Message-ID: <3d104d6f0804071602u2cf25d61w3cdf13f3e7ac1f51@mail.gmail.com>
In-Reply-To: <5b7094580804071551m67759fb0r84b018de3c4a4267@mail.gmail.com>
Hello,
I think I agree; indeed, the original program used a structure of arrays
(each coordinate in a separate array). I will try SSE2 on that layout,
although I think sqrt will still be the bottleneck ... maybe I could
also use a different norm (such as the maximum or taxicab norm).
Thanks.
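
For illustration, a minimal sketch of what SSE2 over the structure-of-arrays layout might look like. The function name dist_from_soa and the arrays xs, ys and out are hypothetical, not taken from the original program, and the rint() rounding of the quoted dist() is left out to keep the sketch short. Two distances are computed per iteration with no shuffles; the packed sqrt is still the most expensive step.

```c
#include <stddef.h>
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical SoA layout: xs[k] and ys[k] hold the coordinates of point k.
 * Writes out[k] = Euclidean distance from point i to point k, k in [0, n).
 * Rounding (rint in the original dist()) is left to the caller. */
void dist_from_soa(const double *xs, const double *ys, size_t n,
                   size_t i, double *out)
{
    const __m128d xi = _mm_set1_pd(xs[i]);  /* broadcast x_i to both lanes */
    const __m128d yi = _mm_set1_pd(ys[i]);  /* broadcast y_i to both lanes */
    size_t k = 0;

    /* Main loop: two points per iteration, unaligned loads. */
    for (; k + 2 <= n; k += 2) {
        __m128d xd = _mm_sub_pd(xi, _mm_loadu_pd(xs + k));
        __m128d yd = _mm_sub_pd(yi, _mm_loadu_pd(ys + k));
        __m128d d2 = _mm_add_pd(_mm_mul_pd(xd, xd), _mm_mul_pd(yd, yd));
        _mm_storeu_pd(out + k, _mm_sqrt_pd(d2));  /* two square roots at once */
    }

    /* Tail: at most one remaining point, handled with the scalar sqrt. */
    for (; k < n; ++k) {
        double xd = xs[i] - xs[k];
        double yd = ys[i] - ys[k];
        __m128d d2 = _mm_set_sd(xd * xd + yd * yd);
        out[k] = _mm_cvtsd_f64(_mm_sqrt_sd(d2, d2));
    }
}
```

Whether this beats the serial version depends on the machine and on how the results are consumed, so it would need to be measured.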
On Mon, Apr 7, 2008 at 5:51 PM, Brian Budge <brian.budge@gmail.com> wrote:
> In my experience, SSE is generally more useful when you can lay out
> your data as SOA (structure of arrays) rather than AOS (array of
> structures). If you expect a speedup from processing individual pairs
> of doubles, I doubt you'll see much improvement except in extreme
> situations, or when the compiler happens to detect a pattern in your
> code. Also, shuffles and similar operations are killers.
>
> It would be much better if you had 10,000 of these things to compute
> distances for at once, and could lay out the data in a way that is
> friendlier to SSE (SOA).
>
> Brian
>
>
>
> On Mon, Apr 7, 2008 at 9:08 AM, Dario Bahena Tapia <dario.mx@gmail.com> wrote:
> > Hello,
> >
> > I tried your options, but they seem to make no difference. In
> > another email it was suggested that I use _mm_sqrt_sd, because I only
> > need one sqrt calculation. That improved the time and, indeed, it
> > almost matches the serial version (now it runs up to 1 second slower
> > for the example with 10,000 data items, hehe).
> >
> > But of course, I would want and expect the vector version to run
> > faster ... I am still unsure how to achieve that.
> >
> > Thanks
> >
> >
> >
> > On Mon, Apr 7, 2008 at 10:23 AM, jlh <jlh@gmx.ch> wrote:
> > > Dario Bahena Tapia wrote:
> > >
> > > >
> > > > inline static double dist(int i,int j)
> > > > {
> > > > double xd = C[i][X] - C[j][X];
> > > > double yd = C[i][Y] - C[j][Y];
> > > > return rint(sqrt(xd*xd + yd*yd));
> > > > }
> > > > [...]
> > > >
> > > > And in order to activate the SSE2 features, I am using the following
> > > > flags for gcc (my computer is a laptop):
> > > >
> > > > CFLAGS = -O -Wall -march=pentium-m -msse2
> > > >
> > >
> > > These options do not make dist() use any SSE for me. Have you
> > > tried compiling with this?
> > >
> > > CFLAGS = -O2 -Wall -march=pentium-m -mfpmath=sse
> > >
> > > I think -msse2 is redundant if you say -march=pentium-m. I don't
> > > have an SSE2 machine to try this on, though.
> > >
> > > jlh
> > >
> >
>
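
The _mm_sqrt_sd version mentioned in the quoted thread is not shown there. As a rough sketch only, a single-pair distance on an array-of-pairs layout like C[i][X] / C[i][Y] might look as follows. The function name dist_pair and the parameter layout are assumptions, and the rint() call from the quoted dist() is omitted. Note the shuffle needed to add the two lanes, which is the kind of overhead described above.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Assumption: points are stored as an array of (x, y) pairs, similar to
 * C[i][X] and C[i][Y] in the quoted code. Returns the unrounded distance. */
double dist_pair(double C[][2], int i, int j)
{
    __m128d pi = _mm_loadu_pd(C[i]);       /* (x_i, y_i) */
    __m128d pj = _mm_loadu_pd(C[j]);       /* (x_j, y_j) */
    __m128d d  = _mm_sub_pd(pi, pj);       /* (xd, yd) */
    __m128d sq = _mm_mul_pd(d, d);         /* (xd*xd, yd*yd) */

    /* Horizontal add: copy the high lane down, then add the low lanes. */
    __m128d hi  = _mm_unpackhi_pd(sq, sq); /* (yd*yd, yd*yd) */
    __m128d sum = _mm_add_sd(sq, hi);      /* low lane = xd*xd + yd*yd */

    /* Only one square root is needed, so use the scalar form. */
    return _mm_cvtsd_f64(_mm_sqrt_sd(sum, sum));
}
```

Only two of the arithmetic steps here do useful work in both lanes, which may explain why a version like this struggles to beat plain scalar code.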