From: "Dario Bahena Tapia"
To: jlh
Cc: gcc-help@gcc.gnu.org
Subject: Re: Why worse performace in euclidean distance with SSE2?
Date: Mon, 07 Apr 2008 17:02:00 -0000
Message-ID: <3d104d6f0804070908q7ee3513ehd18db00437c6d835@mail.gmail.com>
In-Reply-To: <47FA3C65.6020701@gmx.ch>

Hello,

I tried your options, but they seem to make no difference. In another
email it was suggested that I use _mm_sqrt_sd, because I only need one
sqrt calculation. That improved the time, and the SSE2 version now almost
matches the serial version (it runs up to 1 second slower for the 10,000
data example, hehe). But of course, I would expect the vector version to
run faster, and I am still unsure how to achieve that.

Thanks

On Mon, Apr 7, 2008 at 10:23 AM, jlh wrote:
> Dario Bahena Tapia wrote:
> >
> > inline static double dist(int i,int j)
> > {
> >   double xd = C[i][X] - C[j][X];
> >   double yd = C[i][Y] - C[j][Y];
> >   return rint(sqrt(xd*xd + yd*yd));
> > }
> > [...]
> >
> > And in order to activate the SSE2 features, I am using the following
> > flags for gcc (my computer is a laptop):
> >
> > CFLAGS = -O -Wall -march=pentium-m -msse2
>
> These options do not make dist() use any SSE for me. Have you
> tried compiling with this?
>
> CFLAGS = -O2 -Wall -march=pentium-m -mfpmath=sse
>
> I think -msse2 is redundant if you say -march=pentium-m. I don't
> have an SSE2 machine to try this though.
>
> jlh
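
For reference, here is a minimal sketch of what a dist() built around
_mm_sqrt_sd might look like. It is not the code from this thread (that
version is not shown here). The name dist_sse2, the sample contents of C,
and the values of X and Y are assumptions made for illustration. Rounding
uses _mm_cvtsd_si32 in place of rint(), which should give the same result
under the default rounding mode as long as the distance fits in an int.
When SSE2 is not available, the sketch falls back to the scalar code
quoted above.

```c
#include <assert.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#else
#include <math.h>        /* scalar fallback only */
#endif

enum { X = 0, Y = 1 };   /* assumed column indices */

/* Hypothetical sample coordinates; the real array C is not shown. */
static double C[3][2] = { {0.0, 0.0}, {3.0, 4.0}, {6.0, 8.0} };

/* Rounded Euclidean distance between points i and j. */
static double dist_sse2(int i, int j)
{
#if defined(__SSE2__)
    __m128d a   = _mm_loadu_pd(C[i]);        /* {x_i, y_i} */
    __m128d b   = _mm_loadu_pd(C[j]);        /* {x_j, y_j} */
    __m128d d   = _mm_sub_pd(a, b);          /* {xd, yd} */
    __m128d sq  = _mm_mul_pd(d, d);          /* {xd*xd, yd*yd} */
    __m128d hi  = _mm_unpackhi_pd(sq, sq);   /* {yd*yd, yd*yd} */
    __m128d sum = _mm_add_sd(sq, hi);        /* low lane: xd*xd + yd*yd */
    __m128d r   = _mm_sqrt_sd(sum, sum);     /* one sqrt, low lane only */
    /* Convert to int using the current rounding mode (nearest by default). */
    return (double)_mm_cvtsd_si32(r);
#else
    double xd = C[i][X] - C[j][X];
    double yd = C[i][Y] - C[j][Y];
    return rint(sqrt(xd * xd + yd * yd));
#endif
}

/* Small self-check on the sample data (3-4-5 and 6-8-10 triangles). */
static void self_check(void)
{
    assert(dist_sse2(0, 1) == 5.0);
    assert(dist_sse2(1, 0) == 5.0);
    assert(dist_sse2(0, 2) == 10.0);
}
```

Note that this still handles a single pair of points per call, so only the
subtraction and multiplication use both lanes of the register. That may be
one reason a version like this does not beat the serial code. Processing
two pairs per call, so that the sqrt is vectorized as well, would be the
next thing to try.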