From mboxrd@z Thu Jan 1 00:00:00 1970
From: Brian Budge
To: Jie Zhang
Cc: Benjamin Redelings I, gcc-help@gcc.gnu.org
Date: Thu, 31 Dec 2009 20:28:00 -0000
Subject: Re: Correct way to make a 16-byte aligned double* for SSE vectorization?
Message-ID: <5b7094580912310841g6d8c957fueb7c451cd0d75309@mail.gmail.com>
In-Reply-To: <4B3C0E6A.2070706@analog.com>
References: <4B3BAA3E.80802@ncsu.edu> <4B3C0E6A.2070706@analog.com>

The reason it won't work is that you're saying the pointer itself needs
to be 16 (or 8) byte aligned. What you need is for the address that the
pointer points to to be aligned.

On the stack:

    __attribute__((aligned(16))) real myArray[32];

On the heap (*nix):

    real *myArray;
    posix_memalign((void**)&myArray, 16, 32 * sizeof(real));

or, for more portability, you could use the SSE intrinsic _mm_malloc
(and release the memory with _mm_free).

To know why the one version you posted works, we'd need to see the
calling code of f.
In general, it shouldn't work if malloc or new are used to allocate the
memory passed in, but it might be that the memory is allocated on the
stack?

  Brian

On Wed, Dec 30, 2009 at 6:37 PM, Jie Zhang wrote:
> Hi,
>
> On 12/31/2009 03:30 AM, Benjamin Redelings I wrote:
>>
>> Hi,
>>
>> I am trying to figure out how to make a double* that is 16-byte aligned
>> in the way that SSE instructions want. Hopefully this would allow GCC to
>> auto-vectorize loops in a better way. The problem that I am having is
>> that I want a pointer to an aligned double, not an aligned pointer to a
>> double.
>>
>> I am compiling with these options:
>> % gcc -c test.C -O3 -ftree-vectorizer-verbose=3 -ffast-math
>>
>> According to the output of the vectorizer, none of the three ways
>> (below) of declaring an aligned pointer actually work. They are treated
>> as unaligned accesses, so presumably the location of the pointer itself
>> is being aligned, but it does not point to an aligned location. In
>> contrast, if I define an aligned double, and then define a pointer to
>> it, this works. Is this recommended?
>>
> Below is just taken from the GCC Manual:
> [quote]
> As another example,
>
>     char *__attribute__((aligned(8))) *f;
>
> specifies the type “pointer to 8-byte-aligned pointer to char”. Note again
> that this does not work with most attributes; for example, the usage of
> `aligned' and `noreturn' attributes given above is not yet supported.
> [/quote]
>
> If it had been supported, you could use
>
>> //typedef const real __attribute__((aligned(16))) *SSE_PTR;
>
> But since it is not yet supported now, you have to use
>
>> typedef double real;
>>
>> // these two lines work (together)
>> typedef real aligned_real __attribute__((aligned(16)));
>> typedef const aligned_real* SSE_PTR;
>>
>
> Jie
>
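
For anyone who wants to try the stack and heap suggestions at the top of
this message, here is a minimal sketch in C. It assumes a GCC-compatible
compiler and a POSIX system. The helper names is_aligned16,
alloc_aligned_reals and demo are made up for illustration and do not
appear anywhere in the thread. It only checks the addresses; it does not
show what the vectorizer does with them.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* expose posix_memalign in <stdlib.h> */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef double real;

/* Hypothetical helper: nonzero if the address p is a multiple of 16. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16) == 0;
}

/* Hypothetical helper: heap block of n reals whose start is 16-byte
 * aligned, or NULL on failure. Release it with free(). */
real *alloc_aligned_reals(size_t n)
{
    void *block = NULL;
    /* posix_memalign returns 0 on success and stores the block in 'block'. */
    if (posix_memalign(&block, 16, n * sizeof(real)) != 0)
        return NULL;
    return block;
}

/* Hypothetical demo: returns 1 if both the stack array and the heap block
 * start on a 16-byte boundary, -1 if the heap allocation failed. */
int demo(void)
{
    /* The attribute aligns the array object, and so its first element. */
    __attribute__((aligned(16))) real stackArray[32];
    assert(is_aligned16(stackArray));

    real *heapArray = alloc_aligned_reals(32);
    if (heapArray == NULL)
        return -1;

    int ok = is_aligned16(heapArray);
    free(heapArray);
    return ok;
}
```

Calling demo() from main and checking that it returns 1 is enough to
confirm that the data is aligned, rather than the pointer variable that
refers to it.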