From: Brian Dessent <brian@dessent.net>
To: JP Fournier <jape41@gmail.com>
Cc: gcc-help@gcc.gnu.org
Subject: Re: is -O2 breaking sse2 alignment?
Date: Thu, 13 Mar 2008 00:28:00 -0000
Message-ID: <47D8751F.722F7196@dessent.net>
In-Reply-To: <47D8679C.4090606@gmail.com>
JP Fournier wrote:
> In the example below, compiling with -O2 results in incorrect output
> from the program. -O seems OK. Am I missing something alignment wise
> (or otherwise) or is -O2 breaking my alignment?
If it were an alignment problem, you'd most likely be getting a
segmentation fault. The __m128i type already carries the proper
alignment, so you don't need the __attribute__((aligned (16))) stuff.
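A quick way to check the alignment claim for yourself is sketched below. It assumes gcc on an x86 target with SSE2 enabled; the helper names m128i_alignment and local_m128i_is_aligned are made up for illustration.

```c
#include <emmintrin.h>  /* SSE2: __m128i, _mm_setzero_si128 */
#include <stdint.h>     /* uintptr_t */

/* Alignment the compiler assigns to the __m128i type itself
   (uses gcc's __alignof__ extension). */
static int m128i_alignment(void)
{
    return (int) __alignof__ (__m128i);
}

/* Returns 1 if a plain local __m128i, declared without any extra
   aligned attribute, sits on a 16-byte boundary. */
static int local_m128i_is_aligned(void)
{
    __m128i v = _mm_setzero_si128();
    return ((uintptr_t) &v % 16) == 0;
}
```

Both helpers only inspect alignment; they say nothing about whether a pointer obtained some other way (a cast from a char buffer, for example) is suitably aligned.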
> // array of 2 8 byte ints
> long int *a = _mm_malloc(16, 16);
> long int *b = _mm_malloc(16, 16);
> long int *c = _mm_malloc(16, 16);
>
> __m128i ai __attribute__ ((aligned (16)));
> __m128i bi __attribute__ ((aligned (16)));
> __m128i ci __attribute__ ((aligned (16)));
>
> a[0] = a[1] = 1;
> b[0] = b[1] = 1;
> c[0] = c[1] = 0;
>
> ai = _mm_load_si128( (__m128i *) (void*)a );
> bi = _mm_load_si128( (__m128i *) (void*)b );
>
> ci = _mm_add_epi8( ai, bi );
> _mm_store_si128( (__m128i *) (void*)c, ci );
> printf("c0=%ld c1=%ld\n", c[0], c[1] );
> }
You're violating the C aliasing rules. You can't store through a cast
pointer like that. You also don't have to do the load/store yourself; the
compiler knows what you want when you use a union instead:
union { __m128i v; long l[2]; } a, b, c;
a.l[0] = a.l[1] = 1;
b.l[0] = b.l[1] = 1;
c.v = _mm_add_epi8 (a.v, b.v);
printf("c0=%ld c1=%ld\n", c.l[0], c.l[1]);
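For reference, here is the union version wrapped up as a self-contained sketch. It assumes gcc on an x86 target with SSE2; the function name add_bytes_lane is hypothetical, and long long is used in place of long only so that each lane is 8 bytes regardless of platform.

```c
#include <emmintrin.h>  /* SSE2: __m128i, _mm_add_epi8 */

/* Overlay one 128-bit vector with two 64-bit integers. Reading a
   member other than the one last written is type punning, which gcc
   documents as supported for unions. */
typedef union {
    __m128i   v;
    long long l[2];
} vec128;

/* Fill both lanes of each operand with x and y, add them byte-wise,
   and return the requested 64-bit lane (0 or 1) of the result. */
static long long add_bytes_lane(long long x, long long y, int lane)
{
    vec128 a, b, c;

    a.l[0] = a.l[1] = x;
    b.l[0] = b.l[1] = y;
    c.v = _mm_add_epi8(a.v, b.v);   /* sixteen independent 8-bit adds */
    return c.l[lane];
}
```

With x = y = 1 each lane comes out as 2. Note that _mm_add_epi8 adds bytes independently, so carries do not propagate: 0xFF + 1 yields 0 in that lane, not 0x100. If a 64-bit add is what you're after, _mm_add_epi64 is the intrinsic to look at.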
There's an even more natural way to do this, though: use gcc's built-in
vector extensions without any of the Intel mmintrin.h stuff. The
resulting code will vectorize to altivec, sse2, spu, or whatever the
machine supports; it's not hardware specific:
typedef int v4si __attribute__ ((vector_size (16)));
v4si a = { 1, 2, 3, 4 }, b = { 5, 6, 7, 8 }, c;
c = a + b;
You can use all the normal C operators like + and * as if the operands
were scalars, but they will be compiled using the corresponding SIMD
instructions. See
<http://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html> for more. If
you want access to the individual parts you can again use the union,
e.g.
union { v4si v; int i[4]; } u;
u.v = a + b;
printf ("%d,%d,%d,%d\n", u.i[0], u.i[1], u.i[2], u.i[3]);
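Put together, the vector-extension approach might look like the following sketch. It assumes a compiler that supports gcc's vector_size attribute and a 4-byte int; the function name sum_lane is invented for the example.

```c
/* Four 32-bit ints packed into one 16-byte vector (gcc extension). */
typedef int v4si __attribute__ ((vector_size (16)));

/* Add two fixed vectors element-wise and return one element
   (lane 0..3) of the result, read back through a union. */
static int sum_lane(int lane)
{
    v4si a = { 1, 2, 3, 4 }, b = { 5, 6, 7, 8 };
    union { v4si v; int i[4]; } u;

    u.v = a + b;        /* plain + operates on all four lanes at once */
    return u.i[lane];
}
```

No intrinsics header is needed here, so the same source should build on any target gcc supports; on machines without a matching SIMD unit the compiler falls back to ordinary scalar code.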
Brian