From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 12517 invoked by alias); 13 Dec 2004 20:54:21 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 12485 invoked by uid 48); 13 Dec 2004 20:54:16 -0000
Date: Mon, 13 Dec 2004 20:54:00 -0000
Message-ID: <20041213205416.12484.qmail@sourceware.org>
From: "bangerth at dealii dot org"
To: gcc-bugs@gcc.gnu.org
In-Reply-To: <20031105013127.12902.kbowers@lanl.gov>
References: <20031105013127.12902.kbowers@lanl.gov>
Reply-To: gcc-bugzilla@gcc.gnu.org
Subject: [Bug target/12902] Invalid assembly generated when using SSE / xmmintrin.h
X-Bugzilla-Reason: CC
X-SW-Source: 2004-12/txt/msg01860.txt.bz2
List-Id:

------- Additional Comments From bangerth at dealii dot org 2004-12-13 20:54 -------
I really have not much of an idea what I am doing here, but this is a shorter
testcase:
-------------------------
#include <xmmintrin.h>

typedef struct { int i; float f[3]; } a_t;
typedef struct { float f[8]; } b_t;
typedef union { int i[4]; float f[4]; __m128 v; } vector4;

void swizzle( const void *a0, const void *a1, const void *a2, const void *a3,
              vector4 *a, vector4 *b, vector4 *c, vector4 *d ) {
  __m128 t, u;
  a->v = _mm_loadl_pi(a->v, (__m64 *)a0);
  c->v = _mm_loadl_pi(c->v,((__m64 *)a0)+1);
  a->v = _mm_loadh_pi(a->v, (__m64 *)a1);
  c->v = _mm_loadh_pi(c->v,((__m64 *)a1)+1);
  t = _mm_loadl_pi(b->v, (__m64 *)a2);
  u = _mm_loadl_pi(d->v,((__m64 *)a2)+1);
  a->v = _mm_shuffle_ps(a->v,t,0);
  b->v = _mm_shuffle_ps(b->v,t,0);
  c->v = _mm_shuffle_ps(c->v,u,0);
  d->v = _mm_shuffle_ps(d->v,u,0);
}

int main () {
  a_t a[128];
  b_t b[128];
  vector4 ai, a0, a1, a2, b0, v0, v1, v2;
  __m128 *p0, *p1, *p2, *p3;
  int n = 1;
  for(;n;n--) {
    swizzle(a,a+1,a+2,a+3,&ai,&a0,&a1,&a2);
    p0 = (__m128 *)(b + ai.i[0]);
    p1 = (__m128 *)(b + ai.i[1]);
    p2 = (__m128 *)(b + ai.i[2]);
    p3 = (__m128 *)(b + ai.i[3]);
    swizzle(p0++,p1++,p2++,p3++,&b0,&v0,&v1,&v2);
    _mm_add_ps(_mm_add_ps(b0.v,_mm_mul_ps(a1.v,v0.v)),
               _mm_mul_ps(a2.v,_mm_add_ps(v1.v,_mm_mul_ps(a1.v,v2.v))));
  }
}
-----------------------------

It fails on 3.4 and mainline, but not with icc:

g/x> /home/bangerth/bin/gcc-3.4.*-pre/bin/g++ -O -msse2 -g x.cc ; ./a.out
Segmentation fault
g/x> /home/bangerth/bin/gcc-4.*-pre/bin/g++ -O -msse2 -g x.cc ; ./a.out
Segmentation fault
g/x> icc x.cc ; ./a.out

On mainline, I get this from a gdb session:

(gdb) r
Starting program: /home/bangerth/tmp/g/x/a.out

Program received signal SIGSEGV, Segmentation fault.
swizzle (a0=0xbfffe250, a1=0xbfffe260, a2=0xbfffe270, a3=0xbfffe280,
    a=0xbfffeac0, b=0xbfffeab0, c=0xbfffeaa0, d=0xbfffea90) at x.cc:23
(gdb) i reg
[...]
eip 0x80483ba 0x80483ba
(gdb) disass
0x080483ba <_Z7swizzlePKvS0_S0_S0_P7vector4S2_S2_S2_+38>: movaps 0x8(%esi),%xmm0

As I said, I have no idea what this program does, or whether it is still
well-formed after my attempts to reduce it. Maybe it helps anyway.

W.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12902