From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 19989 invoked by alias); 14 Dec 2004 10:54:00 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 19540 invoked by uid 48); 14 Dec 2004 10:52:38 -0000
Date: Tue, 14 Dec 2004 10:54:00 -0000
Message-ID: <20041214105238.19539.qmail@sourceware.org>
From: "uros at kss-loka dot si"
To: gcc-bugs@gcc.gnu.org
In-Reply-To: <20031105013127.12902.kbowers@lanl.gov>
References: <20031105013127.12902.kbowers@lanl.gov>
Reply-To: gcc-bugzilla@gcc.gnu.org
Subject: [Bug target/12902] Invalid assembly generated when using SSE / xmmintrin.h
X-Bugzilla-Reason: CC
X-SW-Source: 2004-12/txt/msg01967.txt.bz2
List-Id:

------- Additional Comments From uros at kss-loka dot si  2004-12-14 10:52 -------
The problem here is in the combiner, which, in combination with the reload
pass, produces an incorrect pattern.

The line that segfaults is:

	c->v = _mm_loadl_pi(c->v,((__m64 *)a0)+1);

This line is represented by the following RTL sequence (pr12902.c.00.expand):

(insn 26 24 27 1 (parallel [
            (set (reg:SI 80)
                (plus:SI (reg:SI 70 [ a0.26 ])
                    (const_int 8 [0x8])))
            (clobber (reg:CC 17 flags))
        ]) -1 (nil)
    (nil))

(insn 27 26 28 1 (set (reg:SI 81)
        (reg:SI 80)) -1 (nil)
    (nil))

(insn 28 27 30 1 (set (reg:V4SF 60 [ D.3679 ])
        (vec_merge:V4SF (mem/s:V4SF (reg/v/f:SI 77 [ c ]) [0 .v+0 S16 A128])
            (mem:V4SF (reg:SI 81) [0 S16 A8])
            (const_int 3 [0x3]))) -1 (nil)
    (nil))

(insn 30 28 32 1 (set (mem/s:V4SF (reg/v/f:SI 77 [ c ]) [0 .v+0 S16 A128])
        (reg:V4SF 60 [ D.3679 ])) -1 (nil)
    (nil))

This whole sequence is combined into one RTL insn (pr12902.c.17.combine) that
satisfies the "sse_movlps" pattern constraints:

(insn 30 28 35 0 (set (mem/s:V4SF (reg/v/f:SI 77 [ c ]) [0 .v+0 S16 A128])
        (vec_merge:V4SF (mem/s:V4SF (reg/v/f:SI 77 [ c ]) [0 .v+0 S16 A128])
            (mem:V4SF (plus:SI (reg/v/f:SI 71 [ a0 ])
                    (const_int 8 [0x8])) [0 S16 A8])
            (const_int 3 [0x3]))) 541 {sse_movlps} (insn_list:REG_DEP_TRUE 12 (nil))
    (expr_list:REG_DEAD (reg/v/f:SI 71 [ a0 ])
        (nil)))

Following this, reload generates what it thinks is the best reg/mem
combination to satisfy the register constraints of the "sse_movlps" pattern
(pr12902.c.24.postreload):

(insn 80 28 30 0 (set (reg:V4SF 21 xmm0)
        (mem:V4SF (plus:SI (reg/v/f:SI 4 si [orig:71 a0 ] [71])
                (const_int 8 [0x8])) [0 S16 A8])) 509 {movv4sf_internal} (nil)
    (nil))

(insn:HI 30 80 35 0 (set (mem/s:V4SF (reg/v/f:SI 1 dx [orig:77 c ] [77]) [0 .v+0 S16 A128])
        (vec_merge:V4SF (mem/s:V4SF (reg/v/f:SI 1 dx [orig:77 c ] [77]) [0 .v+0 S16 A128])
            (reg:V4SF 21 xmm0)
            (const_int 3 [0x3]))) 541 {sse_movlps} (insn_list:REG_DEP_TRUE 12 (nil))
    (nil))

Unfortunately, insn 80 will crash, because it results in an unaligned 16-byte
load (movaps needs a 16-byte-aligned address):

	...
	movaps	8(%esi), %xmm0    <- crash here
	movlps	%xmm0, (%edx)
	...

Uros.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12902
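
As an aside on why the movaps emitted for insn 80 faults while an 8-byte movlps load from the same address would not: movaps requires a 16-byte-aligned memory operand, whereas movlps reads only 8 bytes and has no alignment requirement. The following is a minimal portable C sketch of that distinction, not GCC code. The helper names is_aligned16 and load_low_half are hypothetical, and the sketch assumes for illustration that a0 itself is 16-byte aligned, which the report does not state.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if p is 16-byte aligned, which is what
   movaps requires of a memory operand. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical helper: portable model of an 8-byte load into the low half
   of a 4-float vector. Only two floats are read from src, the upper two
   elements of dst are left untouched, and src needs no 16-byte alignment. */
static void load_low_half(float dst[4], const void *src)
{
    memcpy(dst, src, 2 * sizeof(float));
}
```

If a0 is 16-byte aligned, is_aligned16((const char *)a0 + 8) is false, so a full 16-byte aligned load from a0 + 8 is invalid, while load_low_half applied to the same address reads just the 8 bytes that the original intrinsic asked for.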