From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 15443 invoked by alias); 9 Oct 2009 00:14:26 -0000 Received: (qmail 15434 invoked by uid 22791); 9 Oct 2009 00:14:25 -0000 X-SWARE-Spam-Status: No, hits=-1.9 required=5.0 tests=AWL,BAYES_00,SPF_FAIL X-Spam-Check-By: sourceware.org Received: from mx20.gnu.org (HELO mx20.gnu.org) (199.232.41.8) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Fri, 09 Oct 2009 00:14:20 +0000 Received: from [65.74.133.4] (helo=mail.codesourcery.com) by mx20.gnu.org with esmtp (Exim 4.60) (envelope-from ) id 1Mw37i-0004mJ-3b for gcc-patches@gcc.gnu.org; Thu, 08 Oct 2009 20:14:18 -0400 Received: (qmail 32218 invoked from network); 9 Oct 2009 00:14:16 -0000 Received: from unknown (HELO digraph.polyomino.org.uk) (joseph@127.0.0.2) by mail.codesourcery.com with ESMTPA; 9 Oct 2009 00:14:16 -0000 Received: from jsm28 (helo=localhost) by digraph.polyomino.org.uk with local-esmtp (Exim 4.69) (envelope-from ) id 1Mw37e-0005Eu-P5 for gcc-patches@gcc.gnu.org; Fri, 09 Oct 2009 00:14:14 +0000 Date: Fri, 09 Oct 2009 00:19:00 -0000 From: "Joseph S. Myers" To: gcc-patches@gcc.gnu.org Subject: Fix Thumb-2 NEON ICE Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-detected-operating-system: by mx20.gnu.org: GNU/Linux 2.6, seldom 2.4 (older, 4) Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org X-SW-Source: 2009-10/txt/msg00567.txt.bz2 This patch fixes an ICE for Thumb-2 NEON on the included testcase. output_move_neon has operands[0] (mem:V16QI (plus:SI (reg:SI 3 r3) (const_int -256))) and operands[1] (reg:V16QI 95 d16) and calls adjust_address (mem, SImode, 8 * i). This ends up validating the address for SImode, but the offset of -256 is not valid for SImode for Thumb-2. The new address is only used with vldr/vstr instructions; there seems to be no need for it to be valid for SImode, so this patch makes it use DImode instead, for which all valid NEON offsets are accepted. Tested with no regressions with cross to arm-none-eabi. OK to commit? 2009-10-08 Joseph Myers * config/arm/arm.c (output_move_neon): Use DImode in call to adjust_address. testsuite: 2009-10-08 Joseph Myers * gcc.target/arm/neon-thumb2-move.c: New test. Index: gcc/config/arm/arm.c =================================================================== --- gcc/config/arm/arm.c (revision 152576) +++ gcc/config/arm/arm.c (working copy) @@ -12269,7 +12269,7 @@ output_move_neon (rtx *operands) { /* We're only using DImode here because it's a convenient size. */ ops[0] = gen_rtx_REG (DImode, REGNO (reg) + 2 * i); - ops[1] = adjust_address (mem, SImode, 8 * i); + ops[1] = adjust_address (mem, DImode, 8 * i); if (reg_overlap_mentioned_p (ops[0], mem)) { gcc_assert (overlap == -1); Index: gcc/testsuite/gcc.target/arm/neon-thumb2-move.c =================================================================== --- gcc/testsuite/gcc.target/arm/neon-thumb2-move.c (revision 0) +++ gcc/testsuite/gcc.target/arm/neon-thumb2-move.c (revision 0) @@ -0,0 +1,98 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-options "-O2 -mthumb -march=armv7-a -mfloat-abi=softfp -mfpu=neon" } */ + +#include +#include + +void * +memset (DST, C, LENGTH) + void *DST; + int C; + size_t LENGTH; +{ + void* DST0 = DST; + unsigned char C_BYTE = C; + + + if (__builtin_expect(LENGTH < 4, 1)) { + size_t i = 0; + while (i < LENGTH) { + ((char*)DST)[i] = C_BYTE; + i++; + } + return DST; + } + + const char* DST_end = (char*)DST + LENGTH; + + + while ((uintptr_t)DST % 4 != 0) { + *(char*) (DST++) = C_BYTE; + } + + + uint32_t C_SHORTWORD = (uint32_t)(unsigned char)(C_BYTE) * 0x01010101; + + + if (__builtin_expect(DST_end - (char*)DST >= 16, 0)) { + while ((uintptr_t)DST % 16 != 0) { + *((uint32_t*)((char*)(DST) + (0))) = C_SHORTWORD; + DST += 4; + } + + + uint8x16_t C_WORD = vdupq_n_u8(C_BYTE); + + + + + + size_t i = 0; + LENGTH = DST_end - (char*)DST; + while (i + 16 * 16 <= LENGTH) { + *((uint8x16_t*)((char*)(DST) + (i))) = C_WORD; + *((uint8x16_t*)((char*)(DST) + (i + 16 * 1))) = C_WORD; + *((uint8x16_t*)((char*)(DST) + (i + 16 * 2))) = C_WORD; + *((uint8x16_t*)((char*)(DST) + (i + 16 * 3))) = C_WORD; + *((uint8x16_t*)((char*)(DST) + (i + 16 * 4))) = C_WORD; + *((uint8x16_t*)((char*)(DST) + (i + 16 * 5))) = C_WORD; + *((uint8x16_t*)((char*)(DST) + (i + 16 * 6))) = C_WORD; + *((uint8x16_t*)((char*)(DST) + (i + 16 * 7))) = C_WORD; + *((uint8x16_t*)((char*)(DST) + (i + 16 * 8))) = C_WORD; + *((uint8x16_t*)((char*)(DST) + (i + 16 * 9))) = C_WORD; + *((uint8x16_t*)((char*)(DST) + (i + 16 * 10))) = C_WORD; + *((uint8x16_t*)((char*)(DST) + (i + 16 * 11))) = C_WORD; + *((uint8x16_t*)((char*)(DST) + (i + 16 * 12))) = C_WORD; + *((uint8x16_t*)((char*)(DST) + (i + 16 * 13))) = C_WORD; + *((uint8x16_t*)((char*)(DST) + (i + 16 * 14))) = C_WORD; + *((uint8x16_t*)((char*)(DST) + (i + 16 * 15))) = C_WORD; + i += 16 * 16; + } + while (i + 16 * 4 <= LENGTH) { + *((uint8x16_t*)((char*)(DST) + (i))) = C_WORD; + *((uint8x16_t*)((char*)(DST) + (i + 16 * 1))) = C_WORD; + *((uint8x16_t*)((char*)(DST) + (i + 16 * 2))) = C_WORD; + *((uint8x16_t*)((char*)(DST) + (i + 16 * 3))) = C_WORD; + i += 16 * 4; + } + while (i + 16 <= LENGTH) { + *((uint8x16_t*)((char*)(DST) + (i))) = C_WORD; + i += 16; + } + DST += i; + } + + while (4 <= DST_end - (char*)DST) { + *((uint32_t*)((char*)(DST) + (0))) = C_SHORTWORD; + DST += 4; + } + + + while ((char*)DST < DST_end) { + *((char*)DST) = C_BYTE; + DST++; + } + + return DST0; +} -- Joseph S. Myers joseph@codesourcery.com