From mboxrd@z Thu Jan 1 00:00:00 1970
From: Richard Sandiford
To: Richard Guenther
Mail-Followup-To: Richard Guenther, Chung-Lin Tang, gcc-patches, richard.sandiford@linaro.org
Cc: Chung-Lin Tang, gcc-patches
Subject: Re: [patch] Fix PR48183, NEON ICE in emit-rtl.c:immed_double_const() under -g
References: <4D85DEA1.6070606@codesourcery.com>
Date: Tue, 29 Mar 2011 12:15:00 -0000
In-Reply-To: (Richard Guenther's message of "Tue, 29 Mar 2011 12:56:17 +0200")
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
X-SW-Source: 2011-03/txt/msg01984.txt.bz2

--=-=-=
Content-Type: text/plain; charset=utf-8

Richard Guenther writes:
> On Thu, Mar 24, 2011 at 11:57 AM, Richard Sandiford wrote:
>> Chung-Lin Tang writes:
>>> PR48183 is a case where ARM NEON intrinsics, under -O -g, produce debug
>>> insns that try to expand OImode (32-byte integer) zero constants, much
>>> too large to represent as two HOST_WIDE_INTs; as the internals manual
>>> indicates, such large constants are not supported in general, and this
>>> ICEs on the GET_MODE_BITSIZE(mode) == 2*HOST_BITS_PER_WIDE_INT assertion.
>>>
>>> This patch allows the cases where the large integer constant is still
>>> representable using a single CONST_INT, such as zero (0). Bootstrapped
>>> and tested on i686 and x86_64, cross-tested on ARM, all without
>>> regressions. Okay for trunk?
>>>
>>> Thanks,
>>> Chung-Lin
>>>
>>> 2011-03-20  Chung-Lin Tang
>>>
>>>       * emit-rtl.c (immed_double_const): Allow wider than
>>>       2*HOST_BITS_PER_WIDE_INT mode constants when they are
>>>       representable as a single const_int RTX.
>>
>> I realise this might be seen as a good expedient fix, but it makes
>> me a bit uneasy.  Not a very constructive rationale, sorry.
>>
>> For this particular case, the problem is that vst2q_s32 and the
>> like initialise a union directly:
>>
>>  union { int32x4x2_t __i; __builtin_neon_oi __o; } __bu = { __b };
>>
>> and this gets translated into a zeroing of the whole union followed
>> by an assignment to __i:
>>
>>  __bu = {};
>>  __bu.__i = __b;
>
> Btw, this looks like a missed optimization in gimplification.  Worth
> a bug report (or even a fix).  Might be a target bug as well, depending
> on what __builtin_neon_oi looks like.  Do you have a complete testcase
> that reproduces the above with a cross?

Yeah, build cc1 for arm-linux-gnueabi and compile the attached
testcase (from Chung-Lin) using:

  -O2 -g -mfpu=neon -mfloat-abi=softfp

Richard

--=-=-=
Content-Disposition: inline; filename=pr48183.c

/* { dg-do compile } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-O -g" } */
/* { dg-add-options arm_neon } */

#include <arm_neon.h>

void move_16bit_to_32bit (int32_t *dst, const short *src, unsigned n)
{
  unsigned i;
  int16x4x2_t input;
  int32x4x2_t mid;
  int32x4x2_t output;

  for (i = 0; i < n/2; i += 8) {
    input = vld2_s16(src + i);
    mid.val[0] = vmovl_s16(input.val[0]);
    mid.val[1] = vmovl_s16(input.val[1]);
    output.val[0] = vshlq_n_s32(mid.val[0], 8);
    output.val[1] = vshlq_n_s32(mid.val[1], 8);
    vst2q_s32((int32_t *)dst + i, output);
  }
}
--=-=-=--