* [patch] Fix PR48183, NEON ICE in emit-rtl.c:immed_double_const() under -g
@ 2011-03-20 11:02 Chung-Lin Tang
2011-03-20 11:11 ` Richard Guenther
2011-03-24 10:57 ` Richard Sandiford
0 siblings, 2 replies; 10+ messages in thread
From: Chung-Lin Tang @ 2011-03-20 11:02 UTC (permalink / raw)
To: gcc-patches
[-- Attachment #1: Type: text/plain, Size: 819 bytes --]
Hi,
PR48183 is a case where ARM NEON instrinsics, under -O -g, produce debug
insns that tries to expand OImode (32-byte integer) zero constants, much
too large to represent as two HOST_WIDE_INTs; as the internals manual
indicates, such large constants are not supported in general, and ICEs
on the GET_MODE_BITSIZE(mode) == 2*HOST_BITS_PER_WIDE_INT assertion.
This patch allows the cases where the large integer constant is still
representable using a single CONST_INT, such as zero(0). Bootstrapped
and tested on i686 and x86_64, cross-tested on ARM, all without
regressions. Okay for trunk?
Thanks,
Chung-Lin
2011-03-20 Chung-Lin Tang <cltang@codesourcery.com>
* emit-rtl.c (immed_double_const): Allow wider than
2*HOST_BITS_PER_WIDE_INT mode constants when they are
representable as a single const_int RTX.
[-- Attachment #2: pr48183.diff --]
[-- Type: text/plain, Size: 1558 bytes --]
Index: emit-rtl.c
===================================================================
--- emit-rtl.c (revision 171074)
+++ emit-rtl.c (working copy)
@@ -547,6 +547,13 @@
if (GET_MODE_BITSIZE (mode) <= HOST_BITS_PER_WIDE_INT)
return gen_int_mode (i0, mode);
+ /* For modes larger than 2 * HOST_BITS_PER_WIDE_INT, the integer may
+ still be representable if it fits in one word. For other cases,
+ assert fail below. */
+ if (GET_MODE_BITSIZE (mode) > 2 * HOST_BITS_PER_WIDE_INT
+ && ((i1 == 0 && i0 >= 0) || (i1 == ~0 && i0 < 0)))
+ return GEN_INT (i0);
+
gcc_assert (GET_MODE_BITSIZE (mode) == 2 * HOST_BITS_PER_WIDE_INT);
}
Index: testsuite/gcc.target/arm/pr48183.c
===================================================================
--- testsuite/gcc.target/arm/pr48183.c (revision 0)
+++ testsuite/gcc.target/arm/pr48183.c (revision 0)
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-O -g" } */
+/* { dg-add-options arm_neon } */
+
+#include <arm_neon.h>
+
+void move_16bit_to_32bit (int32_t *dst, const short *src, unsigned n)
+{
+ unsigned i;
+ int16x4x2_t input;
+ int32x4x2_t mid;
+ int32x4x2_t output;
+
+ for (i = 0; i < n/2; i += 8) {
+ input = vld2_s16(src + i);
+ mid.val[0] = vmovl_s16(input.val[0]);
+ mid.val[1] = vmovl_s16(input.val[1]);
+ output.val[0] = vshlq_n_s32(mid.val[0], 8);
+ output.val[1] = vshlq_n_s32(mid.val[1], 8);
+ vst2q_s32((int32_t *)dst + i, output);
+ }
+}
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] Fix PR48183, NEON ICE in emit-rtl.c:immed_double_const() under -g
2011-03-20 11:02 [patch] Fix PR48183, NEON ICE in emit-rtl.c:immed_double_const() under -g Chung-Lin Tang
@ 2011-03-20 11:11 ` Richard Guenther
2011-03-21 12:47 ` Chung-Lin Tang
2011-03-24 10:57 ` Richard Sandiford
1 sibling, 1 reply; 10+ messages in thread
From: Richard Guenther @ 2011-03-20 11:11 UTC (permalink / raw)
To: Chung-Lin Tang; +Cc: gcc-patches
On Sun, Mar 20, 2011 at 12:01 PM, Chung-Lin Tang
<cltang@codesourcery.com> wrote:
> Hi,
> PR48183 is a case where ARM NEON instrinsics, under -O -g, produce debug
> insns that tries to expand OImode (32-byte integer) zero constants, much
> too large to represent as two HOST_WIDE_INTs; as the internals manual
> indicates, such large constants are not supported in general, and ICEs
> on the GET_MODE_BITSIZE(mode) == 2*HOST_BITS_PER_WIDE_INT assertion.
>
> This patch allows the cases where the large integer constant is still
> representable using a single CONST_INT, such as zero(0). Bootstrapped
> and tested on i686 and x86_64, cross-tested on ARM, all without
> regressions. Okay for trunk?
Hum, if you make it fit into a CONST_INT why not make it fit into a CONST_DOUBLE
where it should _always_ fit because of the limits of the arguments given to
immed_double_const. Thus, why not simply remove the assert - also a question
for other reviewers? Or rather, why does immed_double_const have a mode
argument and does not require the caller to properly truncate?
Thanks,
Richard.
> Thanks,
> Chung-Lin
>
> 2011-03-20 Chung-Lin Tang <cltang@codesourcery.com>
>
> * emit-rtl.c (immed_double_const): Allow wider than
> 2*HOST_BITS_PER_WIDE_INT mode constants when they are
> representable as a single const_int RTX.
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] Fix PR48183, NEON ICE in emit-rtl.c:immed_double_const() under -g
2011-03-20 11:11 ` Richard Guenther
@ 2011-03-21 12:47 ` Chung-Lin Tang
0 siblings, 0 replies; 10+ messages in thread
From: Chung-Lin Tang @ 2011-03-21 12:47 UTC (permalink / raw)
To: Richard Guenther; +Cc: gcc-patches
On 2011/3/20 08:11 PM, Richard Guenther wrote:
> On Sun, Mar 20, 2011 at 12:01 PM, Chung-Lin Tang
> <cltang@codesourcery.com> wrote:
>> Hi,
>> PR48183 is a case where ARM NEON instrinsics, under -O -g, produce debug
>> insns that tries to expand OImode (32-byte integer) zero constants, much
>> too large to represent as two HOST_WIDE_INTs; as the internals manual
>> indicates, such large constants are not supported in general, and ICEs
>> on the GET_MODE_BITSIZE(mode) == 2*HOST_BITS_PER_WIDE_INT assertion.
>>
>> This patch allows the cases where the large integer constant is still
>> representable using a single CONST_INT, such as zero(0). Bootstrapped
>> and tested on i686 and x86_64, cross-tested on ARM, all without
>> regressions. Okay for trunk?
>
> Hum, if you make it fit into a CONST_INT why not make it fit into a CONST_DOUBLE
> where it should _always_ fit because of the limits of the arguments given to
> immed_double_const. Thus, why not simply remove the assert - also a question
Well you're right, simply removing the assertion might work as well; my
patch sort of tried to solve the specific case at hand conservatively...
> for other reviewers? Or rather, why does immed_double_const have a mode
> argument and does not require the caller to properly truncate?
I guess the existence of such large integer modes are probably
relatively new, and usually not directly used this way?
Thanks,
Chung-Lin
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] Fix PR48183, NEON ICE in emit-rtl.c:immed_double_const() under -g
2011-03-20 11:02 [patch] Fix PR48183, NEON ICE in emit-rtl.c:immed_double_const() under -g Chung-Lin Tang
2011-03-20 11:11 ` Richard Guenther
@ 2011-03-24 10:57 ` Richard Sandiford
2011-03-24 13:49 ` Julian Brown
2011-03-29 11:12 ` Richard Guenther
1 sibling, 2 replies; 10+ messages in thread
From: Richard Sandiford @ 2011-03-24 10:57 UTC (permalink / raw)
To: Chung-Lin Tang; +Cc: gcc-patches
Chung-Lin Tang <cltang@codesourcery.com> writes:
> PR48183 is a case where ARM NEON instrinsics, under -O -g, produce debug
> insns that tries to expand OImode (32-byte integer) zero constants, much
> too large to represent as two HOST_WIDE_INTs; as the internals manual
> indicates, such large constants are not supported in general, and ICEs
> on the GET_MODE_BITSIZE(mode) == 2*HOST_BITS_PER_WIDE_INT assertion.
>
> This patch allows the cases where the large integer constant is still
> representable using a single CONST_INT, such as zero(0). Bootstrapped
> and tested on i686 and x86_64, cross-tested on ARM, all without
> regressions. Okay for trunk?
>
> Thanks,
> Chung-Lin
>
> 2011-03-20 Chung-Lin Tang <cltang@codesourcery.com>
>
> * emit-rtl.c (immed_double_const): Allow wider than
> 2*HOST_BITS_PER_WIDE_INT mode constants when they are
> representable as a single const_int RTX.
I realise this might be seen as a good expedient fix, but it makes
me a bit uneasy. Not a very constructive rationale, sorry.
For this particular case, the problem is that vst2q_s32 and the
like initialise a union directly:
union { int32x4x2_t __i; __builtin_neon_oi __o; } __bu = { __b; };
and this gets translated into a zeroing of the whole union followed
by an assignment to __i:
__bu = {};
__bu.__i = __b;
We later optimise away the first assignment, but it still exists
in the debug info.
Another expedient fix might be to replace these initialisations with:
union { int32x4x2_t __i; __builtin_neon_oi __o; } __bu;
__bu.__i = __b;
so that we never get a zeroing statement.
I realise both "fixes" are papering over the real problem. What we really
need is arbitrary-length constant integers, like we already have for vectors.
But that's going to be a much bigger patch. It just seems to me that,
if we're going for a work-around, the arm_neon.h change is neutral,
while changing immed_double_const feels more risky.
Richard
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] Fix PR48183, NEON ICE in emit-rtl.c:immed_double_const() under -g
2011-03-24 10:57 ` Richard Sandiford
@ 2011-03-24 13:49 ` Julian Brown
2011-03-29 10:56 ` Richard Sandiford
2011-03-29 11:12 ` Richard Guenther
1 sibling, 1 reply; 10+ messages in thread
From: Julian Brown @ 2011-03-24 13:49 UTC (permalink / raw)
To: Richard Sandiford; +Cc: Chung-Lin Tang, gcc-patches
[-- Attachment #1: Type: text/plain, Size: 1489 bytes --]
On Thu, 24 Mar 2011 10:57:06 +0000
Richard Sandiford <richard.sandiford@linaro.org> wrote:
> Chung-Lin Tang <cltang@codesourcery.com> writes:
> > PR48183 is a case where ARM NEON instrinsics, under -O -g, produce
> > debug insns that tries to expand OImode (32-byte integer) zero
> > constants, much too large to represent as two HOST_WIDE_INTs; as
> > the internals manual indicates, such large constants are not
> > supported in general, and ICEs on the GET_MODE_BITSIZE(mode) ==
> > 2*HOST_BITS_PER_WIDE_INT assertion.
> >
> > This patch allows the cases where the large integer constant is
> > still representable using a single CONST_INT, such as zero(0).
> > Bootstrapped and tested on i686 and x86_64, cross-tested on ARM,
> > all without regressions. Okay for trunk?
> >
> > Thanks,
> > Chung-Lin
> >
> > 2011-03-20 Chung-Lin Tang <cltang@codesourcery.com>
> >
> > * emit-rtl.c (immed_double_const): Allow wider than
> > 2*HOST_BITS_PER_WIDE_INT mode constants when they are
> > representable as a single const_int RTX.
>
> I realise this might be seen as a good expedient fix, but it makes
> me a bit uneasy. Not a very constructive rationale, sorry.
FWIW I also had a "fix" for this issue, which is equivalent to
Chung-Lin's patch apart from only allowing constant-zero (attached).
That's not really a vote from me for this approach, but maybe limiting
the extent to which we pretend to support wide-integer constants like
this is sensible, if we do go that way.
Julian
[-- Attachment #2: neon-wide-zero-expansion-2.diff --]
[-- Type: text/x-patch, Size: 677 bytes --]
--- gcc/expr.c (revision 314639)
+++ gcc/expr.c (working copy)
@@ -8458,6 +8458,18 @@ expand_expr_real_1 (tree exp, rtx target
return decl_rtl;
case INTEGER_CST:
+ if (GET_MODE_BITSIZE (mode) > 2 * HOST_BITS_PER_WIDE_INT)
+ {
+ /* FIXME: We can't generally represent wide integer constants,
+ but GCC sometimes tries to initialise wide integer values (such
+ as used by the ARM NEON support) with zero. Handle that as a
+ special case here. */
+ if (initializer_zerop (exp))
+ return CONST0_RTX (mode);
+
+ gcc_unreachable ();
+ }
+
temp = immed_double_const (TREE_INT_CST_LOW (exp),
TREE_INT_CST_HIGH (exp), mode);
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] Fix PR48183, NEON ICE in emit-rtl.c:immed_double_const() under -g
2011-03-24 13:49 ` Julian Brown
@ 2011-03-29 10:56 ` Richard Sandiford
2011-03-30 11:57 ` Julian Brown
0 siblings, 1 reply; 10+ messages in thread
From: Richard Sandiford @ 2011-03-29 10:56 UTC (permalink / raw)
To: Julian Brown; +Cc: Chung-Lin Tang, gcc-patches
Julian Brown <julian@codesourcery.com> writes:
> On Thu, 24 Mar 2011 10:57:06 +0000
> Richard Sandiford <richard.sandiford@linaro.org> wrote:
>
>> Chung-Lin Tang <cltang@codesourcery.com> writes:
>> > PR48183 is a case where ARM NEON instrinsics, under -O -g, produce
>> > debug insns that tries to expand OImode (32-byte integer) zero
>> > constants, much too large to represent as two HOST_WIDE_INTs; as
>> > the internals manual indicates, such large constants are not
>> > supported in general, and ICEs on the GET_MODE_BITSIZE(mode) ==
>> > 2*HOST_BITS_PER_WIDE_INT assertion.
>> >
>> > This patch allows the cases where the large integer constant is
>> > still representable using a single CONST_INT, such as zero(0).
>> > Bootstrapped and tested on i686 and x86_64, cross-tested on ARM,
>> > all without regressions. Okay for trunk?
>> >
>> > Thanks,
>> > Chung-Lin
>> >
>> > 2011-03-20 Chung-Lin Tang <cltang@codesourcery.com>
>> >
>> > * emit-rtl.c (immed_double_const): Allow wider than
>> > 2*HOST_BITS_PER_WIDE_INT mode constants when they are
>> > representable as a single const_int RTX.
>>
>> I realise this might be seen as a good expedient fix, but it makes
>> me a bit uneasy. Not a very constructive rationale, sorry.
>
> FWIW I also had a "fix" for this issue, which is equivalent to
> Chung-Lin's patch apart from only allowing constant-zero (attached).
> That's not really a vote from me for this approach, but maybe limiting
> the extent to which we pretend to support wide-integer constants like
> this is sensible, if we do go that way.
So that each suggestion has a patch, here's mine. I've snipped the
regenerated file from the testcase, but to get a flavour, the changes
are like this:
@@ -6099,63 +6099,72 @@ vtbl1_p8 (poly8x8_t __a, uint8x8_t __b)
__extension__ static __inline int8x8_t __attribute__ ((__always_inline__))
vtbl2_s8 (int8x8x2_t __a, int8x8_t __b)
{
- union { int8x8x2_t __i; __builtin_neon_ti __o; } __au = { __a };
+ union { int8x8x2_t __i; __builtin_neon_ti __o; } __au;
+ __au.__i = __a;
return (int8x8_t)__builtin_neon_vtbl2v8qi (__au.__o, __b);
}
Of course, Chung-Lin's testcase should be included whichever way we go.
Richard
gcc/
* config/arm/neon-gen.ml (params): Separate the union initialisation
from the union itself. Return a list of statements.
(print_variant): Update call accordingly. Add the returned statements
before the ones returned by "return".
* config/arm/arm_neon.h: Regenerate.
Index: gcc/config/arm/neon-gen.ml
===================================================================
--- gcc/config/arm/neon-gen.ml 2009-11-11 15:08:40.000000000 +0000
+++ gcc/config/arm/neon-gen.ml 2011-03-29 11:37:52.000000000 +0100
@@ -165,12 +165,14 @@ let rec element_type ctype =
let params return_by_ptr ps =
let pdecls = ref [] in
+ let pstmts = ref [] in
let ptype t p =
match t with
T_arrayof (num, elts) ->
- let uname = union_string num elts (p ^ "u") in
- let decl = Printf.sprintf "%s = { %s };" uname p in
+ let decl = union_string num elts (p ^ "u") ^ ";" in
+ let assignment = p ^ "u.__i = " ^ p ^ ";" in
pdecls := decl :: !pdecls;
+ pstmts := assignment :: !pstmts;
p ^ "u.__o"
| _ -> add_cast t p in
let plist = match ps with
@@ -183,10 +185,11 @@ let params return_by_ptr ps =
match ps with
Arity0 ret | Arity1 (ret, _) | Arity2 (ret, _, _) | Arity3 (ret, _, _, _)
| Arity4 (ret, _, _, _, _) ->
+ !pdecls, !pstmts,
if return_by_ptr then
- !pdecls, add_cast (T_ptrto (element_type ret)) "&__rv.val[0]" :: plist
+ add_cast (T_ptrto (element_type ret)) "&__rv.val[0]" :: plist
else
- !pdecls, plist
+ plist
let modify_params features plist =
let is_flipped =
@@ -243,14 +246,14 @@ let print_variant opcode features shape
let bits = infoword_value elttype features in
let modesuf = mode_suffix elttype shape in
let return_by_ptr = return_by_ptr features in
- let pdecls, paramlist = params return_by_ptr ctype in
+ let pdecls, pstmts, paramlist = params return_by_ptr ctype in
let paramlist' = modify_params features paramlist in
let paramlist'' = extra_word shape features paramlist' bits in
let parstr = String.concat ", " paramlist'' in
let builtin = Printf.sprintf "__builtin_neon_%s%s (%s)"
(builtin_name features name) modesuf parstr in
- let rdecls, stmts = return ctype return_by_ptr builtin in
- let body = pdecls @ rdecls @ stmts
+ let rdecls, rstmts = return ctype return_by_ptr builtin in
+ let body = pdecls @ rdecls @ pstmts @ rstmts
and fnname = (intrinsic_name name) ^ "_" ^ (string_of_elt elttype) in
print_function ctype fnname body
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] Fix PR48183, NEON ICE in emit-rtl.c:immed_double_const() under -g
2011-03-24 10:57 ` Richard Sandiford
2011-03-24 13:49 ` Julian Brown
@ 2011-03-29 11:12 ` Richard Guenther
2011-03-29 12:15 ` Richard Sandiford
1 sibling, 1 reply; 10+ messages in thread
From: Richard Guenther @ 2011-03-29 11:12 UTC (permalink / raw)
To: Chung-Lin Tang, gcc-patches, richard.sandiford
On Thu, Mar 24, 2011 at 11:57 AM, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> Chung-Lin Tang <cltang@codesourcery.com> writes:
>> PR48183 is a case where ARM NEON instrinsics, under -O -g, produce debug
>> insns that tries to expand OImode (32-byte integer) zero constants, much
>> too large to represent as two HOST_WIDE_INTs; as the internals manual
>> indicates, such large constants are not supported in general, and ICEs
>> on the GET_MODE_BITSIZE(mode) == 2*HOST_BITS_PER_WIDE_INT assertion.
>>
>> This patch allows the cases where the large integer constant is still
>> representable using a single CONST_INT, such as zero(0). Bootstrapped
>> and tested on i686 and x86_64, cross-tested on ARM, all without
>> regressions. Okay for trunk?
>>
>> Thanks,
>> Chung-Lin
>>
>> 2011-03-20 Chung-Lin Tang <cltang@codesourcery.com>
>>
>> * emit-rtl.c (immed_double_const): Allow wider than
>> 2*HOST_BITS_PER_WIDE_INT mode constants when they are
>> representable as a single const_int RTX.
>
> I realise this might be seen as a good expedient fix, but it makes
> me a bit uneasy. Not a very constructive rationale, sorry.
>
> For this particular case, the problem is that vst2q_s32 and the
> like initialise a union directly:
>
> union { int32x4x2_t __i; __builtin_neon_oi __o; } __bu = { __b; };
>
> and this gets translated into a zeroing of the whole union followed
> by an assignment to __i:
>
> __bu = {};
> __bu.__i = __b;
Btw, this looks like a missed optimization in gimplification. Worth
a bugreport (or even a fix). Might be a target but as well, dependent
on how __builtin_neon_oi looks like. Do you have a complete testcase
that reproduces the above with a cross?
Richard.
> We later optimise away the first assignment, but it still exists
> in the debug info.
>
> Another expedient fix might be to replace these initialisations with:
>
> union { int32x4x2_t __i; __builtin_neon_oi __o; } __bu;
> __bu.__i = __b;
>
> so that we never get a zeroing statement.
>
> I realise both "fixes" are papering over the real problem. What we really
> need is arbitrary-length constant integers, like we already have for vectors.
> But that's going to be a much bigger patch. It just seems to me that,
> if we're going for a work-around, the arm_neon.h change is neutral,
> while changing immed_double_const feels more risky.
>
> Richard
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] Fix PR48183, NEON ICE in emit-rtl.c:immed_double_const() under -g
2011-03-29 11:12 ` Richard Guenther
@ 2011-03-29 12:15 ` Richard Sandiford
2011-03-29 13:18 ` Richard Guenther
0 siblings, 1 reply; 10+ messages in thread
From: Richard Sandiford @ 2011-03-29 12:15 UTC (permalink / raw)
To: Richard Guenther; +Cc: Chung-Lin Tang, gcc-patches
[-- Attachment #1: Type: text/plain, Size: 2030 bytes --]
Richard Guenther <richard.guenther@gmail.com> writes:
> On Thu, Mar 24, 2011 at 11:57 AM, Richard Sandiford
> <richard.sandiford@linaro.org> wrote:
>> Chung-Lin Tang <cltang@codesourcery.com> writes:
>>> PR48183 is a case where ARM NEON instrinsics, under -O -g, produce debug
>>> insns that tries to expand OImode (32-byte integer) zero constants, much
>>> too large to represent as two HOST_WIDE_INTs; as the internals manual
>>> indicates, such large constants are not supported in general, and ICEs
>>> on the GET_MODE_BITSIZE(mode) == 2*HOST_BITS_PER_WIDE_INT assertion.
>>>
>>> This patch allows the cases where the large integer constant is still
>>> representable using a single CONST_INT, such as zero(0). Bootstrapped
>>> and tested on i686 and x86_64, cross-tested on ARM, all without
>>> regressions. Okay for trunk?
>>>
>>> Thanks,
>>> Chung-Lin
>>>
>>> 2011-03-20 Chung-Lin Tang <cltang@codesourcery.com>
>>>
>>> * emit-rtl.c (immed_double_const): Allow wider than
>>> 2*HOST_BITS_PER_WIDE_INT mode constants when they are
>>> representable as a single const_int RTX.
>>
>> I realise this might be seen as a good expedient fix, but it makes
>> me a bit uneasy. Not a very constructive rationale, sorry.
>>
>> For this particular case, the problem is that vst2q_s32 and the
>> like initialise a union directly:
>>
>> union { int32x4x2_t __i; __builtin_neon_oi __o; } __bu = { __b; };
>>
>> and this gets translated into a zeroing of the whole union followed
>> by an assignment to __i:
>>
>> __bu = {};
>> __bu.__i = __b;
>
> Btw, this looks like a missed optimization in gimplification. Worth
> a bugreport (or even a fix). Might be a target but as well, dependent
> on how __builtin_neon_oi looks like. Do you have a complete testcase
> that reproduces the above with a cross?
Yeah, build cc1 for arm-linux-gnueabi and compile the attached
testcase (from Chung-Lin) using:
-O2 -g -mfpu=neon -mfloat-abi=softfp
Rchard
[-- Attachment #2: pr48183.c --]
[-- Type: text/plain, Size: 639 bytes --]
/* { dg-do compile } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-O -g" } */
/* { dg-add-options arm_neon } */
#include <arm_neon.h>
void move_16bit_to_32bit (int32_t *dst, const short *src, unsigned n)
{
unsigned i;
int16x4x2_t input;
int32x4x2_t mid;
int32x4x2_t output;
for (i = 0; i < n/2; i += 8) {
input = vld2_s16(src + i);
mid.val[0] = vmovl_s16(input.val[0]);
mid.val[1] = vmovl_s16(input.val[1]);
output.val[0] = vshlq_n_s32(mid.val[0], 8);
output.val[1] = vshlq_n_s32(mid.val[1], 8);
vst2q_s32((int32_t *)dst + i, output);
}
}
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] Fix PR48183, NEON ICE in emit-rtl.c:immed_double_const() under -g
2011-03-29 12:15 ` Richard Sandiford
@ 2011-03-29 13:18 ` Richard Guenther
0 siblings, 0 replies; 10+ messages in thread
From: Richard Guenther @ 2011-03-29 13:18 UTC (permalink / raw)
To: Richard Guenther, Chung-Lin Tang, gcc-patches, richard.sandiford
On Tue, Mar 29, 2011 at 1:52 PM, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> Richard Guenther <richard.guenther@gmail.com> writes:
>> On Thu, Mar 24, 2011 at 11:57 AM, Richard Sandiford
>> <richard.sandiford@linaro.org> wrote:
>>> Chung-Lin Tang <cltang@codesourcery.com> writes:
>>>> PR48183 is a case where ARM NEON instrinsics, under -O -g, produce debug
>>>> insns that tries to expand OImode (32-byte integer) zero constants, much
>>>> too large to represent as two HOST_WIDE_INTs; as the internals manual
>>>> indicates, such large constants are not supported in general, and ICEs
>>>> on the GET_MODE_BITSIZE(mode) == 2*HOST_BITS_PER_WIDE_INT assertion.
>>>>
>>>> This patch allows the cases where the large integer constant is still
>>>> representable using a single CONST_INT, such as zero(0). Bootstrapped
>>>> and tested on i686 and x86_64, cross-tested on ARM, all without
>>>> regressions. Okay for trunk?
>>>>
>>>> Thanks,
>>>> Chung-Lin
>>>>
>>>> 2011-03-20 Chung-Lin Tang <cltang@codesourcery.com>
>>>>
>>>> * emit-rtl.c (immed_double_const): Allow wider than
>>>> 2*HOST_BITS_PER_WIDE_INT mode constants when they are
>>>> representable as a single const_int RTX.
>>>
>>> I realise this might be seen as a good expedient fix, but it makes
>>> me a bit uneasy. Not a very constructive rationale, sorry.
>>>
>>> For this particular case, the problem is that vst2q_s32 and the
>>> like initialise a union directly:
>>>
>>> union { int32x4x2_t __i; __builtin_neon_oi __o; } __bu = { __b; };
>>>
>>> and this gets translated into a zeroing of the whole union followed
>>> by an assignment to __i:
>>>
>>> __bu = {};
>>> __bu.__i = __b;
>>
>> Btw, this looks like a missed optimization in gimplification. Worth
>> a bugreport (or even a fix). Might be a target but as well, dependent
>> on how __builtin_neon_oi looks like. Do you have a complete testcase
>> that reproduces the above with a cross?
>
> Yeah, build cc1 for arm-linux-gnueabi and compile the attached
> testcase (from Chung-Lin) using:
>
> -O2 -g -mfpu=neon -mfloat-abi=softfp
It seems that count_type_elements is confused by unions and thus clearing
is always performed. I fail to see why count_type_elements could not
simply return 1 for all unions (non-initialized parts have undefined rather
than zero content).
categorize_ctor_elements also counts 8 elements for some reason.
The following fixes it for me:
Index: gcc/expr.c
===================================================================
--- gcc/expr.c (revision 171606)
+++ gcc/expr.c (working copy)
@@ -5059,7 +5059,7 @@ count_type_elements (const_tree type, bo
case UNION_TYPE:
case QUAL_UNION_TYPE:
- return -1;
+ return 1;
case COMPLEX_TYPE:
return 2;
disclaimer: completely untested, might confuse the hell out of
output_init_constructor and friends.
Richard.
> Rchard
>
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] Fix PR48183, NEON ICE in emit-rtl.c:immed_double_const() under -g
2011-03-29 10:56 ` Richard Sandiford
@ 2011-03-30 11:57 ` Julian Brown
0 siblings, 0 replies; 10+ messages in thread
From: Julian Brown @ 2011-03-30 11:57 UTC (permalink / raw)
To: Richard Sandiford; +Cc: Chung-Lin Tang, gcc-patches
On Tue, 29 Mar 2011 11:45:58 +0100
Richard Sandiford <richard.sandiford@linaro.org> wrote:
> Julian Brown <julian@codesourcery.com> writes:
> > On Thu, 24 Mar 2011 10:57:06 +0000
> > Richard Sandiford <richard.sandiford@linaro.org> wrote:
> >
> >> Chung-Lin Tang <cltang@codesourcery.com> writes:
> >> > PR48183 is a case where ARM NEON instrinsics, under -O -g,
> >> > produce debug insns that tries to expand OImode (32-byte
> >> > integer) zero constants, much too large to represent as two
> >> > HOST_WIDE_INTs; as the internals manual indicates, such large
> >> > constants are not supported in general, and ICEs on the
> >> > GET_MODE_BITSIZE(mode) == 2*HOST_BITS_PER_WIDE_INT assertion.
> >> >
> >> > This patch allows the cases where the large integer constant is
> >> > still representable using a single CONST_INT, such as zero(0).
> >> > Bootstrapped and tested on i686 and x86_64, cross-tested on ARM,
> >> > all without regressions. Okay for trunk?
> >> >
> >> > Thanks,
> >> > Chung-Lin
> >> >
> >> > 2011-03-20 Chung-Lin Tang <cltang@codesourcery.com>
> >> >
> >> > * emit-rtl.c (immed_double_const): Allow wider than
> >> > 2*HOST_BITS_PER_WIDE_INT mode constants when they are
> >> > representable as a single const_int RTX.
> >>
> >> I realise this might be seen as a good expedient fix, but it makes
> >> me a bit uneasy. Not a very constructive rationale, sorry.
> >
> > FWIW I also had a "fix" for this issue, which is equivalent to
> > Chung-Lin's patch apart from only allowing constant-zero (attached).
> > That's not really a vote from me for this approach, but maybe
> > limiting the extent to which we pretend to support wide-integer
> > constants like this is sensible, if we do go that way.
>
> So that each suggestion has a patch, here's mine. I've snipped the
> regenerated file from the testcase, but to get a flavour, the changes
> are like this:
The O'Caml changes look fine, FWIW (I can't officially approve them, of
course) -- and I certainly don't have a problem with avoiding my or
Chung-Lin's hacks, though of course it'd be nice to get rid of the
wide-integer constant "problem" properly at some point.
Cheers,
Julian
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2011-03-30 11:21 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-03-20 11:02 [patch] Fix PR48183, NEON ICE in emit-rtl.c:immed_double_const() under -g Chung-Lin Tang
2011-03-20 11:11 ` Richard Guenther
2011-03-21 12:47 ` Chung-Lin Tang
2011-03-24 10:57 ` Richard Sandiford
2011-03-24 13:49 ` Julian Brown
2011-03-29 10:56 ` Richard Sandiford
2011-03-30 11:57 ` Julian Brown
2011-03-29 11:12 ` Richard Guenther
2011-03-29 12:15 ` Richard Sandiford
2011-03-29 13:18 ` Richard Guenther
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).