public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] fix arm neon ICE by widening tree_type's precision field
@ 2009-06-08 15:32 Nathan Froyd
  2009-06-08 17:11 ` Richard Guenther
                   ` (2 more replies)
  0 siblings, 3 replies; 39+ messages in thread
From: Nathan Froyd @ 2009-06-08 15:32 UTC (permalink / raw)
  To: gcc-patches

ARM's NEON support in GCC includes some rather wide types: XImode, the
biggest, is a 64-byte integer type.  Compiling the simple testcase in
the patch below, which uses XImode, results in an ICE.

The reason behind this is because XImode is defined as a
FRACTIONAL_INT_MODE of 511 bits, rather than an INT_MODE of 64 bytes.
This hack was done because the 'precision' field of tree_type is only 9
bits wide...and therefore can't store XImode's actual precision of 512
bits.  So when we ask mode_for_size for a MODE_INT of 512 bits, it
thinks that the only reasonable choice is BLKmode--since XImode is not
a MODE_INT mode.  We eventually wind up calling simplify_subreg with the
input mode of BLKmode, which triggers an ICE.

The fix is to widen the 'precision' field of tree_type to 10 bits.
Unfortunately, there's not much space in tree_type to widen that field;
the structure is nicely packed for 32-bit and 64-bit hosts.  The
approach I've taken in the patch below is to move packed_flag into
tree_base and rearrange the bitfields in tree_type.  I realize this is
slightly gross, but I don't see a better way--suggestions welcome.

Tested with cross to arm-none-eabi.  OK to commit?  (Need middle-end
approval for the tree.h changes and ARM approval for arm-modes.def.)

-Nathan

2009-06-08  Nathan Froyd  <froydnj@codesourcery.com>

	* tree.h (tree_base): Add packed_flag, moved from...
	(tree_type): ...here.  Widen precision field to 10 bits and
	rearrange fields to pack nicely.
	(TYPE_PACKED): Update for new location of packed_flag.
	* config/arm/arm-modes.def (XImode): Define as a INT_MODE.

Index: config/arm/arm-modes.def
===================================================================
--- config/arm/arm-modes.def	(revision 148220)
+++ config/arm/arm-modes.def	(working copy)
@@ -62,6 +62,4 @@ VECTOR_MODES (FLOAT, 16);     /*       V
 INT_MODE (EI, 24);
 INT_MODE (OI, 32);
 INT_MODE (CI, 48);
-/* ??? This should actually have 512 bits but the precision only has 9
-   bits.  */
-FRACTIONAL_INT_MODE (XI, 511, 64);
+INT_MODE (XI, 64);
Index: testsuite/ChangeLog
===================================================================
--- testsuite/ChangeLog	(revision 148220)
+++ testsuite/ChangeLog	(working copy)
@@ -1,3 +1,7 @@
+2009-06-08  Nathan Froyd  <froydnj@codesourcery.com>
+
+	* gcc.target/arm/neon-dse-1.c: New test.
+
 2009-06-05  Jakub Jelinek  <jakub@redhat.com>
 
 	PR middle-end/40340
Index: testsuite/gcc.target/arm/neon-dse-1.c
===================================================================
--- testsuite/gcc.target/arm/neon-dse-1.c	(revision 0)
+++ testsuite/gcc.target/arm/neon-dse-1.c	(revision 0)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-O1" } */
+/* { dg-add-options arm_neon } */
+
+#include <arm_neon.h>
+
+void neon_internal_error(int *dst, int *src)
+{
+  uint16x8x4_t sval;
+
+  sval = vld4q_u16((void *)src);
+  vst4q_u16((void *)dst,sval);
+}
Index: tree.h
===================================================================
--- tree.h	(revision 148220)
+++ tree.h	(working copy)
@@ -373,8 +373,10 @@ struct GTY(()) tree_base {
   unsigned lang_flag_6 : 1;
 
   unsigned visited : 1;
+  /* For tree_type.  */
+  unsigned packed_flag : 1;
 
-  unsigned spare : 23;
+  unsigned spare : 22;
 
   union tree_ann_d *ann;
 };
@@ -2219,7 +2221,7 @@ extern enum machine_mode vector_type_mod
 
 /* Indicated that objects of this type should be laid out in as
    compact a way as possible.  */
-#define TYPE_PACKED(NODE) (TYPE_CHECK (NODE)->type.packed_flag)
+#define TYPE_PACKED(NODE) (TYPE_CHECK (NODE)->common.base.packed_flag)
 
 /* Used by type_contains_placeholder_p to avoid recomputation.
    Values are: 0 (unknown), 1 (false), 2 (true).  Never access
@@ -2237,17 +2239,16 @@ struct GTY(()) tree_type {
   tree attributes;
   unsigned int uid;
 
-  unsigned int precision : 9;
-  ENUM_BITFIELD(machine_mode) mode : 7;
-
-  unsigned string_flag : 1;
+  unsigned int precision : 10;
   unsigned no_force_blk_flag : 1;
   unsigned needs_constructing_flag : 1;
   unsigned transparent_union_flag : 1;
-  unsigned packed_flag : 1;
   unsigned restrict_flag : 1;
   unsigned contains_placeholder_bits : 2;
 
+  ENUM_BITFIELD(machine_mode) mode : 7;
+  unsigned string_flag : 1;
+
   unsigned lang_flag_0 : 1;
   unsigned lang_flag_1 : 1;
   unsigned lang_flag_2 : 1;

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] fix arm neon ICE by widening tree_type's precision field
  2009-06-08 15:32 [PATCH] fix arm neon ICE by widening tree_type's precision field Nathan Froyd
@ 2009-06-08 17:11 ` Richard Guenther
  2009-06-08 18:04 ` Paolo Bonzini
  2009-06-08 19:40 ` [PATCH] fix arm neon ICE by widening tree_type's precision field Joseph S. Myers
  2 siblings, 0 replies; 39+ messages in thread
From: Richard Guenther @ 2009-06-08 17:11 UTC (permalink / raw)
  To: gcc-patches

On Mon, Jun 8, 2009 at 11:32 AM, Nathan Froyd<froydnj@codesourcery.com> wrote:
> ARM's NEON support in GCC includes some rather wide types: XImode, the
> biggest, is a 64-byte integer type.  Compiling the simple testcase in
> the patch below, which uses XImode, results in an ICE.
>
> The reason behind this is because XImode is defined as a
> FRACTIONAL_INT_MODE of 511 bits, rather than an INT_MODE of 64 bytes.
> This hack was done because the 'precision' field of tree_type is only 9
> bits wide...and therefore can't store XImode's actual precision of 512
> bits.  So when we ask mode_for_size for a MODE_INT of 512 bits, it
> thinks that the only reasonable choice is BLKmode--since XImode is not
> a MODE_INT mode.  We eventually wind up calling simplify_subreg with the
> input mode of BLKmode, which triggers an ICE.
>
> The fix is to widen the 'precision' field of tree_type to 10 bits.
> Unfortunately, there's not much space in tree_type to widen that field;
> the structure is nicely packed for 32-bit and 64-bit hosts.  The
> approach I've taken in the patch below is to move packed_flag into
> tree_base and rearrange the bitfields in tree_type.  I realize this is
> slightly gross, but I don't see a better way--suggestions welcome.

Ugh.

> Tested with cross to arm-none-eabi.  OK to commit?  (Need middle-end
> approval for the tree.h changes and ARM approval for arm-modes.def.)

Ok.

Thanks,
Richard.

> -Nathan
>
> 2009-06-08  Nathan Froyd  <froydnj@codesourcery.com>
>
>        * tree.h (tree_base): Add packed_flag, moved from...
>        (tree_type): ...here.  Widen precision field to 10 bits and
>        rearrange fields to pack nicely.
>        (TYPE_PACKED): Update for new location of packed_flag.
>        * config/arm/arm-modes.def (XImode): Define as a INT_MODE.
>
> Index: config/arm/arm-modes.def
> ===================================================================
> --- config/arm/arm-modes.def    (revision 148220)
> +++ config/arm/arm-modes.def    (working copy)
> @@ -62,6 +62,4 @@ VECTOR_MODES (FLOAT, 16);     /*       V
>  INT_MODE (EI, 24);
>  INT_MODE (OI, 32);
>  INT_MODE (CI, 48);
> -/* ??? This should actually have 512 bits but the precision only has 9
> -   bits.  */
> -FRACTIONAL_INT_MODE (XI, 511, 64);
> +INT_MODE (XI, 64);
> Index: testsuite/ChangeLog
> ===================================================================
> --- testsuite/ChangeLog (revision 148220)
> +++ testsuite/ChangeLog (working copy)
> @@ -1,3 +1,7 @@
> +2009-06-08  Nathan Froyd  <froydnj@codesourcery.com>
> +
> +       * gcc.target/arm/neon-dse-1.c: New test.
> +
>  2009-06-05  Jakub Jelinek  <jakub@redhat.com>
>
>        PR middle-end/40340
> Index: testsuite/gcc.target/arm/neon-dse-1.c
> ===================================================================
> --- testsuite/gcc.target/arm/neon-dse-1.c       (revision 0)
> +++ testsuite/gcc.target/arm/neon-dse-1.c       (revision 0)
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_neon_ok } */
> +/* { dg-options "-O1" } */
> +/* { dg-add-options arm_neon } */
> +
> +#include <arm_neon.h>
> +
> +void neon_internal_error(int *dst, int *src)
> +{
> +  uint16x8x4_t sval;
> +
> +  sval = vld4q_u16((void *)src);
> +  vst4q_u16((void *)dst,sval);
> +}
> Index: tree.h
> ===================================================================
> --- tree.h      (revision 148220)
> +++ tree.h      (working copy)
> @@ -373,8 +373,10 @@ struct GTY(()) tree_base {
>   unsigned lang_flag_6 : 1;
>
>   unsigned visited : 1;
> +  /* For tree_type.  */
> +  unsigned packed_flag : 1;
>
> -  unsigned spare : 23;
> +  unsigned spare : 22;
>
>   union tree_ann_d *ann;
>  };
> @@ -2219,7 +2221,7 @@ extern enum machine_mode vector_type_mod
>
>  /* Indicated that objects of this type should be laid out in as
>    compact a way as possible.  */
> -#define TYPE_PACKED(NODE) (TYPE_CHECK (NODE)->type.packed_flag)
> +#define TYPE_PACKED(NODE) (TYPE_CHECK (NODE)->common.base.packed_flag)
>
>  /* Used by type_contains_placeholder_p to avoid recomputation.
>    Values are: 0 (unknown), 1 (false), 2 (true).  Never access
> @@ -2237,17 +2239,16 @@ struct GTY(()) tree_type {
>   tree attributes;
>   unsigned int uid;
>
> -  unsigned int precision : 9;
> -  ENUM_BITFIELD(machine_mode) mode : 7;
> -
> -  unsigned string_flag : 1;
> +  unsigned int precision : 10;
>   unsigned no_force_blk_flag : 1;
>   unsigned needs_constructing_flag : 1;
>   unsigned transparent_union_flag : 1;
> -  unsigned packed_flag : 1;
>   unsigned restrict_flag : 1;
>   unsigned contains_placeholder_bits : 2;
>
> +  ENUM_BITFIELD(machine_mode) mode : 7;
> +  unsigned string_flag : 1;
> +
>   unsigned lang_flag_0 : 1;
>   unsigned lang_flag_1 : 1;
>   unsigned lang_flag_2 : 1;
>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] fix arm neon ICE by widening tree_type's precision field
  2009-06-08 15:32 [PATCH] fix arm neon ICE by widening tree_type's precision field Nathan Froyd
  2009-06-08 17:11 ` Richard Guenther
@ 2009-06-08 18:04 ` Paolo Bonzini
  2009-06-08 20:28   ` Nathan Froyd
  2009-06-08 19:40 ` [PATCH] fix arm neon ICE by widening tree_type's precision field Joseph S. Myers
  2 siblings, 1 reply; 39+ messages in thread
From: Paolo Bonzini @ 2009-06-08 18:04 UTC (permalink / raw)
  To: gcc-patches


> The fix is to widen the 'precision' field of tree_type to 10 bits.
> Unfortunately, there's not much space in tree_type to widen that field;
> the structure is nicely packed for 32-bit and 64-bit hosts.  The
> approach I've taken in the patch below is to move packed_flag into
> tree_base and rearrange the bitfields in tree_type.  I realize this is
> slightly gross, but I don't see a better way--suggestions welcome.

What about moving some language-specific bits from type to base?

Paolo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] fix arm neon ICE by widening tree_type's precision field
  2009-06-08 15:32 [PATCH] fix arm neon ICE by widening tree_type's precision field Nathan Froyd
  2009-06-08 17:11 ` Richard Guenther
  2009-06-08 18:04 ` Paolo Bonzini
@ 2009-06-08 19:40 ` Joseph S. Myers
  2009-06-08 19:52   ` Daniel Jacobowitz
  2 siblings, 1 reply; 39+ messages in thread
From: Joseph S. Myers @ 2009-06-08 19:40 UTC (permalink / raw)
  To: Nathan Froyd; +Cc: gcc-patches

On Mon, 8 Jun 2009, Nathan Froyd wrote:

> ARM's NEON support in GCC includes some rather wide types: XImode, the
> biggest, is a 64-byte integer type.  Compiling the simple testcase in
> the patch below, which uses XImode, results in an ICE.

To be clear for those not already familiar with the NEON support: this is 
not an integer type in the sense of being able to carry out any arithmetic 
on it, it's a mode needed for the insn patterns for intrinsics for certain 
NEON instructions that load or store 64 bytes in consecutive NEON 
registers.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] fix arm neon ICE by widening tree_type's precision  field
  2009-06-08 19:40 ` [PATCH] fix arm neon ICE by widening tree_type's precision field Joseph S. Myers
@ 2009-06-08 19:52   ` Daniel Jacobowitz
  2009-06-08 20:12     ` Andrew Pinski
  2009-06-08 20:20     ` Jakub Jelinek
  0 siblings, 2 replies; 39+ messages in thread
From: Daniel Jacobowitz @ 2009-06-08 19:52 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: Nathan Froyd, gcc-patches

On Mon, Jun 08, 2009 at 07:40:45PM +0000, Joseph S. Myers wrote:
> On Mon, 8 Jun 2009, Nathan Froyd wrote:
> 
> > ARM's NEON support in GCC includes some rather wide types: XImode, the
> > biggest, is a 64-byte integer type.  Compiling the simple testcase in
> > the patch below, which uses XImode, results in an ICE.
> 
> To be clear for those not already familiar with the NEON support: this is 
> not an integer type in the sense of being able to carry out any arithmetic 
> on it, it's a mode needed for the insn patterns for intrinsics for certain 
> NEON instructions that load or store 64 bytes in consecutive NEON 
> registers.

Which, by the way, I'd love to be rid of.  I think we'd get better
NEON code if we could tell the compiler what we were actually doing.
Last time I looked at this I ended up way over my head, though.

NEON doesn't have any obvious instructions for vector interleaving
in the order the vectorizer wants.  It's probably possible with the
existing instructions, but it would be inefficient and un-NEON-like;
instead, there are special load instructions VLD2, VLD3, and VLD4.
These support loading interleaved vectors from memory directly to
consecutive vector registers.  You can use VLD3 to load 24 bytes into
either three consecutive registers (e.g. d0, d1, d2) or into three
registers with one-register gaps to accomodate 16-byte vectors
(e.g. vld3 into d0, d2, d4 with post-increment; then vld3 into
d1, d3, d5).

In order to get good code out of these, I think we'd need to represent
early on that the single gimple operation set three different vectors
(SSA?  What SSA?)  Also we'd need to somehow do sensible register
allocation for these constraints.

Instead, we do not support these in the vectorizer; use unions for
the intrinsics (which do not get scalarized, so perhaps the new SRA
will help here), and fake it with these huge partial modes during RTL
expansion.  See the XImode patterns in neon.md for examples.

Any ideas? :-)

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] fix arm neon ICE by widening tree_type's precision field
  2009-06-08 19:52   ` Daniel Jacobowitz
@ 2009-06-08 20:12     ` Andrew Pinski
  2009-06-08 20:20     ` Jakub Jelinek
  1 sibling, 0 replies; 39+ messages in thread
From: Andrew Pinski @ 2009-06-08 20:12 UTC (permalink / raw)
  To: Joseph S. Myers, Nathan Froyd, gcc-patches

On Mon, Jun 8, 2009 at 3:52 PM, Daniel Jacobowitz<drow@false.org> wrote:
> Any ideas? :-)

Maybe adding support for non power of 2 vectors.  But that seems a big rewrite.

-- Pinski

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] fix arm neon ICE by widening tree_type's precision  field
  2009-06-08 19:52   ` Daniel Jacobowitz
  2009-06-08 20:12     ` Andrew Pinski
@ 2009-06-08 20:20     ` Jakub Jelinek
  2009-06-08 20:33       ` Daniel Jacobowitz
  1 sibling, 1 reply; 39+ messages in thread
From: Jakub Jelinek @ 2009-06-08 20:20 UTC (permalink / raw)
  To: Joseph S. Myers, Nathan Froyd, gcc-patches

On Mon, Jun 08, 2009 at 03:52:11PM -0400, Daniel Jacobowitz wrote:
> In order to get good code out of these, I think we'd need to represent
> early on that the single gimple operation set three different vectors
> (SSA?  What SSA?)  Also we'd need to somehow do sensible register
> allocation for these constraints.
> 
> Instead, we do not support these in the vectorizer; use unions for
> the intrinsics (which do not get scalarized, so perhaps the new SRA
> will help here), and fake it with these huge partial modes during RTL
> expansion.  See the XImode patterns in neon.md for examples.
> 
> Any ideas? :-)

Why do you need the big modes?  If the insn does load from memory into
a couple of registers (or stores from couple of registers into memory),
why can't you just use a parallel with all those stores in it?
See e.g. s390.md for load_multiple and store_multiple...

	Jakub

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] fix arm neon ICE by widening tree_type's precision field
  2009-06-08 18:04 ` Paolo Bonzini
@ 2009-06-08 20:28   ` Nathan Froyd
  2009-06-08 20:32     ` Steven Bosscher
  2009-06-09  6:36     ` Paolo Bonzini
  0 siblings, 2 replies; 39+ messages in thread
From: Nathan Froyd @ 2009-06-08 20:28 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: gcc-patches

On Mon, Jun 08, 2009 at 08:04:34PM +0200, Paolo Bonzini wrote:
> >The fix is to widen the 'precision' field of tree_type to 10 bits.
> >Unfortunately, there's not much space in tree_type to widen that field;
> >the structure is nicely packed for 32-bit and 64-bit hosts.  The
> >approach I've taken in the patch below is to move packed_flag into
> >tree_base and rearrange the bitfields in tree_type.  I realize this is
> >slightly gross, but I don't see a better way--suggestions welcome.
> 
> What about moving some language-specific bits from type to base?

I suppose that's doable but:

- Doesn't that make base bigger (i.e. it affects everything, whereas
  the proposed patch is neutral wrt space)?

- I personally don't know which bits could be safely moved.

-Nathan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] fix arm neon ICE by widening tree_type's precision field
  2009-06-08 20:28   ` Nathan Froyd
@ 2009-06-08 20:32     ` Steven Bosscher
  2009-06-09  6:36     ` Paolo Bonzini
  1 sibling, 0 replies; 39+ messages in thread
From: Steven Bosscher @ 2009-06-08 20:32 UTC (permalink / raw)
  To: Paolo Bonzini, gcc-patches

On Mon, Jun 8, 2009 at 10:27 PM, Nathan Froyd<froydnj@codesourcery.com> wrote:
>> What about moving some language-specific bits from type to base?
>
> I suppose that's doable but:
>
> - Doesn't that make base bigger (i.e. it affects everything, whereas
>  the proposed patch is neutral wrt space)?

Yes, but you are moving a type-specific bit into the generic type
node.  If we go down this path, we'll get the messy trees like before.
You do this once now, and everyone who needs a bit will use your patch
as an example for an excuse...


> - I personally don't know which bits could be safely moved.

Probably no-one knows. That's why you have a salary: Gives you time to
figure it out ;-)

Ciao!
Steven

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] fix arm neon ICE by widening tree_type's precision  field
  2009-06-08 20:20     ` Jakub Jelinek
@ 2009-06-08 20:33       ` Daniel Jacobowitz
  0 siblings, 0 replies; 39+ messages in thread
From: Daniel Jacobowitz @ 2009-06-08 20:33 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Joseph S. Myers, Nathan Froyd, gcc-patches

On Mon, Jun 08, 2009 at 10:18:17PM +0200, Jakub Jelinek wrote:
> On Mon, Jun 08, 2009 at 03:52:11PM -0400, Daniel Jacobowitz wrote:
> > In order to get good code out of these, I think we'd need to represent
> > early on that the single gimple operation set three different vectors
> > (SSA?  What SSA?)  Also we'd need to somehow do sensible register
> > allocation for these constraints.
> > 
> > Instead, we do not support these in the vectorizer; use unions for
> > the intrinsics (which do not get scalarized, so perhaps the new SRA
> > will help here), and fake it with these huge partial modes during RTL
> > expansion.  See the XImode patterns in neon.md for examples.
> > 
> > Any ideas? :-)
> 
> Why do you need the big modes?  If the insn does load from memory into
> a couple of registers (or stores from couple of registers into memory),
> why can't you just use a parallel with all those stores in it?
> See e.g. s390.md for load_multiple and store_multiple...

Two reasons.  One, for the intrinsics, which use __builtin_neon_xi and
so forth.  The other, for the register allocation constraints.  I
don't see any way we could tell the register allocator "these three
operands must have consecutive register numbers" (or, even worse,
consecutive with stride 2).  S/390 handles this by generating
load_multiple directly for hard registers, only if reload_completed.

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] fix arm neon ICE by widening tree_type's precision field
  2009-06-08 20:28   ` Nathan Froyd
  2009-06-08 20:32     ` Steven Bosscher
@ 2009-06-09  6:36     ` Paolo Bonzini
  2009-06-09 14:54       ` Nathan Froyd
  1 sibling, 1 reply; 39+ messages in thread
From: Paolo Bonzini @ 2009-06-09  6:36 UTC (permalink / raw)
  To: Paolo Bonzini, gcc-patches


> I suppose that's doable but:
> 
> - Doesn't that make base bigger (i.e. it affects everything, whereas
>   the proposed patch is neutral wrt space)?

No, there are spare bits in base.

> - I personally don't know which bits could be safely moved.

Any?  What I mean is, instead of moving packed_flag from type to 
type.common.base.packed_flag, you move lang_flag_6 from type to 
type.common.base.lang_flag_7.  Instead of having 7 language fields in 
base and 7 in type, you have 8 and 6 respectively.

Actually, what about moving all 7 language specific fields to base? ;-) 
  We have 23 spare bits in base and 0 in type, that would even the field 
a bit.

Unless I'm missing something of course.

Paolo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] fix arm neon ICE by widening tree_type's precision field
  2009-06-09  6:36     ` Paolo Bonzini
@ 2009-06-09 14:54       ` Nathan Froyd
  2009-06-09 15:00         ` Richard Guenther
  2009-06-09 15:40         ` Steven Bosscher
  0 siblings, 2 replies; 39+ messages in thread
From: Nathan Froyd @ 2009-06-09 14:54 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: gcc-patches, richard.guenther, stevenb.gcc

On Tue, Jun 09, 2009 at 08:36:27AM +0200, Paolo Bonzini wrote:
> Any?  What I mean is, instead of moving packed_flag from type to 
> type.common.base.packed_flag, you move lang_flag_6 from type to 
> type.common.base.lang_flag_7.  Instead of having 7 language fields in 
> base and 7 in type, you have 8 and 6 respectively.
> 
> Actually, what about moving all 7 language specific fields to base? ;-) 
>  We have 23 spare bits in base and 0 in type, that would even the field 
> a bit.

Ah, OK.  Yes, that's rather obvious.

I don't see how this proposal is immune to Steven's complaint that
type-specific bits are moving into the generic bits.  I think it is less
gross, though.  I don't like Steven's approach of just making type
bigger; I think making all architectures pay for the needs of one
architecture is poor form, especially when there are other approaches
that avoid the cost.

Richard, Steven, do either of you have complaints about moving the
lang_flag_* fields from type into base (renaming them into
type_lang_flag_*) to free up space instead of moving packed_flag?

-Nathan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] fix arm neon ICE by widening tree_type's precision field
  2009-06-09 14:54       ` Nathan Froyd
@ 2009-06-09 15:00         ` Richard Guenther
  2009-06-09 15:07           ` Richard Guenther
                             ` (2 more replies)
  2009-06-09 15:40         ` Steven Bosscher
  1 sibling, 3 replies; 39+ messages in thread
From: Richard Guenther @ 2009-06-09 15:00 UTC (permalink / raw)
  To: Paolo Bonzini, gcc-patches, richard.guenther, stevenb.gcc, Eric Botcazou

On Tue, Jun 9, 2009 at 10:54 AM, Nathan Froyd<froydnj@codesourcery.com> wrote:
> On Tue, Jun 09, 2009 at 08:36:27AM +0200, Paolo Bonzini wrote:
>> Any?  What I mean is, instead of moving packed_flag from type to
>> type.common.base.packed_flag, you move lang_flag_6 from type to
>> type.common.base.lang_flag_7.  Instead of having 7 language fields in
>> base and 7 in type, you have 8 and 6 respectively.
>>
>> Actually, what about moving all 7 language specific fields to base? ;-)
>>  We have 23 spare bits in base and 0 in type, that would even the field
>> a bit.
>
> Ah, OK.  Yes, that's rather obvious.
>
> I don't see how this proposal is immune to Steven's complaint that
> type-specific bits are moving into the generic bits.  I think it is less
> gross, though.  I don't like Steven's approach of just making type
> bigger; I think making all architectures pay for the needs of one
> architecture is poor form, especially when there are other approaches
> that avoid the cost.
>
> Richard, Steven, do either of you have complaints about moving the
> lang_flag_* fields from type into base (renaming them into
> type_lang_flag_*) to free up space instead of moving packed_flag?

Hmm.  I guess a more proper approach would be to try to remove
one of the lang-specific flags and instead force the frontends to use
flags in their TYPE_LANG_SPECIFIC structure.  Ada seems to be
the only one using all lang-specific flags - maybe there is an obvious
candidate.  Eric?

Thanks,
Richard.

> -Nathan
>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] fix arm neon ICE by widening tree_type's precision field
  2009-06-09 15:00         ` Richard Guenther
@ 2009-06-09 15:07           ` Richard Guenther
  2009-06-09 15:44           ` Steven Bosscher
  2009-06-10  2:50           ` Eric Botcazou
  2 siblings, 0 replies; 39+ messages in thread
From: Richard Guenther @ 2009-06-09 15:07 UTC (permalink / raw)
  To: Paolo Bonzini, gcc-patches, richard.guenther, stevenb.gcc, Eric Botcazou

On Tue, Jun 9, 2009 at 11:00 AM, Richard
Guenther<richard.guenther@gmail.com> wrote:
> On Tue, Jun 9, 2009 at 10:54 AM, Nathan Froyd<froydnj@codesourcery.com> wrote:
>> On Tue, Jun 09, 2009 at 08:36:27AM +0200, Paolo Bonzini wrote:
>>> Any?  What I mean is, instead of moving packed_flag from type to
>>> type.common.base.packed_flag, you move lang_flag_6 from type to
>>> type.common.base.lang_flag_7.  Instead of having 7 language fields in
>>> base and 7 in type, you have 8 and 6 respectively.
>>>
>>> Actually, what about moving all 7 language specific fields to base? ;-)
>>>  We have 23 spare bits in base and 0 in type, that would even the field
>>> a bit.
>>
>> Ah, OK.  Yes, that's rather obvious.
>>
>> I don't see how this proposal is immune to Steven's complaint that
>> type-specific bits are moving into the generic bits.  I think it is less
>> gross, though.  I don't like Steven's approach of just making type
>> bigger; I think making all architectures pay for the needs of one
>> architecture is poor form, especially when there are other approaches
>> that avoid the cost.
>>
>> Richard, Steven, do either of you have complaints about moving the
>> lang_flag_* fields from type into base (renaming them into
>> type_lang_flag_*) to free up space instead of moving packed_flag?
>
> Hmm.  I guess a more proper approach would be to try to remove
> one of the lang-specific flags and instead force the frontends to use
> flags in their TYPE_LANG_SPECIFIC structure.  Ada seems to be
> the only one using all lang-specific flags - maybe there is an obvious
> candidate.  Eric?

Or instead move the user_align flags from tree_type and tree_decl_common
to a common bit in tree_base.

Richard.

> Thanks,
> Richard.
>
>> -Nathan
>>
>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] fix arm neon ICE by widening tree_type's precision field
  2009-06-09 14:54       ` Nathan Froyd
  2009-06-09 15:00         ` Richard Guenther
@ 2009-06-09 15:40         ` Steven Bosscher
  2009-06-09 15:53           ` Richard Guenther
  1 sibling, 1 reply; 39+ messages in thread
From: Steven Bosscher @ 2009-06-09 15:40 UTC (permalink / raw)
  To: gcc-patches, richard.guenther, stevenb.gcc, Nathan Froyd

On Tue, Jun 9, 2009 at 4:54 PM, Nathan Froyd<froydnj@codesourcery.com> wrote:
>> Actually, what about moving all 7 language specific fields to base? ;-)
>>  We have 23 spare bits in base and 0 in type, that would even the field
>> a bit.
>
> Ah, OK.  Yes, that's rather obvious.
>
> I don't see how this proposal is immune to Steven's complaint that
> type-specific bits are moving into the generic bits.

It's not a complaint, it's a comment.  I just think it would be best
to keep the bits that are only used for types in the structs specific
for type nodes.


>I think it is less
> gross, though.  I don't like Steven's approach of just making type
> bigger; I think making all architectures pay for the needs of one
> architecture is poor form, especially when there are other approaches
> that avoid the cost.

Don't put words in my mouth that I did not say, thank you very much.

I did not propose "just making type bigger". I just said I dislike
your argument that you didn't know or understand the data structures
very well, and therefore you take the easy way out (same as what
happened with the loop index promotion pass). It leads to messy code
in the long run and that is all I am trying to avoid. There is
pragmatism, and there is quick and dirty hacks.  The former is OK, the
latter is not.

To quote what we said, specifically: You said: "> - I personally don't
know which bits could be safely moved," and I replied "Probably no-one
knows. That's why you have a salary: Gives you time to figure it out
;-)".  This does not say anywhere "please just make type bigger". It
says: "Try harder, figure out which bits could be safely moved."

And BTW, your approach also comes with a cost: a few mor patches like
yours and we'll be back to square one with incomprehensible tree data
structures. You seem to think that what is cheap and works now is also
cheap and will also work in the long run. Clearly you haven't been
around long enough to see the what mess that results in.  One little
bit is a bike shed now but a problem, say, 5 years from now when
tree_code is one bit too small for someone's next crazy idea...


> Richard, Steven, do either of you have complaints about moving the
> lang_flag_* fields from type into base (renaming them into
> type_lang_flag_*) to free up space instead of moving packed_flag?

I don't like it but there doesn't seem to be a better alternative. All
the TYPE_LANG_FLAG_x flags are in use, and so are all other bits (I
actually checked that).  Someone already suspiciously squeezed out a
bit from TYPE_MODE.

So if moving the type.lang_flag bits to tree_base is what you end up
doing, can you please also give one bit back to TYPE_MODE so that it
is 8 bit again just like all other ENUM_BITFIELD(machine_mode) (only
recog has 16 bits for machine mode -- will we ever need >256 modes?),
and reshuffle the bitfields a bit, to make the access to the
most-often used fields cheaper?

Ciao!
Steven


P.S. please fix your reply-to field!  ;-)

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] fix arm neon ICE by widening tree_type's precision field
  2009-06-09 15:00         ` Richard Guenther
  2009-06-09 15:07           ` Richard Guenther
@ 2009-06-09 15:44           ` Steven Bosscher
  2009-06-10  2:50           ` Eric Botcazou
  2 siblings, 0 replies; 39+ messages in thread
From: Steven Bosscher @ 2009-06-09 15:44 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Paolo Bonzini, gcc-patches, Eric Botcazou

On Tue, Jun 9, 2009 at 5:00 PM, Richard
Guenther<richard.guenther@gmail.com> wrote:
> Hmm.  I guess a more proper approach would be to try to remove
> one of the lang-specific flags and instead force the frontends to use
> flags in their TYPE_LANG_SPECIFIC structure.  Ada seems to be
> the only one using all lang-specific flags - maybe there is an obvious
> candidate.  Eric?

unless I grepped wrong, C++ also uses them all.

/me double checks

stevenb@gsyprf3:~/devel/trunk/gcc$ grep -r --exclude-dir=.svn
TYPE_LANG_FLAG_ * | grep cp
cp/ChangeLog-2003:      rather than TYPE_LANG_FLAG_0.
cp/cp-tree.h:   Usage of TYPE_LANG_FLAG_?:
cp/cp-tree.h:  (TYPE_LANG_FLAG_5 (T) = (VAL))
cp/cp-tree.h:  (RECORD_OR_UNION_CODE_P (TREE_CODE (T)) && TYPE_LANG_FLAG_5 (T))
cp/cp-tree.h:#define TYPE_FOR_JAVA(NODE) TYPE_LANG_FLAG_3 (NODE)
cp/cp-tree.h:#define TYPE_DEPENDENT_P(NODE) TYPE_LANG_FLAG_0 (NODE)
cp/cp-tree.h:#define TYPE_DEPENDENT_P_VALID(NODE) TYPE_LANG_FLAG_6(NODE)
cp/cp-tree.h:  (TREE_CODE (TYPE) == ENUMERAL_TYPE && TYPE_LANG_FLAG_5 (TYPE))
cp/cp-tree.h:  (TREE_CODE (TYPE) == ENUMERAL_TYPE && !TYPE_LANG_FLAG_5 (TYPE))
cp/cp-tree.h:  (TYPE_LANG_FLAG_5 (ENUMERAL_TYPE_CHECK (TYPE)) = (VAL))
cp/cp-tree.h:#define TYPE_HAS_USER_CONSTRUCTOR(NODE) (TYPE_LANG_FLAG_1 (NODE))
cp/cp-tree.h:  (TYPE_LANG_FLAG_4 (NODE))
cp/ChangeLog-2007:      * cp-tree.h: Update comment - TYPE_LANG_FLAG_0 is not
cp/ChangeLog-2008:      TYPE_LANG_FLAG_5.

OK, so C++ doesn't use TYPE_LANG_FLAG_2.  Oops :-)

Ciao!
Steven

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] fix arm neon ICE by widening tree_type's precision field
  2009-06-09 15:40         ` Steven Bosscher
@ 2009-06-09 15:53           ` Richard Guenther
  2009-06-09 16:31             ` Nathan Froyd
  0 siblings, 1 reply; 39+ messages in thread
From: Richard Guenther @ 2009-06-09 15:53 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: gcc-patches, Nathan Froyd

On Tue, Jun 9, 2009 at 11:40 AM, Steven Bosscher<stevenb.gcc@gmail.com> wrote:
> On Tue, Jun 9, 2009 at 4:54 PM, Nathan Froyd<froydnj@codesourcery.com> wrote:
>>> Actually, what about moving all 7 language specific fields to base? ;-)
>>>  We have 23 spare bits in base and 0 in type, that would even the field
>>> a bit.
>>
>> Ah, OK.  Yes, that's rather obvious.
>>
>> I don't see how this proposal is immune to Steven's complaint that
>> type-specific bits are moving into the generic bits.
>
> It's not a complaint, it's a comment.  I just think it would be best
> to keep the bits that are only used for types in the structs specific
> for type nodes.
>
>
>>I think it is less
>> gross, though.  I don't like Steven's approach of just making type
>> bigger; I think making all architectures pay for the needs of one
>> architecture is poor form, especially when there are other approaches
>> that avoid the cost.
>
> Don't put words in my mouth that I did not say, thank you very much.
>
> I did not propose "just making type bigger". I just said I dislike
> your argument that you didn't know or understand the data structures
> very well, and therefore you take the easy way out (same as what
> happened with the loop index promotion pass). It leads to messy code
> in the long run and that is all I am trying to avoid. There is
> pragmatism, and there is quick and dirty hacks.  The former is OK, the
> latter is not.
>
> To quote what we said, specifically: You said: "> - I personally don't
> know which bits could be safely moved," and I replied "Probably no-one
> knows. That's why you have a salary: Gives you time to figure it out
> ;-)".  This does not say anywhere "please just make type bigger". It
> says: "Try harder, figure out which bits could be safely moved."
>
> And BTW, your approach also comes with a cost: a few mor patches like
> yours and we'll be back to square one with incomprehensible tree data
> structures. You seem to think that what is cheap and works now is also
> cheap and will also work in the long run. Clearly you haven't been
> around long enough to see the what mess that results in.  One little
> bit is a bike shed now but a problem, say, 5 years from now when
> tree_code is one bit too small for someone's next crazy idea...
>
>
>> Richard, Steven, do either of you have complaints about moving the
>> lang_flag_* fields from type into base (renaming them into
>> type_lang_flag_*) to free up space instead of moving packed_flag?
>
> I don't like it but there doesn't seem to be a better alternative. All
> the TYPE_LANG_FLAG_x flags are in use, and so are all other bits (I
> actually checked that).  Someone already suspiciously squeezed out a
> bit from TYPE_MODE.
>
> So if moving the type.lang_flag bits to tree_base is what you end up
> doing, can you please also give one bit back to TYPE_MODE so that it
> is 8 bit again just like all other ENUM_BITFIELD(machine_mode) (only
> recog has 16 bits for machine mode -- will we ever need >256 modes?),
> and reshuffle the bitfields a bit, to make the access to the
> most-often used fields cheaper?

I don't like moving all lang-specific type flags to base.  There are some
bits in base used for non-generic things already, so moving a bit like
user_align there should be following existing practice, like
saturating_flag which is type-only as well (it could even
re-use like default_def_flag).

Richard.


> Ciao!
> Steven
>
>
> P.S. please fix your reply-to field!  ;-)
>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] fix arm neon ICE by widening tree_type's precision field
  2009-06-09 15:53           ` Richard Guenther
@ 2009-06-09 16:31             ` Nathan Froyd
  2009-06-09 16:34               ` Richard Guenther
                                 ` (3 more replies)
  0 siblings, 4 replies; 39+ messages in thread
From: Nathan Froyd @ 2009-06-09 16:31 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Steven Bosscher, gcc-patches, dnovillo

On Tue, Jun 09, 2009 at 11:45:09AM -0400, Richard Guenther wrote:
> >> Richard, Steven, do either of you have complaints about moving the
> >> lang_flag_* fields from type into base (renaming them into
> >> type_lang_flag_*) to free up space instead of moving packed_flag?
> >
> > I don't like it but there doesn't seem to be a better alternative. All
> > the TYPE_LANG_FLAG_x flags are in use, and so are all other bits (I
> > actually checked that).  Someone already suspiciously squeezed out a
> > bit from TYPE_MODE.
> >
> > So if moving the type.lang_flag bits to tree_base is what you end up
> > doing, can you please also give one bit back to TYPE_MODE so that it
> > is 8 bit again just like all other ENUM_BITFIELD(machine_mode) (only
> > recog has 16 bits for machine mode -- will we ever need >256 modes?),
> > and reshuffle the bitfields a bit, to make the access to the
> > most-often used fields cheaper?
> 
> I don't like moving all lang-specific type flags to base.  There are some
> bits in base used for non-generic things already, so moving a bit like
> user_align there should be following existing practice, like
> saturating_flag which is type-only as well (it could even
> re-use like default_def_flag).

OK, how about this slightly more aggressive patch, which:

- Moves packed_flag and user_align into tree_base;

- Frees enough space in tree_type to widen precision and mode;

- Aligns tree_type.mode on a byte boundary;

- Changes DECL_PACKED to use tree_base.packed_flag instead of some
  random flag in tree_decl_common; and

- Renumbers tree_decl_common fields and uses appropriately.

LTO folks, this is probably going to break a few things in the next
merge.  Sorry about that.

Sanity-checked on arm-none-linux-gnueabi; will run bootstrap and testing
on x86_64-unknown-linux-gnu.  OK to commit?

-Nathan

2009-06-09  Nathan Froyd  <froydnj@codesourcery.com>

	* tree.h (tree_base): Add packed_flag and user_align fields.
	Decrease size of spare field.
	(TYPE_USER_ALIGN): Use user_align from tree_base.
	(DECL_USER_ALIGN): Likewise.
	(TYPE_PACKED): Use packed_flag from tree_base.
	(DECL_PACKED): Likewise.
	(tree_type): Delete packed_flag and user_align fields.  Widen
	precision field.  Widen mode field and shuffle fields to align
	mode on an 8-bit boundary.
	(tree_decl_common): Delete decl_flag_1 and user_align fields.
	Renumber decl_flag_* fields.  Fix comments.  Widen
	decl_common_unused field.
	(DECL_HAS_VALUE_EXPR_P): Adjust for renumbering of decl_flag_*
	fields.
	(DECL_EXTERNAL): Likewise.
	(DECL_BIT_FIELD): Likewise.
	(DECL_NONADDRESSABLE_P): Likewise.
	(TYPE_DECL_SUPRESS_DEBUG): Likewise.
	* config/arm/arm-modes.def (XImode): Make it an INT_MODE.

Index: config/arm/arm-modes.def
===================================================================
--- config/arm/arm-modes.def	(revision 148302)
+++ config/arm/arm-modes.def	(working copy)
@@ -62,6 +62,4 @@ VECTOR_MODES (FLOAT, 16);     /*       V
 INT_MODE (EI, 24);
 INT_MODE (OI, 32);
 INT_MODE (CI, 48);
-/* ??? This should actually have 512 bits but the precision only has 9
-   bits.  */
-FRACTIONAL_INT_MODE (XI, 511, 64);
+INT_MODE (XI, 64);
Index: testsuite/ChangeLog
===================================================================
--- testsuite/ChangeLog	(revision 148302)
+++ testsuite/ChangeLog	(working copy)
@@ -1,3 +1,7 @@
+2009-06-09  Nathan Froyd  <froydnj@codesourcery.com>
+
+	* gcc.target/arm/neon-dse-1.c: New test.
+
 2009-06-08  Jan Hubicka  <jh@suse.cz>
 
 	PR debug/39834
Index: testsuite/gcc.target/arm/neon-dse-1.c
===================================================================
--- testsuite/gcc.target/arm/neon-dse-1.c	(revision 0)
+++ testsuite/gcc.target/arm/neon-dse-1.c	(revision 0)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-O1" } */
+/* { dg-add-options arm_neon } */
+
+#include <arm_neon.h>
+
+void neon_internal_error(int *dst, int *src)
+{
+  uint16x8x4_t sval;
+
+  sval = vld4q_u16((void *)src);
+  vst4q_u16((void *)dst,sval);
+}
Index: tree.h
===================================================================
--- tree.h	(revision 148302)
+++ tree.h	(working copy)
@@ -373,8 +373,10 @@ struct GTY(()) tree_base {
   unsigned lang_flag_6 : 1;
 
   unsigned visited : 1;
+  unsigned packed_flag : 1;
+  unsigned user_align : 1;
 
-  unsigned spare : 23;
+  unsigned spare : 21;
 
   union tree_ann_d *ann;
 };
@@ -2108,7 +2110,7 @@ extern enum machine_mode vector_type_mod
 
 /* 1 if the alignment for this type was requested by "aligned" attribute,
    0 if it is the default for this type.  */
-#define TYPE_USER_ALIGN(NODE) (TYPE_CHECK (NODE)->type.user_align)
+#define TYPE_USER_ALIGN(NODE) (TYPE_CHECK (NODE)->common.base.user_align)
 
 /* The alignment for NODE, in bytes.  */
 #define TYPE_ALIGN_UNIT(NODE) (TYPE_ALIGN (NODE) / BITS_PER_UNIT)
@@ -2219,7 +2221,7 @@ extern enum machine_mode vector_type_mod
 
 /* Indicated that objects of this type should be laid out in as
    compact a way as possible.  */
-#define TYPE_PACKED(NODE) (TYPE_CHECK (NODE)->type.packed_flag)
+#define TYPE_PACKED(NODE) (TYPE_CHECK (NODE)->common.base.packed_flag)
 
 /* Used by type_contains_placeholder_p to avoid recomputation.
    Values are: 0 (unknown), 1 (false), 2 (true).  Never access
@@ -2237,17 +2239,16 @@ struct GTY(()) tree_type {
   tree attributes;
   unsigned int uid;
 
-  unsigned int precision : 9;
-  ENUM_BITFIELD(machine_mode) mode : 7;
-
-  unsigned string_flag : 1;
+  unsigned int precision : 10;
   unsigned no_force_blk_flag : 1;
   unsigned needs_constructing_flag : 1;
   unsigned transparent_union_flag : 1;
-  unsigned packed_flag : 1;
   unsigned restrict_flag : 1;
   unsigned contains_placeholder_bits : 2;
 
+  ENUM_BITFIELD(machine_mode) mode : 8;
+
+  unsigned string_flag : 1;
   unsigned lang_flag_0 : 1;
   unsigned lang_flag_1 : 1;
   unsigned lang_flag_2 : 1;
@@ -2255,7 +2256,6 @@ struct GTY(()) tree_type {
   unsigned lang_flag_4 : 1;
   unsigned lang_flag_5 : 1;
   unsigned lang_flag_6 : 1;
-  unsigned user_align : 1;
 
   unsigned int align;
   alias_set_type alias_set;
@@ -2509,7 +2509,7 @@ struct GTY(()) tree_decl_minimal {
 #define DECL_ALIGN_UNIT(NODE) (DECL_ALIGN (NODE) / BITS_PER_UNIT)
 /* Set if the alignment of this DECL has been set by the user, for
    example with an 'aligned' attribute.  */
-#define DECL_USER_ALIGN(NODE) (DECL_COMMON_CHECK (NODE)->decl_common.user_align)
+#define DECL_USER_ALIGN(NODE) (DECL_COMMON_CHECK (NODE)->common.base.user_align)
 /* Holds the machine mode corresponding to the declaration of a variable or
    field.  Always equal to TYPE_MODE (TREE_TYPE (decl)) except for a
    FIELD_DECL.  */
@@ -2546,7 +2546,7 @@ struct GTY(()) tree_decl_minimal {
    example, for a FUNCTION_DECL, DECL_SAVED_TREE may be non-NULL and
    DECL_EXTERNAL may be true simultaneously; that can be the case for
    a C99 "extern inline" function.  */
-#define DECL_EXTERNAL(NODE) (DECL_COMMON_CHECK (NODE)->decl_common.decl_flag_2)
+#define DECL_EXTERNAL(NODE) (DECL_COMMON_CHECK (NODE)->decl_common.decl_flag_1)
 
 /* Nonzero in a ..._DECL means this variable is ref'd from a nested function.
    For VAR_DECL nodes, PARM_DECL nodes, and FUNCTION_DECL nodes.
@@ -2615,7 +2615,6 @@ struct GTY(()) tree_decl_common {
   unsigned ignored_flag : 1;
   unsigned abstract_flag : 1;
   unsigned artificial_flag : 1;
-  unsigned user_align : 1;
   unsigned preserve_flag: 1;
   unsigned debug_expr_is_from : 1;
 
@@ -2631,22 +2630,20 @@ struct GTY(()) tree_decl_common {
   /* In LABEL_DECL, this is DECL_ERROR_ISSUED.
      In VAR_DECL and PARM_DECL, this is DECL_REGISTER.  */
   unsigned decl_flag_0 : 1;
-  /* In FIELD_DECL, this is DECL_PACKED.  */
-  unsigned decl_flag_1 : 1;
   /* In FIELD_DECL, this is DECL_BIT_FIELD
      In VAR_DECL and FUNCTION_DECL, this is DECL_EXTERNAL.
-     In TYPE_DECL, this is TYPE_DECL_SUPRESS_DEBUG.  */
-  unsigned decl_flag_2 : 1;
+     In TYPE_DECL, this is TYPE_DECL_SUPPRESS_DEBUG.  */
+  unsigned decl_flag_1 : 1;
   /* In FIELD_DECL, this is DECL_NONADDRESSABLE_P
-     In VAR_DECL and PARM_DECL, this is DECL_HAS_VALUE_EXPR.  */
-  unsigned decl_flag_3 : 1;
+     In VAR_DECL and PARM_DECL, this is DECL_HAS_VALUE_EXPR_P.  */
+  unsigned decl_flag_2 : 1;
   /* Logically, these two would go in a theoretical base shared by var and
      parm decl. */
   unsigned gimple_reg_flag : 1;
   /* In VAR_DECL, PARM_DECL and RESULT_DECL, this is DECL_BY_REFERENCE.  */
   unsigned decl_by_reference_flag : 1;
   /* Padding so that 'off_align' can be on a 32-bit boundary.  */
-  unsigned decl_common_unused : 2;
+  unsigned decl_common_unused : 4;
 
   /* DECL_OFFSET_ALIGN, used only for FIELD_DECLs.  */
   unsigned int off_align : 8;
@@ -2672,7 +2669,7 @@ extern void decl_value_expr_insert (tree
    decl itself.  This should only be used for debugging; once this field has
    been set, the decl itself may not legitimately appear in the function.  */
 #define DECL_HAS_VALUE_EXPR_P(NODE) \
-  (TREE_CHECK2 (NODE, VAR_DECL, PARM_DECL)->decl_common.decl_flag_3)
+  (TREE_CHECK2 (NODE, VAR_DECL, PARM_DECL)->decl_common.decl_flag_2)
 #define DECL_VALUE_EXPR(NODE) \
   (decl_value_expr_lookup (DECL_WRTL_CHECK (NODE)))
 #define SET_DECL_VALUE_EXPR(NODE, VAL)			\
@@ -2750,11 +2747,11 @@ struct GTY(()) tree_decl_with_rtl {
 #define DECL_FCONTEXT(NODE) (FIELD_DECL_CHECK (NODE)->field_decl.fcontext)
 
 /* In a FIELD_DECL, indicates this field should be bit-packed.  */
-#define DECL_PACKED(NODE) (FIELD_DECL_CHECK (NODE)->decl_common.decl_flag_1)
+#define DECL_PACKED(NODE) (FIELD_DECL_CHECK (NODE)->common.base.packed_flag)
 
 /* Nonzero in a FIELD_DECL means it is a bit field, and must be accessed
    specially.  */
-#define DECL_BIT_FIELD(NODE) (FIELD_DECL_CHECK (NODE)->decl_common.decl_flag_2)
+#define DECL_BIT_FIELD(NODE) (FIELD_DECL_CHECK (NODE)->decl_common.decl_flag_1)
 
 /* Used in a FIELD_DECL to indicate that we cannot form the address of
    this component.  This makes it possible for Type-Based Alias Analysis
@@ -2772,7 +2769,7 @@ struct GTY(()) tree_decl_with_rtl {
    accesses to s.i must not be given the alias set of the type of 'i'
    (int) but instead directly that of the type of 's' (struct S).  */
 #define DECL_NONADDRESSABLE_P(NODE) \
-  (FIELD_DECL_CHECK (NODE)->decl_common.decl_flag_3)
+  (FIELD_DECL_CHECK (NODE)->decl_common.decl_flag_2)
 
 struct GTY(()) tree_field_decl {
   struct tree_decl_common common;
@@ -3242,7 +3239,7 @@ struct GTY(()) tree_function_decl {
    into stabs.  Instead it will generate cross reference ('x') of names.
    This uses the same flag as DECL_EXTERNAL.  */
 #define TYPE_DECL_SUPPRESS_DEBUG(NODE) \
-  (TYPE_DECL_CHECK (NODE)->decl_common.decl_flag_2)
+  (TYPE_DECL_CHECK (NODE)->decl_common.decl_flag_1)
 
 /* Getter of the imported declaration associated to the
    IMPORTED_DECL node.  */

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] fix arm neon ICE by widening tree_type's precision field
  2009-06-09 16:31             ` Nathan Froyd
@ 2009-06-09 16:34               ` Richard Guenther
  2009-06-09 16:36               ` Diego Novillo
                                 ` (2 subsequent siblings)
  3 siblings, 0 replies; 39+ messages in thread
From: Richard Guenther @ 2009-06-09 16:34 UTC (permalink / raw)
  To: Richard Guenther, Steven Bosscher, gcc-patches, dnovillo

On Tue, Jun 9, 2009 at 12:31 PM, Nathan Froyd<froydnj@codesourcery.com> wrote:
> On Tue, Jun 09, 2009 at 11:45:09AM -0400, Richard Guenther wrote:
>> >> Richard, Steven, do either of you have complaints about moving the
>> >> lang_flag_* fields from type into base (renaming them into
>> >> type_lang_flag_*) to free up space instead of moving packed_flag?
>> >
>> > I don't like it but there doesn't seem to be a better alternative. All
>> > the TYPE_LANG_FLAG_x flags are in use, and so are all other bits (I
>> > actually checked that).  Someone already suspiciously squeezed out a
>> > bit from TYPE_MODE.
>> >
>> > So if moving the type.lang_flag bits to tree_base is what you end up
>> > doing, can you please also give one bit back to TYPE_MODE so that it
>> > is 8 bit again just like all other ENUM_BITFIELD(machine_mode) (only
>> > recog has 16 bits for machine mode -- will we ever need >256 modes?),
>> > and reshuffle the bitfields a bit, to make the access to the
>> > most-often used fields cheaper?
>>
>> I don't like moving all lang-specific type flags to base.  There are some
>> bits in base used for non-generic things already, so moving a bit like
>> user_align there should be following existing practice, like
>> saturating_flag which is type-only as well (it could even
>> re-use like default_def_flag).
>
> OK, how about this slightly more aggressive patch, which:
>
> - Moves packed_flag and user_align into tree_base;
>
> - Frees enough space in tree_type to widen precision and mode;
>
> - Aligns tree_type.mode on a byte boundary;
>
> - Changes DECL_PACKED to use tree_base.packed_flag instead of some
>  random flag in tree_decl_common; and
>
> - Renumbers tree_decl_common fields and uses appropriately.
>
> LTO folks, this is probably going to break a few things in the next
> merge.  Sorry about that.
>
> Sanity-checked on arm-none-linux-gnueabi; will run bootstrap and testing
> on x86_64-unknown-linux-gnu.  OK to commit?

Works for me.  Ok after 24h if it bootstraps/regtests ok to let other people
have time to comment.

Thanks,
Richard.

> -Nathan
>
> 2009-06-09  Nathan Froyd  <froydnj@codesourcery.com>
>
>        * tree.h (tree_base): Add packed_flag and user_align fields.
>        Decrease size of spare field.
>        (TYPE_USER_ALIGN): Use user_align from tree_base.
>        (DECL_USER_ALIGN): Likewise.
>        (TYPE_PACKED): Use packed_flag from tree_base.
>        (DECL_PACKED): Likewise.
>        (tree_type): Delete packed_flag and user_align fields.  Widen
>        precision field.  Widen mode field and shuffle fields to align
>        mode on an 8-bit boundary.
>        (tree_decl_common): Delete decl_flag_1 and user_align fields.
>        Renumber decl_flag_* fields.  Fix comments.  Widen
>        decl_common_unused field.
>        (DECL_HAS_VALUE_EXPR_P): Adjust for renumbering of decl_flag_*
>        fields.
>        (DECL_EXTERNAL): Likewise.
>        (DECL_BIT_FIELD): Likewise.
>        (DECL_NONADDRESSABLE_P): Likewise.
>        (TYPE_DECL_SUPRESS_DEBUG): Likewise.
>        * config/arm/arm-modes.def (XImode): Make it an INT_MODE.
>
> Index: config/arm/arm-modes.def
> ===================================================================
> --- config/arm/arm-modes.def    (revision 148302)
> +++ config/arm/arm-modes.def    (working copy)
> @@ -62,6 +62,4 @@ VECTOR_MODES (FLOAT, 16);     /*       V
>  INT_MODE (EI, 24);
>  INT_MODE (OI, 32);
>  INT_MODE (CI, 48);
> -/* ??? This should actually have 512 bits but the precision only has 9
> -   bits.  */
> -FRACTIONAL_INT_MODE (XI, 511, 64);
> +INT_MODE (XI, 64);
> Index: testsuite/ChangeLog
> ===================================================================
> --- testsuite/ChangeLog (revision 148302)
> +++ testsuite/ChangeLog (working copy)
> @@ -1,3 +1,7 @@
> +2009-06-09  Nathan Froyd  <froydnj@codesourcery.com>
> +
> +       * gcc.target/arm/neon-dse-1.c: New test.
> +
>  2009-06-08  Jan Hubicka  <jh@suse.cz>
>
>        PR debug/39834
> Index: testsuite/gcc.target/arm/neon-dse-1.c
> ===================================================================
> --- testsuite/gcc.target/arm/neon-dse-1.c       (revision 0)
> +++ testsuite/gcc.target/arm/neon-dse-1.c       (revision 0)
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_neon_ok } */
> +/* { dg-options "-O1" } */
> +/* { dg-add-options arm_neon } */
> +
> +#include <arm_neon.h>
> +
> +void neon_internal_error(int *dst, int *src)
> +{
> +  uint16x8x4_t sval;
> +
> +  sval = vld4q_u16((void *)src);
> +  vst4q_u16((void *)dst,sval);
> +}
> Index: tree.h
> ===================================================================
> --- tree.h      (revision 148302)
> +++ tree.h      (working copy)
> @@ -373,8 +373,10 @@ struct GTY(()) tree_base {
>   unsigned lang_flag_6 : 1;
>
>   unsigned visited : 1;
> +  unsigned packed_flag : 1;
> +  unsigned user_align : 1;
>
> -  unsigned spare : 23;
> +  unsigned spare : 21;
>
>   union tree_ann_d *ann;
>  };
> @@ -2108,7 +2110,7 @@ extern enum machine_mode vector_type_mod
>
>  /* 1 if the alignment for this type was requested by "aligned" attribute,
>    0 if it is the default for this type.  */
> -#define TYPE_USER_ALIGN(NODE) (TYPE_CHECK (NODE)->type.user_align)
> +#define TYPE_USER_ALIGN(NODE) (TYPE_CHECK (NODE)->common.base.user_align)
>
>  /* The alignment for NODE, in bytes.  */
>  #define TYPE_ALIGN_UNIT(NODE) (TYPE_ALIGN (NODE) / BITS_PER_UNIT)
> @@ -2219,7 +2221,7 @@ extern enum machine_mode vector_type_mod
>
>  /* Indicated that objects of this type should be laid out in as
>    compact a way as possible.  */
> -#define TYPE_PACKED(NODE) (TYPE_CHECK (NODE)->type.packed_flag)
> +#define TYPE_PACKED(NODE) (TYPE_CHECK (NODE)->common.base.packed_flag)
>
>  /* Used by type_contains_placeholder_p to avoid recomputation.
>    Values are: 0 (unknown), 1 (false), 2 (true).  Never access
> @@ -2237,17 +2239,16 @@ struct GTY(()) tree_type {
>   tree attributes;
>   unsigned int uid;
>
> -  unsigned int precision : 9;
> -  ENUM_BITFIELD(machine_mode) mode : 7;
> -
> -  unsigned string_flag : 1;
> +  unsigned int precision : 10;
>   unsigned no_force_blk_flag : 1;
>   unsigned needs_constructing_flag : 1;
>   unsigned transparent_union_flag : 1;
> -  unsigned packed_flag : 1;
>   unsigned restrict_flag : 1;
>   unsigned contains_placeholder_bits : 2;
>
> +  ENUM_BITFIELD(machine_mode) mode : 8;
> +
> +  unsigned string_flag : 1;
>   unsigned lang_flag_0 : 1;
>   unsigned lang_flag_1 : 1;
>   unsigned lang_flag_2 : 1;
> @@ -2255,7 +2256,6 @@ struct GTY(()) tree_type {
>   unsigned lang_flag_4 : 1;
>   unsigned lang_flag_5 : 1;
>   unsigned lang_flag_6 : 1;
> -  unsigned user_align : 1;
>
>   unsigned int align;
>   alias_set_type alias_set;
> @@ -2509,7 +2509,7 @@ struct GTY(()) tree_decl_minimal {
>  #define DECL_ALIGN_UNIT(NODE) (DECL_ALIGN (NODE) / BITS_PER_UNIT)
>  /* Set if the alignment of this DECL has been set by the user, for
>    example with an 'aligned' attribute.  */
> -#define DECL_USER_ALIGN(NODE) (DECL_COMMON_CHECK (NODE)->decl_common.user_align)
> +#define DECL_USER_ALIGN(NODE) (DECL_COMMON_CHECK (NODE)->common.base.user_align)
>  /* Holds the machine mode corresponding to the declaration of a variable or
>    field.  Always equal to TYPE_MODE (TREE_TYPE (decl)) except for a
>    FIELD_DECL.  */
> @@ -2546,7 +2546,7 @@ struct GTY(()) tree_decl_minimal {
>    example, for a FUNCTION_DECL, DECL_SAVED_TREE may be non-NULL and
>    DECL_EXTERNAL may be true simultaneously; that can be the case for
>    a C99 "extern inline" function.  */
> -#define DECL_EXTERNAL(NODE) (DECL_COMMON_CHECK (NODE)->decl_common.decl_flag_2)
> +#define DECL_EXTERNAL(NODE) (DECL_COMMON_CHECK (NODE)->decl_common.decl_flag_1)
>
>  /* Nonzero in a ..._DECL means this variable is ref'd from a nested function.
>    For VAR_DECL nodes, PARM_DECL nodes, and FUNCTION_DECL nodes.
> @@ -2615,7 +2615,6 @@ struct GTY(()) tree_decl_common {
>   unsigned ignored_flag : 1;
>   unsigned abstract_flag : 1;
>   unsigned artificial_flag : 1;
> -  unsigned user_align : 1;
>   unsigned preserve_flag: 1;
>   unsigned debug_expr_is_from : 1;
>
> @@ -2631,22 +2630,20 @@ struct GTY(()) tree_decl_common {
>   /* In LABEL_DECL, this is DECL_ERROR_ISSUED.
>      In VAR_DECL and PARM_DECL, this is DECL_REGISTER.  */
>   unsigned decl_flag_0 : 1;
> -  /* In FIELD_DECL, this is DECL_PACKED.  */
> -  unsigned decl_flag_1 : 1;
>   /* In FIELD_DECL, this is DECL_BIT_FIELD
>      In VAR_DECL and FUNCTION_DECL, this is DECL_EXTERNAL.
> -     In TYPE_DECL, this is TYPE_DECL_SUPRESS_DEBUG.  */
> -  unsigned decl_flag_2 : 1;
> +     In TYPE_DECL, this is TYPE_DECL_SUPPRESS_DEBUG.  */
> +  unsigned decl_flag_1 : 1;
>   /* In FIELD_DECL, this is DECL_NONADDRESSABLE_P
> -     In VAR_DECL and PARM_DECL, this is DECL_HAS_VALUE_EXPR.  */
> -  unsigned decl_flag_3 : 1;
> +     In VAR_DECL and PARM_DECL, this is DECL_HAS_VALUE_EXPR_P.  */
> +  unsigned decl_flag_2 : 1;
>   /* Logically, these two would go in a theoretical base shared by var and
>      parm decl. */
>   unsigned gimple_reg_flag : 1;
>   /* In VAR_DECL, PARM_DECL and RESULT_DECL, this is DECL_BY_REFERENCE.  */
>   unsigned decl_by_reference_flag : 1;
>   /* Padding so that 'off_align' can be on a 32-bit boundary.  */
> -  unsigned decl_common_unused : 2;
> +  unsigned decl_common_unused : 4;
>
>   /* DECL_OFFSET_ALIGN, used only for FIELD_DECLs.  */
>   unsigned int off_align : 8;
> @@ -2672,7 +2669,7 @@ extern void decl_value_expr_insert (tree
>    decl itself.  This should only be used for debugging; once this field has
>    been set, the decl itself may not legitimately appear in the function.  */
>  #define DECL_HAS_VALUE_EXPR_P(NODE) \
> -  (TREE_CHECK2 (NODE, VAR_DECL, PARM_DECL)->decl_common.decl_flag_3)
> +  (TREE_CHECK2 (NODE, VAR_DECL, PARM_DECL)->decl_common.decl_flag_2)
>  #define DECL_VALUE_EXPR(NODE) \
>   (decl_value_expr_lookup (DECL_WRTL_CHECK (NODE)))
>  #define SET_DECL_VALUE_EXPR(NODE, VAL)                 \
> @@ -2750,11 +2747,11 @@ struct GTY(()) tree_decl_with_rtl {
>  #define DECL_FCONTEXT(NODE) (FIELD_DECL_CHECK (NODE)->field_decl.fcontext)
>
>  /* In a FIELD_DECL, indicates this field should be bit-packed.  */
> -#define DECL_PACKED(NODE) (FIELD_DECL_CHECK (NODE)->decl_common.decl_flag_1)
> +#define DECL_PACKED(NODE) (FIELD_DECL_CHECK (NODE)->common.base.packed_flag)
>
>  /* Nonzero in a FIELD_DECL means it is a bit field, and must be accessed
>    specially.  */
> -#define DECL_BIT_FIELD(NODE) (FIELD_DECL_CHECK (NODE)->decl_common.decl_flag_2)
> +#define DECL_BIT_FIELD(NODE) (FIELD_DECL_CHECK (NODE)->decl_common.decl_flag_1)
>
>  /* Used in a FIELD_DECL to indicate that we cannot form the address of
>    this component.  This makes it possible for Type-Based Alias Analysis
> @@ -2772,7 +2769,7 @@ struct GTY(()) tree_decl_with_rtl {
>    accesses to s.i must not be given the alias set of the type of 'i'
>    (int) but instead directly that of the type of 's' (struct S).  */
>  #define DECL_NONADDRESSABLE_P(NODE) \
> -  (FIELD_DECL_CHECK (NODE)->decl_common.decl_flag_3)
> +  (FIELD_DECL_CHECK (NODE)->decl_common.decl_flag_2)
>
>  struct GTY(()) tree_field_decl {
>   struct tree_decl_common common;
> @@ -3242,7 +3239,7 @@ struct GTY(()) tree_function_decl {
>    into stabs.  Instead it will generate cross reference ('x') of names.
>    This uses the same flag as DECL_EXTERNAL.  */
>  #define TYPE_DECL_SUPPRESS_DEBUG(NODE) \
> -  (TYPE_DECL_CHECK (NODE)->decl_common.decl_flag_2)
> +  (TYPE_DECL_CHECK (NODE)->decl_common.decl_flag_1)
>
>  /* Getter of the imported declaration associated to the
>    IMPORTED_DECL node.  */
>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] fix arm neon ICE by widening tree_type's precision field
  2009-06-09 16:31             ` Nathan Froyd
  2009-06-09 16:34               ` Richard Guenther
@ 2009-06-09 16:36               ` Diego Novillo
  2009-06-09 17:30               ` Paolo Bonzini
  2010-07-28 19:01               ` [4.5 regression] C++ ignores some aligned attributes (Re: [PATCH] fix arm neon ICE by widening tree_type's precision field) Ulrich Weigand
  3 siblings, 0 replies; 39+ messages in thread
From: Diego Novillo @ 2009-06-09 16:36 UTC (permalink / raw)
  To: Richard Guenther, Steven Bosscher, gcc-patches, dnovillo

On Tue, Jun 9, 2009 at 12:31, Nathan Froyd<froydnj@codesourcery.com> wrote:

> LTO folks, this is probably going to break a few things in the next
> merge.  Sorry about that.

Thanks for the heads up.  I'll reflect these changes in the streamer
on the next merge.


Diego.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] fix arm neon ICE by widening tree_type's precision field
  2009-06-09 16:31             ` Nathan Froyd
  2009-06-09 16:34               ` Richard Guenther
  2009-06-09 16:36               ` Diego Novillo
@ 2009-06-09 17:30               ` Paolo Bonzini
  2010-07-28 19:01               ` [4.5 regression] C++ ignores some aligned attributes (Re: [PATCH] fix arm neon ICE by widening tree_type's precision field) Ulrich Weigand
  3 siblings, 0 replies; 39+ messages in thread
From: Paolo Bonzini @ 2009-06-09 17:30 UTC (permalink / raw)
  To: Richard Guenther, Steven Bosscher, gcc-patches, dnovillo


> OK, how about this slightly more aggressive patch, which:
> 
> - Moves packed_flag and user_align into tree_base;

I won't complain but, at some point, I'd rather see the flags in 
tree_base called base_flag_{0,...} if we decide to go this way...

(or base_had_some_spare_bits_lets_use_them_for_flag_0). :-)

Paolo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] fix arm neon ICE by widening tree_type's precision field
  2009-06-09 15:00         ` Richard Guenther
  2009-06-09 15:07           ` Richard Guenther
  2009-06-09 15:44           ` Steven Bosscher
@ 2009-06-10  2:50           ` Eric Botcazou
  2 siblings, 0 replies; 39+ messages in thread
From: Eric Botcazou @ 2009-06-10  2:50 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches, Paolo Bonzini, stevenb.gcc

> Hmm.  I guess a more proper approach would be to try to remove
> one of the lang-specific flags and instead force the frontends to use
> flags in their TYPE_LANG_SPECIFIC structure.  Ada seems to be
> the only one using all lang-specific flags - maybe there is an obvious
> candidate.  Eric?

We generate a lot of types in Ada so I wouldn't be thrilled with having to 
waste one full word for only one bit.  The last resort solution would be to 
eliminate the use of TYPE_LANG_FLAG_6 and replace it with something else, but 
the user_align tweak seems clearly better to me. :-)

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [4.5 regression] C++ ignores some aligned attributes (Re: [PATCH] fix arm neon ICE by widening tree_type's precision field)
  2009-06-09 16:31             ` Nathan Froyd
                                 ` (2 preceding siblings ...)
  2009-06-09 17:30               ` Paolo Bonzini
@ 2010-07-28 19:01               ` Ulrich Weigand
  2010-07-28 19:40                 ` Richard Guenther
  2010-07-28 22:07                 ` [PATCH] [4.5 regression] C++ ignores some aligned attributes Ulrich Weigand
  3 siblings, 2 replies; 39+ messages in thread
From: Ulrich Weigand @ 2010-07-28 19:01 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Guenther, Steven Bosscher, dnovillo, froydnj

Nathan Froyd wrote:

> 2009-06-09  Nathan Froyd  <froydnj@codesourcery.com>
> 
> 	* tree.h (tree_base): Add packed_flag and user_align fields.
> 	Decrease size of spare field.
> 	(TYPE_USER_ALIGN): Use user_align from tree_base.
> 	(DECL_USER_ALIGN): Likewise.

It seems this broke attribute ((aligned)) handling for certain
cases in C++, which causes crashes in the Mozilla JavaScript
interpreter under some circumstances.

I've opened PR c++/45112 for this problem.  The bug occurs with
code like the following:

struct JSString
{
  unsigned char mLength;
  static JSString unitStringTable[];
};

JSString JSString::unitStringTable[] __attribute__ ((aligned (8))) = { 1 };

(extracted and simplified from Mozilla), where the aligned attribute
seems to be simply ignored from 4.5 on.

The C++ front-end sees two DECLs here, one for the declaration (with
DECL_USER_ALIGN cleared) and one for the definition (with DECL_USER_ALIGN
set).  These are supposed to be merged by cp-decl.c:duplicate_decls.

Before this patch, this would happen inside the following memcpy:

      memcpy ((char *) olddecl + sizeof (struct tree_common),
              (char *) newdecl + sizeof (struct tree_common),
              sizeof (struct tree_decl_common) - sizeof (struct tree_common));

which copied everything in tree_decl_common, *except* what is
in tree_common.

Now, the patch unfortunately moves the user_align (and packed_flag)
fields out of the area copied by this memcpy, and into tree_common.

It seems this means it now ought to be handled manually, but this
code is missing ...

I'll try to come up with a fix.  Any thoughts from C++ folks?

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [4.5 regression] C++ ignores some aligned attributes (Re: [PATCH]  fix arm neon ICE by widening tree_type's precision field)
  2010-07-28 19:01               ` [4.5 regression] C++ ignores some aligned attributes (Re: [PATCH] fix arm neon ICE by widening tree_type's precision field) Ulrich Weigand
@ 2010-07-28 19:40                 ` Richard Guenther
  2010-07-29 12:29                   ` [4.5 regression] C++ ignores some aligned attributes (Re: [PATCH] fix arm neon ICE by widening tree_type's precision field Ulrich Weigand
  2010-07-28 22:07                 ` [PATCH] [4.5 regression] C++ ignores some aligned attributes Ulrich Weigand
  1 sibling, 1 reply; 39+ messages in thread
From: Richard Guenther @ 2010-07-28 19:40 UTC (permalink / raw)
  To: Ulrich Weigand; +Cc: gcc-patches, Steven Bosscher, dnovillo, froydnj

On Wed, Jul 28, 2010 at 8:36 PM, Ulrich Weigand <uweigand@de.ibm.com> wrote:
> Nathan Froyd wrote:
>
>> 2009-06-09  Nathan Froyd  <froydnj@codesourcery.com>
>>
>>       * tree.h (tree_base): Add packed_flag and user_align fields.
>>       Decrease size of spare field.
>>       (TYPE_USER_ALIGN): Use user_align from tree_base.
>>       (DECL_USER_ALIGN): Likewise.
>
> It seems this broke attribute ((aligned)) handling for certain
> cases in C++, which causes crashes in the Mozilla JavaScript
> interpreter under some circumstances.
>
> I've opened PR c++/45112 for this problem.  The bug occurs with
> code like the following:
>
> struct JSString
> {
>  unsigned char mLength;
>  static JSString unitStringTable[];
> };
>
> JSString JSString::unitStringTable[] __attribute__ ((aligned (8))) = { 1 };
>
> (extracted and simplified from Mozilla), where the aligned attribute
> seems to be simply ignored from 4.5 on.
>
> The C++ front-end sees two DECLs here, one for the declaration (with
> DECL_USER_ALIGN cleared) and one for the definition (with DECL_USER_ALIGN
> set).  These are supposed to be merged by cp-decl.c:duplicate_decls.
>
> Before this patch, this would happen inside the following memcpy:
>
>      memcpy ((char *) olddecl + sizeof (struct tree_common),
>              (char *) newdecl + sizeof (struct tree_common),
>              sizeof (struct tree_decl_common) - sizeof (struct tree_common));
>
> which copied everything in tree_decl_common, *except* what is
> in tree_common.

Ick - that looks extremely fragile.  Can we make it less so?

Richard.

> Now, the patch unfortunately moves the user_align (and packed_flag)
> fields out of the area copied by this memcpy, and into tree_common.
>
> It seems this means it now ought to be handled manually, but this
> code is missing ...
>
> I'll try to come up with a fix.  Any thoughts from C++ folks?
>
> Bye,
> Ulrich
>
> --
>  Dr. Ulrich Weigand
>  GNU Toolchain for Linux on System z and Cell BE
>  Ulrich.Weigand@de.ibm.com
>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH] [4.5 regression] C++ ignores some aligned attributes
  2010-07-28 19:01               ` [4.5 regression] C++ ignores some aligned attributes (Re: [PATCH] fix arm neon ICE by widening tree_type's precision field) Ulrich Weigand
  2010-07-28 19:40                 ` Richard Guenther
@ 2010-07-28 22:07                 ` Ulrich Weigand
  2010-07-30 16:00                   ` Ulrich Weigand
                                     ` (2 more replies)
  1 sibling, 3 replies; 39+ messages in thread
From: Ulrich Weigand @ 2010-07-28 22:07 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Guenther, Steven Bosscher, dnovillo, froydnj

> I'll try to come up with a fix.  Any thoughts from C++ folks?

It seems this is not the first time this happened, given this comment:

  /* Init priority used to be merged from newdecl to olddecl by the memcpy,
     so keep this behavior.  */
  if (TREE_CODE (newdecl) == VAR_DECL && DECL_HAS_INIT_PRIORITY_P (newdecl))
    {
      SET_DECL_INIT_PRIORITY (olddecl, DECL_INIT_PRIORITY (newdecl));
      DECL_HAS_INIT_PRIORITY_P (olddecl) = 1;
    }

I've now tried adding similar merges of DECL_USER_ALIGN and DECL_PACKED
at this location, and this does indeed appear to fix the problem for me.

Tested on s390x-ibm-linux with no regressions, fixes the new test.

Does this look correct? If so, OK for mainline and 4.5?

Bye,
Ulrich


ChangeLog:

gcc/
	PR c++/45112
	* cp/decl.c (duplicate_decls): Merge DECL_USER_ALIGN and DECL_PACKED.

gcc/testsuite/
	PR c++/45112
	* testsuite/g++.dg/pr45112.C: New test.


Index: gcc/cp/decl.c
===================================================================
*** gcc/cp/decl.c	(revision 162649)
--- gcc/cp/decl.c	(working copy)
*************** duplicate_decls (tree newdecl, tree oldd
*** 2113,2118 ****
--- 2113,2122 ----
        SET_DECL_INIT_PRIORITY (olddecl, DECL_INIT_PRIORITY (newdecl));
        DECL_HAS_INIT_PRIORITY_P (olddecl) = 1;
      }
+   /* Likewise for DECL_USER_ALIGN and DECL_PACKED.  */
+   DECL_USER_ALIGN (olddecl) = DECL_USER_ALIGN (newdecl);
+   if (TREE_CODE (newdecl) == FIELD_DECL)
+     DECL_PACKED (olddecl) = DECL_PACKED (newdecl);
  
    /* The DECL_LANG_SPECIFIC information in OLDDECL will be replaced
       with that from NEWDECL below.  */
Index: gcc/testsuite/g++.dg/pr45112.C
===================================================================
*** gcc/testsuite/g++.dg/pr45112.C	(revision 0)
--- gcc/testsuite/g++.dg/pr45112.C	(revision 0)
***************
*** 0 ****
--- 1,12 ----
+ /* { dg-do compile } */
+ 
+ struct JSString
+ {
+   unsigned char mLength;
+   static JSString unitStringTable[];
+ };
+ 
+ JSString JSString::unitStringTable[] __attribute__ ((aligned (8))) = { 1 };
+ 
+ int bug [__alignof__ (JSString::unitStringTable) >= 8 ? 1 : -1];
+ 

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [4.5 regression] C++ ignores some aligned attributes (Re: [PATCH]   fix arm neon ICE by widening tree_type's precision field
  2010-07-28 19:40                 ` Richard Guenther
@ 2010-07-29 12:29                   ` Ulrich Weigand
  0 siblings, 0 replies; 39+ messages in thread
From: Ulrich Weigand @ 2010-07-29 12:29 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches, Steven Bosscher, dnovillo, froydnj

Richard Guenther wrote:
> On Wed, Jul 28, 2010 at 8:36 PM, Ulrich Weigand <uweigand@de.ibm.com> wrote:
> > Before this patch, this would happen inside the following memcpy:
> >
> > memcpy ((char *) olddecl + sizeof (struct tree_common),
> >         (char *) newdecl + sizeof (struct tree_common),
> >         sizeof (struct tree_decl_common) - sizeof (struct tree_common));
> >
> > which copied everything in tree_decl_common, *except* what is
> > in tree_common.
> 
> Ick - that looks extremely fragile.  Can we make it less so?

I guess, but it seems this is rather a question to the C++ front-end
maintainers ...   I'm not sure I fully understand the detailed logic
behind the decl merging process.

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] [4.5 regression] C++ ignores some aligned attributes
  2010-07-28 22:07                 ` [PATCH] [4.5 regression] C++ ignores some aligned attributes Ulrich Weigand
@ 2010-07-30 16:00                   ` Ulrich Weigand
  2010-07-30 16:22                     ` Richard Guenther
  2010-07-31 17:45                   ` Eric Botcazou
  2010-08-04 14:03                   ` Paul Brook
  2 siblings, 1 reply; 39+ messages in thread
From: Ulrich Weigand @ 2010-07-30 16:00 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Guenther, Steven Bosscher, dnovillo, froydnj

> ChangeLog:
> 
> gcc/
> 	PR c++/45112
> 	* cp/decl.c (duplicate_decls): Merge DECL_USER_ALIGN and DECL_PACKED.
> 
> gcc/testsuite/
> 	PR c++/45112
> 	* testsuite/g++.dg/pr45112.C: New test.

I've checked this in to mainline now after an off-line approval
by Mark Mitchell.

Should I check this in to 4.5 as well at this point?  The status
says "frozen for release preparation" ...

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] [4.5 regression] C++ ignores some aligned attributes
  2010-07-30 16:22                     ` Richard Guenther
@ 2010-07-30 16:22                       ` Ulrich Weigand
  0 siblings, 0 replies; 39+ messages in thread
From: Ulrich Weigand @ 2010-07-30 16:22 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches, Steven Bosscher, dnovillo, froydnj

Richard Guenter wrote:
> On Fri, Jul 30, 2010 at 5:56 PM, Ulrich Weigand <uweigand@de.ibm.com> wrote:
> > Should I check this in to 4.5 as well at this point? =A0The status
> > says "frozen for release preparation" ...
> 
> Please wait until after the release.

Will do, thanks.  I've updated the bugzilla target milestone accordingly.

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] [4.5 regression] C++ ignores some aligned attributes
  2010-07-30 16:00                   ` Ulrich Weigand
@ 2010-07-30 16:22                     ` Richard Guenther
  2010-07-30 16:22                       ` Ulrich Weigand
  0 siblings, 1 reply; 39+ messages in thread
From: Richard Guenther @ 2010-07-30 16:22 UTC (permalink / raw)
  To: Ulrich Weigand; +Cc: gcc-patches, Steven Bosscher, dnovillo, froydnj

On Fri, Jul 30, 2010 at 5:56 PM, Ulrich Weigand <uweigand@de.ibm.com> wrote:
>> ChangeLog:
>>
>> gcc/
>>       PR c++/45112
>>       * cp/decl.c (duplicate_decls): Merge DECL_USER_ALIGN and DECL_PACKED.
>>
>> gcc/testsuite/
>>       PR c++/45112
>>       * testsuite/g++.dg/pr45112.C: New test.
>
> I've checked this in to mainline now after an off-line approval
> by Mark Mitchell.
>
> Should I check this in to 4.5 as well at this point?  The status
> says "frozen for release preparation" ...

Please wait until after the release.

Thanks,
Richard.

> Bye,
> Ulrich
>
> --
>  Dr. Ulrich Weigand
>  GNU Toolchain for Linux on System z and Cell BE
>  Ulrich.Weigand@de.ibm.com
>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] [4.5 regression] C++ ignores some aligned attributes
  2010-07-28 22:07                 ` [PATCH] [4.5 regression] C++ ignores some aligned attributes Ulrich Weigand
  2010-07-30 16:00                   ` Ulrich Weigand
@ 2010-07-31 17:45                   ` Eric Botcazou
  2010-07-31 19:38                     ` Ulrich Weigand
  2010-08-04 14:03                   ` Paul Brook
  2 siblings, 1 reply; 39+ messages in thread
From: Eric Botcazou @ 2010-07-31 17:45 UTC (permalink / raw)
  To: Ulrich Weigand
  Cc: gcc-patches, Richard Guenther, Steven Bosscher, dnovillo, froydnj

> ChangeLog:
>
> gcc/
> 	PR c++/45112
> 	* cp/decl.c (duplicate_decls): Merge DECL_USER_ALIGN and DECL_PACKED.

This should go in cp/ChangeLog without the cp/ though.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] [4.5 regression] C++ ignores some aligned attributes
  2010-07-31 17:45                   ` Eric Botcazou
@ 2010-07-31 19:38                     ` Ulrich Weigand
  0 siblings, 0 replies; 39+ messages in thread
From: Ulrich Weigand @ 2010-07-31 19:38 UTC (permalink / raw)
  To: Eric Botcazou
  Cc: gcc-patches, Richard Guenther, Steven Bosscher, dnovillo, froydnj

Eric Botcazou wrote:
> > gcc/
> > 	PR c++/45112
> > 	* cp/decl.c (duplicate_decls): Merge DECL_USER_ALIGN and DECL_PACKED.
> 
> This should go in cp/ChangeLog without the cp/ though.

D'oh!  Thanks for pointing this out.  Fixed now.

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] [4.5 regression] C++ ignores some aligned attributes
  2010-07-28 22:07                 ` [PATCH] [4.5 regression] C++ ignores some aligned attributes Ulrich Weigand
  2010-07-30 16:00                   ` Ulrich Weigand
  2010-07-31 17:45                   ` Eric Botcazou
@ 2010-08-04 14:03                   ` Paul Brook
  2010-08-04 14:19                     ` Ulrich Weigand
  2 siblings, 1 reply; 39+ messages in thread
From: Paul Brook @ 2010-08-04 14:03 UTC (permalink / raw)
  To: gcc-patches
  Cc: Ulrich Weigand, Richard Guenther, Steven Bosscher, dnovillo,
	froydnj, Mark Mitchell

> > I'll try to come up with a fix.  Any thoughts from C++ folks?
> 
> It seems this is not the first time this happened, given this comment:
> 
>   /* Init priority used to be merged from newdecl to olddecl by the memcpy,
>      so keep this behavior.  */
>   if (TREE_CODE (newdecl) == VAR_DECL && DECL_HAS_INIT_PRIORITY_P
> (newdecl)) {
>       SET_DECL_INIT_PRIORITY (olddecl, DECL_INIT_PRIORITY (newdecl));
>       DECL_HAS_INIT_PRIORITY_P (olddecl) = 1;
>     }
> 
> I've now tried adding similar merges of DECL_USER_ALIGN and DECL_PACKED
> at this location, and this does indeed appear to fix the problem for me.
> 
> Tested on s390x-ibm-linux with no regressions, fixes the new test.
> 
> Does this look correct? If so, OK for mainline and 4.5?

I'm not so sure. Consider the following example:

struct JSString
{
  unsigned char mLength;
  static  JSString unitStringTable[];
};

int var1 =__alignof__ (JSString::unitStringTable);

JSString JSString::unitStringTable[] __attribute__ ((aligned (8))) = { 1 };
int var2 =__alignof__ (JSString::unitStringTable);

int main()
{
  assert(var1 == var2);
}

Note that var1 may be in a separate file, and never see the additional 
attribute.

Should we be rejecting this mismatch, rather than papering over what looks 
like user error?  If the packed attribute can reduce alignment then this is 
going to cause havoc on STRICT_ALIGN targets.

Paul

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] [4.5 regression] C++ ignores some aligned attributes
  2010-08-04 14:03                   ` Paul Brook
@ 2010-08-04 14:19                     ` Ulrich Weigand
  2010-08-04 14:27                       ` Mark Mitchell
                                         ` (2 more replies)
  0 siblings, 3 replies; 39+ messages in thread
From: Ulrich Weigand @ 2010-08-04 14:19 UTC (permalink / raw)
  To: Paul Brook
  Cc: gcc-patches, Richard Guenther, Steven Bosscher, dnovillo,
	froydnj, Mark Mitchell

Paul Brook wrote:

> I'm not so sure. Consider the following example:
> 
> struct JSString
> {
>   unsigned char mLength;
>   static  JSString unitStringTable[];
> };
> 
> int var1 =__alignof__ (JSString::unitStringTable);
> 
> JSString JSString::unitStringTable[] __attribute__ ((aligned (8))) = { 1 };
> int var2 =__alignof__ (JSString::unitStringTable);
> 
> int main()
> {
>   assert(var1 == var2);
> }
> 
> Note that var1 may be in a separate file, and never see the additional 
> attribute.
> 
> Should we be rejecting this mismatch, rather than papering over what looks 
> like user error?  If the packed attribute can reduce alignment then this is 
> going to cause havoc on STRICT_ALIGN targets.

Well, the only thing my patch does is to revert behavior back to what it was
in 4.4 and earlier, after having been (clearly inadvertently) changed by an
unrelated patch.  At least Firefox relies on that behavior; Firefox builds
would certainly break if we add the error you suggest ...

That said, I have no strong opinion on what the behavior *should* be in
the case of differing aligned attributes between declaration and definition.
(However, I would suggest to keep the behavior the same between C and C++
whereever that makes sense.)

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] [4.5 regression] C++ ignores some aligned attributes
  2010-08-04 14:19                     ` Ulrich Weigand
@ 2010-08-04 14:27                       ` Mark Mitchell
  2010-08-04 15:04                       ` Paul Brook
  2010-08-04 15:33                       ` Martin Sebor
  2 siblings, 0 replies; 39+ messages in thread
From: Mark Mitchell @ 2010-08-04 14:27 UTC (permalink / raw)
  To: Ulrich Weigand
  Cc: Paul Brook, gcc-patches, Richard Guenther, Steven Bosscher,
	dnovillo, froydnj

Ulrich Weigand wrote:

> That said, I have no strong opinion on what the behavior *should* be in
> the case of differing aligned attributes between declaration and definition.
> (However, I would suggest to keep the behavior the same between C and C++
> whereever that makes sense.)

If this were part of the standard, I'd argue it should be one of these
things that is an error but does not require a diagnostic.  Leaving out
the "packed" attribute prior in one file is like declaring a variable
with "int" in one place and "char" in another.  A linker that cares to
check and issue an error is welcome to do so, but that's not a requirement.

-- 
Mark Mitchell
CodeSourcery
mark@codesourcery.com
(650) 331-3385 x713

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] [4.5 regression] C++ ignores some aligned attributes
  2010-08-04 14:19                     ` Ulrich Weigand
  2010-08-04 14:27                       ` Mark Mitchell
@ 2010-08-04 15:04                       ` Paul Brook
  2010-08-04 16:42                         ` Ulrich Weigand
  2010-08-04 15:33                       ` Martin Sebor
  2 siblings, 1 reply; 39+ messages in thread
From: Paul Brook @ 2010-08-04 15:04 UTC (permalink / raw)
  To: Ulrich Weigand
  Cc: gcc-patches, Richard Guenther, Steven Bosscher, dnovillo,
	froydnj, Mark Mitchell

> > Should we be rejecting this mismatch, rather than papering over what
> > looks like user error?  If the packed attribute can reduce alignment
> > then this is going to cause havoc on STRICT_ALIGN targets.
> 
> Well, the only thing my patch does is to revert behavior back to what it
> was in 4.4 and earlier, after having been (clearly inadvertently) changed
> by an unrelated patch.  At least Firefox relies on that behavior; Firefox
> builds would certainly break if we add the error you suggest ...

I'd argue that this is a feature.  Code already broke (PR45112) once due to 
mismatched alignment assumptions, so who knows what other bugs are lurking 
behind this mismatch.  Better to fail at compile time than runtime.
It seems rather inconsistent to require any particular behavior in the 
presence of this inconsistency, but not provide a diagnostic for all the other 
closely related cases that we know are still going to break. If nothing else 
this seems like a missed optimization opportunity.

> That said, I have no strong opinion on what the behavior should be in
> the case of differing aligned attributes between declaration and
> definition. (However, I would suggest to keep the behavior the same
> between C and C++ whereever that makes sense.)

Fair point.

Paul

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] [4.5 regression] C++ ignores some aligned attributes
  2010-08-04 14:19                     ` Ulrich Weigand
  2010-08-04 14:27                       ` Mark Mitchell
  2010-08-04 15:04                       ` Paul Brook
@ 2010-08-04 15:33                       ` Martin Sebor
  2010-08-04 15:47                         ` Mark Mitchell
  2 siblings, 1 reply; 39+ messages in thread
From: Martin Sebor @ 2010-08-04 15:33 UTC (permalink / raw)
  To: Ulrich Weigand
  Cc: Paul Brook, gcc-patches, Richard Guenther, Steven Bosscher,
	dnovillo, froydnj, Mark Mitchell

On 08/04/2010 08:19 AM, Ulrich Weigand wrote:
> Paul Brook wrote:
>
>> I'm not so sure. Consider the following example:
>>
>> struct JSString
>> {
>>    unsigned char mLength;
>>    static  JSString unitStringTable[];
>> };
>>
>> int var1 =__alignof__ (JSString::unitStringTable);
>>
>> JSString JSString::unitStringTable[] __attribute__ ((aligned (8))) = { 1 };
>> int var2 =__alignof__ (JSString::unitStringTable);
>>
>> int main()
>> {
>>    assert(var1 == var2);
>> }
>>
>> Note that var1 may be in a separate file, and never see the additional
>> attribute.
>>
>> Should we be rejecting this mismatch, rather than papering over what looks
>> like user error?  If the packed attribute can reduce alignment then this is
>> going to cause havoc on STRICT_ALIGN targets.
>
> Well, the only thing my patch does is to revert behavior back to what it was
> in 4.4 and earlier, after having been (clearly inadvertently) changed by an
> unrelated patch.  At least Firefox relies on that behavior; Firefox builds
> would certainly break if we add the error you suggest ...
>
> That said, I have no strong opinion on what the behavior *should* be in
> the case of differing aligned attributes between declaration and definition.
> (However, I would suggest to keep the behavior the same between C and C++
> whereever that makes sense.)

In C++ 0x, the (presumably equivalent) alignof operator had better
yield the same result across all translation units to preserve the
ODR. Consider:

struct S { static int i; };

template <int> struct X { };
template <> struct X<alignof S::i> { enum { e }; };

int S::i [[align(sizeof (int) * 2)]];

int main() {
     return X<alignof S::i>::e;
}

Martin

>
> Bye,
> Ulrich
>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] [4.5 regression] C++ ignores some aligned attributes
  2010-08-04 15:33                       ` Martin Sebor
@ 2010-08-04 15:47                         ` Mark Mitchell
  2010-08-05  9:02                           ` Martin Sebor
  0 siblings, 1 reply; 39+ messages in thread
From: Mark Mitchell @ 2010-08-04 15:47 UTC (permalink / raw)
  To: Martin Sebor
  Cc: Ulrich Weigand, Paul Brook, gcc-patches, Richard Guenther,
	Steven Bosscher, dnovillo, froydnj

Martin Sebor wrote:

> In C++ 0x, the (presumably equivalent) alignof operator had better
> yield the same result across all translation units to preserve the
> ODR.

Indeed -- but that's independent of whether or not a diagnostic is
required when you add the alignment attribute on something other than
the first declaration within a given translation unit.

-- 
Mark Mitchell
CodeSourcery
mark@codesourcery.com
(650) 331-3385 x713

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] [4.5 regression] C++ ignores some aligned attributes
  2010-08-04 15:04                       ` Paul Brook
@ 2010-08-04 16:42                         ` Ulrich Weigand
  0 siblings, 0 replies; 39+ messages in thread
From: Ulrich Weigand @ 2010-08-04 16:42 UTC (permalink / raw)
  To: Paul Brook
  Cc: gcc-patches, Richard Guenther, Steven Bosscher, dnovillo,
	froydnj, Mark Mitchell

Paul Brook wrote:
> > Well, the only thing my patch does is to revert behavior back to what it
> > was in 4.4 and earlier, after having been (clearly inadvertently) changed
> > by an unrelated patch.  At least Firefox relies on that behavior; Firefox
> > builds would certainly break if we add the error you suggest ...
> 
> I'd argue that this is a feature.  Code already broke (PR45112) once due to 
> mismatched alignment assumptions, so who knows what other bugs are lurking 
> behind this mismatch.  Better to fail at compile time than runtime.

With this patch to fix PR45112 (which restores prior behavior), there *is*
no bug lurking in Firefox; the Firefox code seems perfectly fine to me,
and I see no reason to deliberately break Firefox builds here.

PR45112 did not break because of mismatched alignment attributes; rather
it broke because the compiler *ignored* the (one) alignment attribute on the
definition, due to an unintended effect of the patch moving bits around.

There is no code in Firefox that relies on having an alignment attribute
visible on the declaration.  There certainly is the (implicit) assumption
that all JSVAL pointers must be 8 byte aligned, but this is not really
up to the compiler to enforce (most such pointers come from malloc anyway);
it is a requirement upon users of that interface.  (Which they tried to
satisfy using an aligned attibute -- which was then ignored due to the
GCC bug.)

> It seems rather inconsistent to require any particular behavior in the 
> presence of this inconsistency, but not provide a diagnostic for all the other 
> closely related cases that we know are still going to break. If nothing else 
> this seems like a missed optimization opportunity.

I'm not sure I agree that there is an inconsistency.  In general, in C
(or C++), declarations need not necessarily agree on each property they
specify; for example, you can have a declaration with "static" and one
without "static" for the same function.  These properties are then merged.
I do not see why aligned attributes should necessarily be different here;
introducing this as requirement seems to be a change with the potential to
break existing code bases that otherwise wouldn't have any problems.

Now, this property does of course give rise to the possibility that it
can be used in incorrect ways that introduce bugs.  But I don't agree
that the Firefox usage is of that type.  It simply requests that the
variable is actually allocated using stricter alignment requirements
than would otherwise be the case.  (Note that you could have the same
effect by specifying that extra alignment in a linker script!)


I guess I could see an argument to be made for a diagnostic if the alignment
at the definition is *less* strict than the one at some other declaration,
and we know about it at the point.  This diagnostic in fact would not break
Firefox either, because that's not the case there.

However, as Mark mentioned, that diagnostic wouldn't be a sure way to
recognize all instances, because that other declaration might not
even be visible at the site of the definition ...


[ In either case, that seems a completely separate argument from whether
the original patch in this thread ought to be applied -- which it of
course already is.  In my mind that patch fixes a wrong-code regression
by restoring prior behaviour.  If we decide that additional diagnostics
are useful in certain cases, that would be an enhancement on top.  ]

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] [4.5 regression] C++ ignores some aligned attributes
  2010-08-04 15:47                         ` Mark Mitchell
@ 2010-08-05  9:02                           ` Martin Sebor
  0 siblings, 0 replies; 39+ messages in thread
From: Martin Sebor @ 2010-08-05  9:02 UTC (permalink / raw)
  To: Mark Mitchell
  Cc: Ulrich Weigand, Paul Brook, gcc-patches, Richard Guenther,
	Steven Bosscher, dnovillo, froydnj

On 08/04/2010 09:47 AM, Mark Mitchell wrote:
> Martin Sebor wrote:
>
>> In C++ 0x, the (presumably equivalent) alignof operator had better
>> yield the same result across all translation units to preserve the
>> ODR.
>
> Indeed -- but that's independent of whether or not a diagnostic is
> required when you add the alignment attribute on something other than
> the first declaration within a given translation unit.
>

In C++ 0x it's valid to specify the align attribute on a definition
an object after a declaration without the attribute has been seen.
The attribute may also be applied to classes and enums, but not to
fundamental types.

The attribute apparently has no effect on the result of the alignof
operator: it always yields the alignment of the type of its argument
even if the argument is an overaligned object. This makes it easier
for an implementation to preserve the ODR in cases like the one
being discussed. IIUC, that doesn't appear to be so for __alignof__.

Martin

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2010-08-05  9:02 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-06-08 15:32 [PATCH] fix arm neon ICE by widening tree_type's precision field Nathan Froyd
2009-06-08 17:11 ` Richard Guenther
2009-06-08 18:04 ` Paolo Bonzini
2009-06-08 20:28   ` Nathan Froyd
2009-06-08 20:32     ` Steven Bosscher
2009-06-09  6:36     ` Paolo Bonzini
2009-06-09 14:54       ` Nathan Froyd
2009-06-09 15:00         ` Richard Guenther
2009-06-09 15:07           ` Richard Guenther
2009-06-09 15:44           ` Steven Bosscher
2009-06-10  2:50           ` Eric Botcazou
2009-06-09 15:40         ` Steven Bosscher
2009-06-09 15:53           ` Richard Guenther
2009-06-09 16:31             ` Nathan Froyd
2009-06-09 16:34               ` Richard Guenther
2009-06-09 16:36               ` Diego Novillo
2009-06-09 17:30               ` Paolo Bonzini
2010-07-28 19:01               ` [4.5 regression] C++ ignores some aligned attributes (Re: [PATCH] fix arm neon ICE by widening tree_type's precision field) Ulrich Weigand
2010-07-28 19:40                 ` Richard Guenther
2010-07-29 12:29                   ` [4.5 regression] C++ ignores some aligned attributes (Re: [PATCH] fix arm neon ICE by widening tree_type's precision field Ulrich Weigand
2010-07-28 22:07                 ` [PATCH] [4.5 regression] C++ ignores some aligned attributes Ulrich Weigand
2010-07-30 16:00                   ` Ulrich Weigand
2010-07-30 16:22                     ` Richard Guenther
2010-07-30 16:22                       ` Ulrich Weigand
2010-07-31 17:45                   ` Eric Botcazou
2010-07-31 19:38                     ` Ulrich Weigand
2010-08-04 14:03                   ` Paul Brook
2010-08-04 14:19                     ` Ulrich Weigand
2010-08-04 14:27                       ` Mark Mitchell
2010-08-04 15:04                       ` Paul Brook
2010-08-04 16:42                         ` Ulrich Weigand
2010-08-04 15:33                       ` Martin Sebor
2010-08-04 15:47                         ` Mark Mitchell
2010-08-05  9:02                           ` Martin Sebor
2009-06-08 19:40 ` [PATCH] fix arm neon ICE by widening tree_type's precision field Joseph S. Myers
2009-06-08 19:52   ` Daniel Jacobowitz
2009-06-08 20:12     ` Andrew Pinski
2009-06-08 20:20     ` Jakub Jelinek
2009-06-08 20:33       ` Daniel Jacobowitz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).