public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for variables on  the stack for greater than required alignment
@ 2007-01-30  2:42 Andrew_Pinski
  2007-01-31 15:20 ` Meissner, Michael
                   ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Andrew_Pinski @ 2007-01-30  2:42 UTC (permalink / raw)
  To: gcc-patches; +Cc: Trevor_Smigiel, Russell_Olsen

[-- Attachment #1: Type: text/plain, Size: 2204 bytes --]

Hi,
  This patch fixes PR 16660 where we ignore the user supplied allignment 
on stack variables if it is greater than the required alignment on the 
specific target.  This fixes the issue by oversizing the stack variable 
and then manually aligning the variable.

I had to remove some asserts from the x86 back-end as they are no longer 
valid from the point of view of the middle-end could have a case where 
preferred_alignment is greater than the target's preferred alignment and 
the stack alignment that is needed can be greater than the target's 
preferred alignment.


OK? Bootstrapped and tested on i686-linux-gnu and powerpc-linux-gnu with 
no regressions.

Thanks,
Andrew Pinski

:ADDPATCH middle-end:

ChangeLog:

        * expr.h (allocate_dynamic_stack_space_1): New function prototype.
        * functionc.c (assign_temp): Take into account the alignment
        of the temp if it is greater than the target's preferred 
alignment.
        * cfgexpand.c: Include optabs.h.
        (get_decl_align_unit): Update comment and don't lower the 
alignment
        if it is greater than the target's preferred alignment.
        (alloc_stack_frame_space): Take an unsigned HOST_WIDE_INT for 
align.
        Take into account the variable's alignment if it is greater than
        the target's preferred alignment.
        (expand_one_stack_var_at): Likewise.
        (defer_stack_allocation): Defer variables who's alignment is 
greater
        than the target's preferred alignment.
        * explow.c (allocate_dynamic_stack_space_1): Split out from
        allocate_dynamic_stack_space and take into acount the required
        alignment.
        (allocate_dynamic_stack_space): Call 
allocate_dynamic_stack_space_1.
        * Makefile.in (cfgexpand.o): Update dependecies.
        * stmt.c (expand_decl):  Take into account the alignment
        of the variable if it is greater than the target's preferred
        alignment.
        * config/i386/i386.c (ix86_compute_frame_layout): Don't assert
        that preferred alignment is greater than the normal preferred
        alignment.  Don't assert that the stack alignment needed is 
greater
        than the normal preferred alignment. 



[-- Attachment #2: attributealignedstack.diff.txt --]
[-- Type: text/plain, Size: 12640 bytes --]

Index: testsuite/gcc.c-torture/execute/pr16660-1.c
===================================================================
--- testsuite/gcc.c-torture/execute/pr16660-1.c	(revision 0)
+++ testsuite/gcc.c-torture/execute/pr16660-1.c	(revision 0)
@@ -0,0 +1,11 @@
+typedef __SIZE_TYPE__ size_t;
+#define ALIGNMENT 256
+int main(void)
+{
+  int a[sizeof(int)/ALIGNMENT] __attribute__((aligned(ALIGNMENT)));
+  size_t b = (size_t)&a;
+  if (b&(ALIGNMENT-1))
+    return 1;
+  return 0;
+}
+
Index: expr.h
===================================================================
--- expr.h	(revision 121163)
+++ expr.h	(working copy)
@@ -707,6 +707,7 @@ extern void update_nonlocal_goto_save_ar
 /* Allocate some space on the stack dynamically and return its address.  An rtx
    says how many bytes.  */
 extern rtx allocate_dynamic_stack_space (rtx, rtx, int);
+extern rtx allocate_dynamic_stack_space_1 (rtx, rtx, int, int);
 
 /* Probe a range of stack addresses from FIRST to FIRST+SIZE, inclusive.
    FIRST is a constant and size is a Pmode RTX.  These are offsets from the
Index: function.c
===================================================================
--- function.c	(revision 121163)
+++ function.c	(working copy)
@@ -812,6 +812,7 @@ assign_temp (tree type_or_decl, int keep
   if (mode == BLKmode || memory_required)
     {
       HOST_WIDE_INT size = int_size_in_bytes (type);
+      unsigned HOST_WIDE_INT align;
       rtx tmp;
 
       /* Zero sized arrays are GNU C extension.  Set size to 1 to avoid
@@ -836,7 +837,37 @@ assign_temp (tree type_or_decl, int keep
 	  size = 1;
 	}
 
-      tmp = assign_stack_temp_for_type (mode, size, keep, type);
+      align = TYPE_ALIGN (type);
+      if (decl && DECL_ALIGN (decl) > align)
+	align = DECL_ALIGN (decl);
+
+      /* When the user specifies an alignment larger than
+         BIGGEST_ALIGNMENT we need to allocate extra space so we can
+         adjust this decl's address.  */
+      if (align > PREFERRED_STACK_BOUNDARY)
+	{
+	  rtx addr;
+	  HOST_WIDE_INT pad = (align - PREFERRED_STACK_BOUNDARY) / BITS_PER_UNIT;
+	  size += pad;
+	  tmp = assign_stack_temp_for_type (mode, size, keep, type);
+	  addr = XEXP (tmp, 0);
+	  /* CEIL_DIV_EXPR needs to worry about the addition overflowing,
+	     but we know it can't.  So add ourselves and then do
+	     TRUNC_DIV_EXPR.  */
+	  addr = expand_binop (Pmode, add_optab, addr,
+			       GEN_INT (align / BITS_PER_UNIT - 1),
+			       NULL_RTX, 1, OPTAB_LIB_WIDEN);
+	  addr = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, addr,
+			        GEN_INT (align / BITS_PER_UNIT),
+			        NULL_RTX, 1);
+	  addr = expand_mult (Pmode, addr,
+			      GEN_INT (align / BITS_PER_UNIT),
+			      NULL_RTX, 1);
+	  tmp = change_address (tmp, VOIDmode, addr);
+	}
+      else
+	tmp = assign_stack_temp_for_type (mode, size, keep, type);
+
       return tmp;
     }
 
Index: cfgexpand.c
===================================================================
--- cfgexpand.c	(revision 121163)
+++ cfgexpand.c	(working copy)
@@ -41,6 +41,7 @@ Boston, MA 02110-1301, USA.  */
 #include "params.h"
 #include "tree-inline.h"
 #include "value-prof.h"
+#include "optabs.h"
 
 /* Verify that there is exactly single jump instruction since last and attach
    REG_BR_PROB note specifying probability.
@@ -151,8 +152,7 @@ static bool has_protected_decls;
    smaller than our cutoff threshold.  Used for -Wstack-protector.  */
 static bool has_short_buffer;
 
-/* Discover the byte alignment to use for DECL.  Ignore alignment
-   we can't do with expected alignment of the stack boundary.  */
+/* Discover the byte alignment to use for DECL.  */
 
 static unsigned int
 get_decl_align_unit (tree decl)
@@ -161,8 +161,6 @@ get_decl_align_unit (tree decl)
 
   align = DECL_ALIGN (decl);
   align = LOCAL_ALIGNMENT (TREE_TYPE (decl), align);
-  if (align > PREFERRED_STACK_BOUNDARY)
-    align = PREFERRED_STACK_BOUNDARY;
   if (cfun->stack_alignment_needed < align)
     cfun->stack_alignment_needed = align;
 
@@ -173,11 +171,21 @@ get_decl_align_unit (tree decl)
    Return the frame offset.  */
 
 static HOST_WIDE_INT
-alloc_stack_frame_space (HOST_WIDE_INT size, HOST_WIDE_INT align)
+alloc_stack_frame_space (HOST_WIDE_INT size, unsigned HOST_WIDE_INT align)
 {
   HOST_WIDE_INT offset, new_frame_offset;
 
   new_frame_offset = frame_offset;
+
+  /* Allocate extra space if the alignment required is greater than the
+     stack boundary and then assume the RTL expansion of the variable, gets
+     the correct alignment.  */
+  if (align > PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT)
+    {
+      size += align;
+      align = PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT;
+    }
+
   if (FRAME_GROWS_DOWNWARD)
     {
       new_frame_offset -= size + frame_phase;
@@ -523,23 +531,31 @@ dump_stack_var_partition (void)
 static void
 expand_one_stack_var_at (tree decl, HOST_WIDE_INT offset)
 {
-  HOST_WIDE_INT align;
+  unsigned HOST_WIDE_INT align;
   rtx x;
 
   /* If this fails, we've overflowed the stack frame.  Error nicely?  */
   gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
 
   x = plus_constant (virtual_stack_vars_rtx, offset);
-  x = gen_rtx_MEM (DECL_MODE (decl), x);
+  align = get_decl_align_unit (decl);
 
-  /* Set alignment we actually gave this decl.  */
-  offset -= frame_phase;
-  align = offset & -offset;
-  align *= BITS_PER_UNIT;
-  if (align > STACK_BOUNDARY || align == 0)
-    align = STACK_BOUNDARY;
-  DECL_ALIGN (decl) = align;
-  DECL_USER_ALIGN (decl) = 0;
+  if (align > PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT)
+    {
+      rtx addr = x;
+      addr = expand_binop (Pmode, add_optab, addr,
+			   GEN_INT (align - 1),
+			   NULL_RTX, 1, OPTAB_LIB_WIDEN);
+      addr = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, addr,
+			    GEN_INT (align),
+			    NULL_RTX, 1);
+      addr = expand_mult (Pmode, addr,
+			  GEN_INT (align),
+			  NULL_RTX, 1);
+      x = addr;
+    }
+
+  x = gen_rtx_MEM (DECL_MODE (decl), x);
 
   set_mem_attributes (x, decl, true);
   SET_DECL_RTL (decl, x);
@@ -740,6 +756,11 @@ defer_stack_allocation (tree var, bool t
   if (optimize == 0 && tree_low_cst (DECL_SIZE_UNIT (var), 1) < 32)
     return false;
 
+  /* When the variable needs additional code to be aligned on the
+     stack we allocate separately just for ease of implementation.  */
+  if (get_decl_align_unit (var) > PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT)
+    return false;
+
   return true;
 }
 
Index: explow.c
===================================================================
--- explow.c	(revision 121163)
+++ explow.c	(working copy)
@@ -1090,16 +1090,28 @@ update_nonlocal_goto_save_area (void)
    TARGET is a place in which the address can be placed.
 
    KNOWN_ALIGN is the alignment (in bits) that we know SIZE has.  */
-
 rtx
 allocate_dynamic_stack_space (rtx size, rtx target, int known_align)
 {
+  return allocate_dynamic_stack_space_1 (size, target, known_align,
+				       BIGGEST_ALIGNMENT);
+}
+
+/* Like allocate_dynamic_stack_space expect provide a way, REQUIRED_ALGIN,
+   to say the required alignment of the address.  */
+rtx
+allocate_dynamic_stack_space_1 (rtx size, rtx target, int known_align,
+				int required_align)
+{
   /* If we're asking for zero bytes, it doesn't matter what we point
      to since we can't dereference it.  But return a reasonable
      address anyway.  */
   if (size == const0_rtx)
     return virtual_stack_dynamic_rtx;
 
+  if (required_align < BIGGEST_ALIGNMENT)
+    required_align = BIGGEST_ALIGNMENT;
+
   /* Otherwise, show we're calling alloca or equivalent.  */
   current_function_calls_alloca = 1;
 
@@ -1128,13 +1140,13 @@ allocate_dynamic_stack_space (rtx size, 
 #if defined (STACK_DYNAMIC_OFFSET) || defined (STACK_POINTER_OFFSET)
 #define MUST_ALIGN 1
 #else
-#define MUST_ALIGN (PREFERRED_STACK_BOUNDARY < BIGGEST_ALIGNMENT)
+#define MUST_ALIGN (PREFERRED_STACK_BOUNDARY < required_align)
 #endif
 
   if (MUST_ALIGN)
     size
       = force_operand (plus_constant (size,
-				      BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1),
+				      required_align / BITS_PER_UNIT - 1),
 		       NULL_RTX);
 
 #ifdef SETJMP_VIA_SAVE_AREA
@@ -1294,13 +1306,13 @@ allocate_dynamic_stack_space (rtx size, 
 	 but we know it can't.  So add ourselves and then do
 	 TRUNC_DIV_EXPR.  */
       target = expand_binop (Pmode, add_optab, target,
-			     GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1),
+			     GEN_INT (required_align / BITS_PER_UNIT - 1),
 			     NULL_RTX, 1, OPTAB_LIB_WIDEN);
       target = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, target,
-			      GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT),
+			      GEN_INT (required_align / BITS_PER_UNIT),
 			      NULL_RTX, 1);
       target = expand_mult (Pmode, target,
-			    GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT),
+			    GEN_INT (required_align / BITS_PER_UNIT),
 			    NULL_RTX, 1);
     }
 
Index: Makefile.in
===================================================================
--- Makefile.in	(revision 121163)
+++ Makefile.in	(working copy)
@@ -2565,7 +2565,7 @@ cfgexpand.o : cfgexpand.c $(TREE_FLOW_H)
    $(RTL_H) $(TREE_H) $(TM_P_H) $(EXPR_H) $(FUNCTION_H) $(TIMEVAR_H) $(TM_H) \
    coretypes.h $(TREE_DUMP_H) except.h langhooks.h tree-pass.h $(RTL_H) \
    $(DIAGNOSTIC_H) toplev.h $(BASIC_BLOCK_H) $(FLAGS_H) debug.h $(PARAMS_H) \
-   value-prof.h
+   value-prof.h $(OPTABS_H)
 cfgrtl.o : cfgrtl.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
    $(FLAGS_H) insn-config.h $(BASIC_BLOCK_H) $(REGS_H) hard-reg-set.h \
    output.h toplev.h $(FUNCTION_H) except.h $(TM_P_H) insn-config.h $(EXPR_H) \
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 121163)
+++ config/i386/i386.c	(working copy)
@@ -5320,9 +5320,6 @@ ix86_compute_frame_layout (struct ix86_f
 
   gcc_assert (!size || stack_alignment_needed);
   gcc_assert (preferred_alignment >= STACK_BOUNDARY / BITS_PER_UNIT);
-  gcc_assert (preferred_alignment <= PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT);
-  gcc_assert (stack_alignment_needed
-	      <= PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT);
 
   if (stack_alignment_needed < STACK_BOUNDARY / BITS_PER_UNIT)
     stack_alignment_needed = STACK_BOUNDARY / BITS_PER_UNIT;
Index: stmt.c
===================================================================
--- stmt.c	(revision 121163)
+++ stmt.c	(working copy)
@@ -1848,6 +1848,7 @@ void
 expand_decl (tree decl)
 {
   tree type;
+  unsigned HOST_WIDE_INT align;
 
   type = TREE_TYPE (decl);
 
@@ -1872,6 +1873,13 @@ expand_decl (tree decl)
   if (TREE_STATIC (decl) || DECL_EXTERNAL (decl))
     return;
 
+  if (DECL_USER_ALIGN (decl))
+    align = DECL_ALIGN (decl);
+  else if (DECL_MODE (decl) == BLKmode)
+    align = BIGGEST_ALIGNMENT;
+  else
+    align = TYPE_ALIGN (type);
+
   /* Create the RTL representation for the variable.  */
 
   if (type == error_mark_node)
@@ -1892,7 +1900,8 @@ expand_decl (tree decl)
       set_mem_attributes (x, decl, 1);
       SET_DECL_RTL (decl, x);
     }
-  else if (use_register_for_decl (decl))
+  else if (use_register_for_decl (decl)
+	   && align <= PREFERRED_STACK_BOUNDARY)
     {
       /* Automatic variable that can go in a register.  */
       int unsignedp = TYPE_UNSIGNED (type);
@@ -1939,10 +1948,16 @@ expand_decl (tree decl)
 	  oldaddr = XEXP (DECL_RTL (decl), 0);
 	}
 
-      /* Set alignment we actually gave this decl.  */
-      DECL_ALIGN (decl) = (DECL_MODE (decl) == BLKmode ? BIGGEST_ALIGNMENT
-			   : GET_MODE_BITSIZE (DECL_MODE (decl)));
-      DECL_USER_ALIGN (decl) = 0;
+      /* Set alignment we actually gave this decl if we don't have
+	 an user specific alignment and the alignment is less than
+	 the biggest alignment.  */
+      if (! DECL_USER_ALIGN (decl)
+	  || DECL_ALIGN (decl) <= BIGGEST_ALIGNMENT)
+        {
+	  DECL_ALIGN (decl) = (DECL_MODE (decl) == BLKmode ? BIGGEST_ALIGNMENT
+			       : GET_MODE_BITSIZE (DECL_MODE (decl)));
+	  DECL_USER_ALIGN (decl) = 0;
+	}
 
       x = assign_temp (decl, 1, 1, 1);
       set_mem_attributes (x, decl, 1);
@@ -1973,8 +1988,10 @@ expand_decl (tree decl)
 	 DECL_ALIGN says how the variable is to be aligned and we
 	 cannot use it to conclude anything about the alignment of
 	 the size.  */
-      address = allocate_dynamic_stack_space (size, NULL_RTX,
-					      TYPE_ALIGN (TREE_TYPE (decl)));
+      address = allocate_dynamic_stack_space_1 (size, NULL_RTX,
+						TYPE_ALIGN (TREE_TYPE (decl)),
+						MAX (TYPE_ALIGN (TREE_TYPE (decl)),
+						align));
 
       /* Reference the variable indirect through that rtx.  */
       x = gen_rtx_MEM (DECL_MODE (decl), address);

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for  variables on the stack for greater than required alignment
  2007-01-30  2:42 [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for variables on the stack for greater than required alignment Andrew_Pinski
@ 2007-01-31 15:20 ` Meissner, Michael
  2007-02-02 23:49 ` Richard Henderson
  2007-02-05 22:39 ` Dorit Nuzman
  2 siblings, 0 replies; 27+ messages in thread
From: Meissner, Michael @ 2007-01-31 15:20 UTC (permalink / raw)
  To: Andrew_Pinski, gcc-patches; +Cc: Trevor_Smigiel, Russell_Olsen

> -----Original Message-----
> From: gcc-patches-owner@gcc.gnu.org
[mailto:gcc-patches-owner@gcc.gnu.org]
> On Behalf Of Andrew_Pinski@PlayStation.Sony.com
> Sent: Monday, January 29, 2007 9:31 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Trevor_Smigiel@PlayStation.Sony.com;
> Russell_Olsen@PlayStation.Sony.com
> Subject: [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for
> variables on the stack for greater than required alignment
> 
> Hi,
>   This patch fixes PR 16660 where we ignore the user supplied
allignment
> on stack variables if it is greater than the required alignment on the
> specific target.  This fixes the issue by oversizing the stack
variable
> and then manually aligning the variable.
> 
> I had to remove some asserts from the x86 back-end as they are no
longer
> valid from the point of view of the middle-end could have a case where
> preferred_alignment is greater than the target's preferred alignment
and
> the stack alignment that is needed can be greater than the target's
> preferred alignment.
> 
> 
> OK? Bootstrapped and tested on i686-linux-gnu and powerpc-linux-gnu
with
> no regressions.
> 
> Thanks,
> Andrew Pinski
> 
> :ADDPATCH middle-end:
> 
> ChangeLog:
> 
>         * expr.h (allocate_dynamic_stack_space_1): New function
prototype.
>         * functionc.c (assign_temp): Take into account the alignment
>         of the temp if it is greater than the target's preferred
> alignment.
>         * cfgexpand.c: Include optabs.h.
>         (get_decl_align_unit): Update comment and don't lower the
> alignment
>         if it is greater than the target's preferred alignment.
>         (alloc_stack_frame_space): Take an unsigned HOST_WIDE_INT for
> align.
>         Take into account the variable's alignment if it is greater
than
>         the target's preferred alignment.
>         (expand_one_stack_var_at): Likewise.
>         (defer_stack_allocation): Defer variables who's alignment is
> greater
>         than the target's preferred alignment.
>         * explow.c (allocate_dynamic_stack_space_1): Split out from
>         allocate_dynamic_stack_space and take into acount the required
>         alignment.
>         (allocate_dynamic_stack_space): Call
> allocate_dynamic_stack_space_1.
>         * Makefile.in (cfgexpand.o): Update dependecies.
>         * stmt.c (expand_decl):  Take into account the alignment
>         of the variable if it is greater than the target's preferred
>         alignment.
>         * config/i386/i386.c (ix86_compute_frame_layout): Don't assert
>         that preferred alignment is greater than the normal preferred
>         alignment.  Don't assert that the stack alignment needed is
> greater
>         than the normal preferred alignment.
> 

I'm starting to look at patches after a long absence.  I looked at this
patch, and it looks ok.  I would suggest changing the comment in
function.c where you talk about comparing against BIGGEST_ALIGNMENT but
the code actually tests against PREFERRED_STACK_BOUNDARY.

--
Michael Meissner
AMD, MS 83-29
90 Central Street
Boxborough, MA 01719



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for variables on  the stack for greater than required alignment
  2007-01-30  2:42 [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for variables on the stack for greater than required alignment Andrew_Pinski
  2007-01-31 15:20 ` Meissner, Michael
@ 2007-02-02 23:49 ` Richard Henderson
  2007-05-17 21:43   ` Andrew_Pinski
       [not found]   ` <OF15BFC61A.42B05901-ON882572DE.000355EF-882572DE.007752F8@playstation.sony.com>
  2007-02-05 22:39 ` Dorit Nuzman
  2 siblings, 2 replies; 27+ messages in thread
From: Richard Henderson @ 2007-02-02 23:49 UTC (permalink / raw)
  To: Andrew_Pinski; +Cc: gcc-patches, Trevor_Smigiel, Russell_Olsen

On Mon, Jan 29, 2007 at 06:31:00PM -0800, Andrew_Pinski@PlayStation.Sony.com wrote:
> +	  tmp = assign_stack_temp_for_type (mode, size, keep, type);
> +	  addr = XEXP (tmp, 0);
> +	  /* CEIL_DIV_EXPR needs to worry about the addition overflowing,
> +	     but we know it can't.  So add ourselves and then do
> +	     TRUNC_DIV_EXPR.  */
> +	  addr = expand_binop (Pmode, add_optab, addr,
> +			       GEN_INT (align / BITS_PER_UNIT - 1),
> +			       NULL_RTX, 1, OPTAB_LIB_WIDEN);
> +	  addr = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, addr,
> +			        GEN_INT (align / BITS_PER_UNIT),
> +			        NULL_RTX, 1);
> +	  addr = expand_mult (Pmode, addr,
> +			      GEN_INT (align / BITS_PER_UNIT),
> +			      NULL_RTX, 1);
...
> +alloc_stack_frame_space (HOST_WIDE_INT size, unsigned HOST_WIDE_INT align)
>  {
>    HOST_WIDE_INT offset, new_frame_offset;
>  
>    new_frame_offset = frame_offset;
> +
> +  /* Allocate extra space if the alignment required is greater than the
> +     stack boundary and then assume the RTL expansion of the variable, gets
> +     the correct alignment.  */
> +  if (align > PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT)
> +    {
> +      size += align;
> +      align = PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT;

I don't like how these two things are split apart.  Either cfgexpand.c
should handle both the addition and the rounding, or alloc_stack_frame_space
should.


> +  /* When the variable needs additional code to be aligned on the
> +     stack we allocate separately just for ease of implementation.  */
> +  if (get_decl_align_unit (var) > PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT)
> +    return false;

Which does suggest it'd be interesting to create a block containing
all of the over-aligned data, create one REG that contains the 
rounded address of the base of that block, and all of the data
therein is offsets from that one REG.

Most of the infrastructure for that is already present, what with
the partitioning and sorting that we already do...

> @@ -1892,7 +1900,8 @@ expand_decl (tree decl)
>        set_mem_attributes (x, decl, 1);
>        SET_DECL_RTL (decl, x);
>      }
> -  else if (use_register_for_decl (decl))
> +  else if (use_register_for_decl (decl)
> +	   && align <= PREFERRED_STACK_BOUNDARY)

Why do we need to check alignment for register data?



r~

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for variables  on  the stack for greater than required alignment
  2007-01-30  2:42 [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for variables on the stack for greater than required alignment Andrew_Pinski
  2007-01-31 15:20 ` Meissner, Michael
  2007-02-02 23:49 ` Richard Henderson
@ 2007-02-05 22:39 ` Dorit Nuzman
  2 siblings, 0 replies; 27+ messages in thread
From: Dorit Nuzman @ 2007-02-05 22:39 UTC (permalink / raw)
  To: Andrew_Pinski; +Cc: gcc-patches, Russell_Olsen, Trevor_Smigiel

> Hi,
>   This patch fixes PR 16660 where we ignore the user supplied allignment
> on stack variables if it is greater than the required alignment on the
> specific target.  This fixes the issue by oversizing the stack variable
> and then manually aligning the variable.
>

does this mean that during vectorization we could avoid doing this in
'vect_can_force_dr_alignment_p':
      return (alignment <= PREFERRED_STACK_BOUNDARY);
and just return 1 instead, relying on your patch to take care of things,
since we mark the decl as having a user-supplied-alignment:
      DECL_USER_ALIGN (base) = 1;
?
(i.e. - with your patch, would it be ok to let the vectorizer increase the
alignment of arrays (on the stack) even if the required alignment is
greater than PREFERRED_STACK_BOUNDARY?)

thanks,
dorit

> I had to remove some asserts from the x86 back-end as they are no longer
> valid from the point of view of the middle-end could have a case where
> preferred_alignment is greater than the target's preferred alignment and
> the stack alignment that is needed can be greater than the target's
> preferred alignment.
>
>
> OK? Bootstrapped and tested on i686-linux-gnu and powerpc-linux-gnu with
> no regressions.
>
> Thanks,
> Andrew Pinski
>
> :ADDPATCH middle-end:
>
> ChangeLog:
>
>         * expr.h (allocate_dynamic_stack_space_1): New function
prototype.
>         * functionc.c (assign_temp): Take into account the alignment
>         of the temp if it is greater than the target's preferred
> alignment.
>         * cfgexpand.c: Include optabs.h.
>         (get_decl_align_unit): Update comment and don't lower the
> alignment
>         if it is greater than the target's preferred alignment.
>         (alloc_stack_frame_space): Take an unsigned HOST_WIDE_INT for
> align.
>         Take into account the variable's alignment if it is greater than
>         the target's preferred alignment.
>         (expand_one_stack_var_at): Likewise.
>         (defer_stack_allocation): Defer variables who's alignment is
> greater
>         than the target's preferred alignment.
>         * explow.c (allocate_dynamic_stack_space_1): Split out from
>         allocate_dynamic_stack_space and take into acount the required
>         alignment.
>         (allocate_dynamic_stack_space): Call
> allocate_dynamic_stack_space_1.
>         * Makefile.in (cfgexpand.o): Update dependecies.
>         * stmt.c (expand_decl):  Take into account the alignment
>         of the variable if it is greater than the target's preferred
>         alignment.
>         * config/i386/i386.c (ix86_compute_frame_layout): Don't assert
>         that preferred alignment is greater than the normal preferred
>         alignment.  Don't assert that the stack alignment needed is
> greater
>         than the normal preferred alignment.
>
>
> [attachment "attributealignedstack.diff.txt" deleted by Dorit
> Nuzman/Haifa/IBM]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for variables  on  the stack for greater than required alignment
  2007-02-02 23:49 ` Richard Henderson
@ 2007-05-17 21:43   ` Andrew_Pinski
       [not found]   ` <OF15BFC61A.42B05901-ON882572DE.000355EF-882572DE.007752F8@playstation.sony.com>
  1 sibling, 0 replies; 27+ messages in thread
From: Andrew_Pinski @ 2007-05-17 21:43 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc-patches, Russell_Olsen, Trevor_Smigiel

[-- Attachment #1: Type: text/plain, Size: 2934 bytes --]

Richard Henderson <rth@redhat.com> wrote on 02/02/2007 03:49:10 PM:
> I don't like how these two things are split apart.  Either cfgexpand.c
> should handle both the addition and the rounding, or 
alloc_stack_frame_space
> should.

I don't see any other way to implement this because the offset needs to 
take into account the oversize correctly.

> Which does suggest it'd be interesting to create a block containing
> all of the over-aligned data, create one REG that contains the 
> rounded address of the base of that block, and all of the data
> therein is offsets from that one REG.
> 
> Most of the infrastructure for that is already present, what with
> the partitioning and sorting that we already do...

In fact getting rid of this works and I was able to get the correct 
alignment still.  The patch includes a testcase for this case also.
 
> > @@ -1892,7 +1900,8 @@ expand_decl (tree decl)
> >        set_mem_attributes (x, decl, 1);
> >        SET_DECL_RTL (decl, x);
> >      }
> > -  else if (use_register_for_decl (decl))
> > +  else if (use_register_for_decl (decl)
> > +      && align <= PREFERRED_STACK_BOUNDARY)
> 
> Why do we need to check alignment for register data?

Good point, I removed this code too.  I blindly copied this part of the 
code from our local tree thinking it might have been a good idea but I was 
clearly wrong.

OK? Bootstrapped and tested on i686-linux-gnu with no regressions.

Thanks,
Andrew Pinski

ChangeLog:

        * expr.h  (allocate_dynamic_stack_space_1): New function 
prototype.
        * functionc.c (assign_temp): Take into account the alignment
        of the temp if it is greater than the target's preferred 
alignment.
        * cfgexpand.c: Include optabs.h.
        (get_decl_align_unit): Update comment and don't lower the 
alignment
        if it is greater than the target's preferred alignment.
        (alloc_stack_frame_space): Take an unsigned HOST_WIDE_INT for 
align.
        Take into account the variable's alignment if it is greater than
        the target's preferred alignment.
        (expand_one_stack_var_at):  Likewise.
        * explow.c (allocate_dynamic_stack_space_1): Split out from
        allocate_dynamic_stack_space and take into acount the required
        alignment.
        (allocate_dynamic_stack_space): Call 
allocate_dynamic_stack_space_1.
        * Makefile.in (cfgexpand.o): Update dependecies.
        * stmt.c (expand_decl):  Take into account the alignment
        of the variable if it is greater than the target's preferred
        alignment.
        * config/i386/i386.c (ix86_compute_frame_layout): Don't assert
        that preferred alignment is greater than the normal preferred
        alignment.  Don't assert that the stack alignment needed is 
greater
        than the normal preferred alignment. 

        * gcc.c-torture/execute/pr16660-1.c: New testcase.
        * gcc.c-torture/execute/pr16660-2.c: New testcase.


[-- Attachment #2: fixpr16660.diff.txt --]
[-- Type: text/plain, Size: 12576 bytes --]

Index: testsuite/gcc.c-torture/execute/pr16660-1.c
===================================================================
--- testsuite/gcc.c-torture/execute/pr16660-1.c	(revision 0)
+++ testsuite/gcc.c-torture/execute/pr16660-1.c	(revision 0)
@@ -0,0 +1,11 @@
+typedef __SIZE_TYPE__ size_t;
+#define ALIGNMENT 256
+int main(void)
+{
+  int a[ALIGNMENT/sizeof(int)] __attribute__((aligned(ALIGNMENT)));
+  size_t b = (size_t)&a;
+  if (b&(ALIGNMENT-1))
+    return 1;
+  return 0;
+}
+
Index: testsuite/gcc.c-torture/execute/pr16660-2.c
===================================================================
--- testsuite/gcc.c-torture/execute/pr16660-2.c	(revision 0)
+++ testsuite/gcc.c-torture/execute/pr16660-2.c	(revision 0)
@@ -0,0 +1,19 @@
+typedef __SIZE_TYPE__ size_t;
+#define ALIGNMENT 256
+int main(void)
+{
+  {
+    int a[ALIGNMENT/sizeof(int)] __attribute__((aligned(ALIGNMENT)));
+    size_t b = (size_t)&a;
+    if (b&(ALIGNMENT-1))
+      return 1;
+  }
+  {
+    int a1[ALIGNMENT/sizeof(int)] __attribute__((aligned(ALIGNMENT)));
+    size_t b = (size_t)&a1;
+    if (b&(ALIGNMENT-1))
+      return 1;
+  }
+  return 0;
+}
+
Index: expr.h
===================================================================
--- expr.h	(revision 124657)
+++ expr.h	(working copy)
@@ -712,6 +712,7 @@ extern void update_nonlocal_goto_save_ar
 /* Allocate some space on the stack dynamically and return its address.  An rtx
    says how many bytes.  */
 extern rtx allocate_dynamic_stack_space (rtx, rtx, int);
+extern rtx allocate_dynamic_stack_space_1 (rtx, rtx, int, int);
 
 /* Probe a range of stack addresses from FIRST to FIRST+SIZE, inclusive.
    FIRST is a constant and size is a Pmode RTX.  These are offsets from the
Index: function.c
===================================================================
--- function.c	(revision 124657)
+++ function.c	(working copy)
@@ -813,6 +813,7 @@ assign_temp (tree type_or_decl, int keep
   if (mode == BLKmode || memory_required)
     {
       HOST_WIDE_INT size = int_size_in_bytes (type);
+      unsigned HOST_WIDE_INT align;
       rtx tmp;
 
       /* Zero sized arrays are GNU C extension.  Set size to 1 to avoid
@@ -837,7 +838,37 @@ assign_temp (tree type_or_decl, int keep
 	  size = 1;
 	}
 
-      tmp = assign_stack_temp_for_type (mode, size, keep, type);
+      align = TYPE_ALIGN (type);
+      if (decl && DECL_ALIGN (decl) > align)
+	align = DECL_ALIGN (decl);
+
+      /* When the user specifies an alignment larger than
+         BIGGEST_ALIGNMENT we need to allocate extra space so we can
+         adjust this decl's address.  */
+      if (align > PREFERRED_STACK_BOUNDARY)
+	{
+	  rtx addr;
+	  HOST_WIDE_INT pad = (align - PREFERRED_STACK_BOUNDARY) / BITS_PER_UNIT;
+	  size += pad;
+	  tmp = assign_stack_temp_for_type (mode, size, keep, type);
+	  addr = XEXP (tmp, 0);
+	  /* CEIL_DIV_EXPR needs to worry about the addition overflowing,
+	     but we know it can't.  So add ourselves and then do
+	     TRUNC_DIV_EXPR.  */
+	  addr = expand_binop (Pmode, add_optab, addr,
+			       GEN_INT (align / BITS_PER_UNIT - 1),
+			       NULL_RTX, 1, OPTAB_LIB_WIDEN);
+	  addr = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, addr,
+			        GEN_INT (align / BITS_PER_UNIT),
+			        NULL_RTX, 1);
+	  addr = expand_mult (Pmode, addr,
+			      GEN_INT (align / BITS_PER_UNIT),
+			      NULL_RTX, 1);
+	  tmp = change_address (tmp, VOIDmode, addr);
+	}
+      else
+	tmp = assign_stack_temp_for_type (mode, size, keep, type);
+
       return tmp;
     }
 
Index: cfgexpand.c
===================================================================
--- cfgexpand.c	(revision 124657)
+++ cfgexpand.c	(working copy)
@@ -41,6 +41,7 @@ Boston, MA 02110-1301, USA.  */
 #include "params.h"
 #include "tree-inline.h"
 #include "value-prof.h"
+#include "optabs.h"
 
 /* Verify that there is exactly single jump instruction since last and attach
    REG_BR_PROB note specifying probability.
@@ -151,8 +152,7 @@ static bool has_protected_decls;
    smaller than our cutoff threshold.  Used for -Wstack-protector.  */
 static bool has_short_buffer;
 
-/* Discover the byte alignment to use for DECL.  Ignore alignment
-   we can't do with expected alignment of the stack boundary.  */
+/* Discover the byte alignment to use for DECL.  */
 
 static unsigned int
 get_decl_align_unit (tree decl)
@@ -161,8 +161,6 @@ get_decl_align_unit (tree decl)
 
   align = DECL_ALIGN (decl);
   align = LOCAL_ALIGNMENT (TREE_TYPE (decl), align);
-  if (align > PREFERRED_STACK_BOUNDARY)
-    align = PREFERRED_STACK_BOUNDARY;
   if (cfun->stack_alignment_needed < align)
     cfun->stack_alignment_needed = align;
 
@@ -173,11 +171,21 @@ get_decl_align_unit (tree decl)
    Return the frame offset.  */
 
 static HOST_WIDE_INT
-alloc_stack_frame_space (HOST_WIDE_INT size, HOST_WIDE_INT align)
+alloc_stack_frame_space (HOST_WIDE_INT size, unsigned HOST_WIDE_INT align)
 {
   HOST_WIDE_INT offset, new_frame_offset;
 
   new_frame_offset = frame_offset;
+
+  /* Allocate extra space if the alignment required is greater than the
+     stack boundary and then assume the RTL expansion of the variable, gets
+     the correct alignment.  */
+  if (align > PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT)
+    {
+      size += align;
+      align = PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT;
+    }
+
   if (FRAME_GROWS_DOWNWARD)
     {
       new_frame_offset -= size + frame_phase;
@@ -523,23 +531,27 @@ dump_stack_var_partition (void)
 static void
 expand_one_stack_var_at (tree decl, HOST_WIDE_INT offset)
 {
-  HOST_WIDE_INT align;
+  unsigned HOST_WIDE_INT align;
   rtx x;
 
   /* If this fails, we've overflowed the stack frame.  Error nicely?  */
   gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
 
   x = plus_constant (virtual_stack_vars_rtx, offset);
-  x = gen_rtx_MEM (DECL_MODE (decl), x);
+  align = get_decl_align_unit (decl);
+
+   if (align > PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT)
+    {
+      rtx addr = x;
+      addr = expand_binop (Pmode, add_optab, addr,
+			   GEN_INT (align - 1),
+			   NULL_RTX, 1, OPTAB_LIB_WIDEN);
+      addr = expand_and (Pmode, addr, GEN_INT (-align),
+			    NULL_RTX);
+      x = addr;
+    } 
 
-  /* Set alignment we actually gave this decl.  */
-  offset -= frame_phase;
-  align = offset & -offset;
-  align *= BITS_PER_UNIT;
-  if (align > STACK_BOUNDARY || align == 0)
-    align = STACK_BOUNDARY;
-  DECL_ALIGN (decl) = align;
-  DECL_USER_ALIGN (decl) = 0;
+  x = gen_rtx_MEM (DECL_MODE (decl), x);
 
   set_mem_attributes (x, decl, true);
   SET_DECL_RTL (decl, x);
Index: explow.c
===================================================================
--- explow.c	(revision 124657)
+++ explow.c	(working copy)
@@ -1093,16 +1093,28 @@ update_nonlocal_goto_save_area (void)
    TARGET is a place in which the address can be placed.
 
    KNOWN_ALIGN is the alignment (in bits) that we know SIZE has.  */
-
 rtx
 allocate_dynamic_stack_space (rtx size, rtx target, int known_align)
 {
+  return allocate_dynamic_stack_space_1 (size, target, known_align,
+				       BIGGEST_ALIGNMENT);
+}
+
+/* Like allocate_dynamic_stack_space expect provide a way, REQUIRED_ALGIN,
+   to say the required alignment of the address.  */
+rtx
+allocate_dynamic_stack_space_1 (rtx size, rtx target, int known_align,
+				int required_align)
+{
   /* If we're asking for zero bytes, it doesn't matter what we point
      to since we can't dereference it.  But return a reasonable
      address anyway.  */
   if (size == const0_rtx)
     return virtual_stack_dynamic_rtx;
 
+  if (required_align < BIGGEST_ALIGNMENT)
+    required_align = BIGGEST_ALIGNMENT;
+
   /* Otherwise, show we're calling alloca or equivalent.  */
   current_function_calls_alloca = 1;
 
@@ -1131,13 +1143,13 @@ allocate_dynamic_stack_space (rtx size, 
 #if defined (STACK_DYNAMIC_OFFSET) || defined (STACK_POINTER_OFFSET)
 #define MUST_ALIGN 1
 #else
-#define MUST_ALIGN (PREFERRED_STACK_BOUNDARY < BIGGEST_ALIGNMENT)
+#define MUST_ALIGN (PREFERRED_STACK_BOUNDARY < required_align)
 #endif
 
   if (MUST_ALIGN)
     size
       = force_operand (plus_constant (size,
-				      BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1),
+				      required_align / BITS_PER_UNIT - 1),
 		       NULL_RTX);
 
 #ifdef SETJMP_VIA_SAVE_AREA
@@ -1293,18 +1305,12 @@ allocate_dynamic_stack_space (rtx size, 
 
   if (MUST_ALIGN)
     {
-      /* CEIL_DIV_EXPR needs to worry about the addition overflowing,
-	 but we know it can't.  So add ourselves and then do
-	 TRUNC_DIV_EXPR.  */
       target = expand_binop (Pmode, add_optab, target,
-			     GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1),
+			     GEN_INT (required_align / BITS_PER_UNIT - 1),
 			     NULL_RTX, 1, OPTAB_LIB_WIDEN);
-      target = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, target,
-			      GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT),
-			      NULL_RTX, 1);
-      target = expand_mult (Pmode, target,
-			    GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT),
-			    NULL_RTX, 1);
+      target = expand_and (Pmode, target,
+			   GEN_INT (-required_align / BITS_PER_UNIT),
+			   NULL_RTX);
     }
 
   /* Record the new stack level for nonlocal gotos.  */
Index: Makefile.in
===================================================================
--- Makefile.in	(revision 124657)
+++ Makefile.in	(working copy)
@@ -2553,7 +2553,7 @@ cfgexpand.o : cfgexpand.c $(TREE_FLOW_H)
    $(RTL_H) $(TREE_H) $(TM_P_H) $(EXPR_H) $(FUNCTION_H) $(TIMEVAR_H) $(TM_H) \
    coretypes.h $(TREE_DUMP_H) except.h langhooks.h tree-pass.h $(RTL_H) \
    $(DIAGNOSTIC_H) toplev.h $(BASIC_BLOCK_H) $(FLAGS_H) debug.h $(PARAMS_H) \
-   value-prof.h
+   value-prof.h $(OPTABS_H)
 cfgrtl.o : cfgrtl.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
    $(FLAGS_H) insn-config.h $(BASIC_BLOCK_H) $(REGS_H) hard-reg-set.h \
    output.h toplev.h $(FUNCTION_H) except.h $(TM_P_H) insn-config.h $(EXPR_H) \
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 124657)
+++ config/i386/i386.c	(working copy)
@@ -5610,9 +5610,6 @@ ix86_compute_frame_layout (struct ix86_f
 
   gcc_assert (!size || stack_alignment_needed);
   gcc_assert (preferred_alignment >= STACK_BOUNDARY / BITS_PER_UNIT);
-  gcc_assert (preferred_alignment <= PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT);
-  gcc_assert (stack_alignment_needed
-	      <= PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT);
 
   if (stack_alignment_needed < STACK_BOUNDARY / BITS_PER_UNIT)
     stack_alignment_needed = STACK_BOUNDARY / BITS_PER_UNIT;
Index: stmt.c
===================================================================
--- stmt.c	(revision 124657)
+++ stmt.c	(working copy)
@@ -1850,6 +1850,7 @@ void
 expand_decl (tree decl)
 {
   tree type;
+  unsigned HOST_WIDE_INT align;
 
   type = TREE_TYPE (decl);
 
@@ -1874,6 +1875,13 @@ expand_decl (tree decl)
   if (TREE_STATIC (decl) || DECL_EXTERNAL (decl))
     return;
 
+  if (DECL_USER_ALIGN (decl))
+    align = DECL_ALIGN (decl);
+  else if (DECL_MODE (decl) == BLKmode)
+    align = BIGGEST_ALIGNMENT;
+  else
+    align = TYPE_ALIGN (type);
+
   /* Create the RTL representation for the variable.  */
 
   if (type == error_mark_node)
@@ -1941,10 +1949,16 @@ expand_decl (tree decl)
 	  oldaddr = XEXP (DECL_RTL (decl), 0);
 	}
 
-      /* Set alignment we actually gave this decl.  */
-      DECL_ALIGN (decl) = (DECL_MODE (decl) == BLKmode ? BIGGEST_ALIGNMENT
-			   : GET_MODE_BITSIZE (DECL_MODE (decl)));
-      DECL_USER_ALIGN (decl) = 0;
+      /* Set alignment we actually gave this decl if we don't have
+	 an user specific alignment and the alignment is less than
+	 the biggest alignment.  */
+      if (! DECL_USER_ALIGN (decl)
+	  || DECL_ALIGN (decl) <= BIGGEST_ALIGNMENT)
+        {
+	  DECL_ALIGN (decl) = (DECL_MODE (decl) == BLKmode ? BIGGEST_ALIGNMENT
+			       : GET_MODE_BITSIZE (DECL_MODE (decl)));
+	  DECL_USER_ALIGN (decl) = 0;
+	}
 
       x = assign_temp (decl, 1, 1, 1);
       set_mem_attributes (x, decl, 1);
@@ -1975,8 +1989,10 @@ expand_decl (tree decl)
 	 DECL_ALIGN says how the variable is to be aligned and we
 	 cannot use it to conclude anything about the alignment of
 	 the size.  */
-      address = allocate_dynamic_stack_space (size, NULL_RTX,
-					      TYPE_ALIGN (TREE_TYPE (decl)));
+      address = allocate_dynamic_stack_space_1 (size, NULL_RTX,
+						TYPE_ALIGN (TREE_TYPE (decl)),
+						MAX (TYPE_ALIGN (TREE_TYPE (decl)),
+						align));
 
       /* Reference the variable indirect through that rtx.  */
       x = gen_rtx_MEM (DECL_MODE (decl), address);

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for variables on  the stack for greater than required alignment
       [not found]   ` <OF15BFC61A.42B05901-ON882572DE.000355EF-882572DE.007752F8@playstation.sony.com>
@ 2007-05-17 23:50     ` trevor_smigiel
  2007-05-18  0:20       ` Andrew_Pinski
  2007-05-18 17:42       ` Andrew_Pinski
  0 siblings, 2 replies; 27+ messages in thread
From: trevor_smigiel @ 2007-05-17 23:50 UTC (permalink / raw)
  To: Andrew_Pinski; +Cc: Richard Henderson, gcc-patches, Russell_Olsen

Andrew,

It seems you missed merging a change that I made in our local tree.

In cfgexpand.c, you removed this code:

-  /* Set alignment we actually gave this decl.  */
-  offset -= frame_phase;
-  align = offset & -offset;
-  align *= BITS_PER_UNIT;
-  if (align > STACK_BOUNDARY || align == 0)
-    align = STACK_BOUNDARY;
-  DECL_ALIGN (decl) = align;
-  DECL_USER_ALIGN (decl) = 0;

In our local tree that code is conditioned on 
  if (!DECL_USER_ALIGN (decl))
(Which could be done more precisely.  Consider the case where a user
specified alignment is smaller than the decl's actual alignment on the
stack.)

Without it less efficient code can be generated because higher
alignments are not propagated.  At least, that's the behaviour on 4.1.1.

Trevor

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for variables  on  the stack for greater than required alignment
  2007-05-17 23:50     ` trevor_smigiel
@ 2007-05-18  0:20       ` Andrew_Pinski
  2007-05-18 17:42       ` Andrew_Pinski
  1 sibling, 0 replies; 27+ messages in thread
From: Andrew_Pinski @ 2007-05-18  0:20 UTC (permalink / raw)
  To: trevor_smigiel; +Cc: gcc-patches, Richard Henderson, Russell_Olsen

Trevor Smigiel/R&D/SCEA@Playstation wrote on 05/17/2007 04:49:53 PM:

> Andrew,
> 
> It seems you missed merging a change that I made in our local tree.
> 
> In cfgexpand.c, you removed this code:
> 
> -  /* Set alignment we actually gave this decl.  */
> -  offset -= frame_phase;
> -  align = offset & -offset;
> -  align *= BITS_PER_UNIT;
> -  if (align > STACK_BOUNDARY || align == 0)
> -    align = STACK_BOUNDARY;
> -  DECL_ALIGN (decl) = align;
> -  DECL_USER_ALIGN (decl) = 0;
> 
> In our local tree that code is conditioned on 
>   if (!DECL_USER_ALIGN (decl))
> (Which could be done more precisely.  Consider the case where a user
> specified alignment is smaller than the decl's actual alignment on the
> stack.)
> 
> Without it less efficient code can be generated because higher
> alignments are not propagated.  At least, that's the behaviour on 4.1.1.

Here is a testcase which shows that even checking for DECL_USER_ALIGN is 
wrong, we need to check for align < STACK_BOUNDARY also.  Before my patch, 
we would get the mem for accessing i with an alignment of 32 bits but 
after this patch, we get an alignment of 8.  I will fix this testcase now 
too.

Thanks,
Andrew Pinski


/* { dg-do compile } */
/* { dg-options "-O2 -fdump-rtl-expand" } */
int g(int *);
int f(void)
{
  int i __attribute__((aligned(1) ));
  int *a = &i;
  g(a);
  return *a;
}
/* Even though the user supplied an alignment of 1, the memory
   location for i should have the natural alignment of int.   */

/* { dg-final { scan-rtl-dump-not "A8" "expand" } } */
/* { dg-final { scan-rtl-dump "A32" "expand" } } */
/* { dg-final { cleanup-rtl-dump "expand" } } */

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for variables  on  the stack for greater than required alignment
  2007-05-17 23:50     ` trevor_smigiel
  2007-05-18  0:20       ` Andrew_Pinski
@ 2007-05-18 17:42       ` Andrew_Pinski
  2007-07-02 23:56         ` Andrew Pinski
  1 sibling, 1 reply; 27+ messages in thread
From: Andrew_Pinski @ 2007-05-18 17:42 UTC (permalink / raw)
  To: trevor_smigiel; +Cc: gcc-patches, Richard Henderson, Russell_Olsen

[-- Attachment #1: Type: text/plain, Size: 3121 bytes --]

Trevor Smigiel/R&D/SCEA@Playstation wrote on 05/17/2007 04:49:53 PM:

> Andrew,
> 
> It seems you missed merging a change that I made in our local tree.
> 
> In cfgexpand.c, you removed this code:
> 
> -  /* Set alignment we actually gave this decl.  */
> -  offset -= frame_phase;
> -  align = offset & -offset;
> -  align *= BITS_PER_UNIT;
> -  if (align > STACK_BOUNDARY || align == 0)
> -    align = STACK_BOUNDARY;
> -  DECL_ALIGN (decl) = align;
> -  DECL_USER_ALIGN (decl) = 0;
> 
> In our local tree that code is conditioned on 
>   if (!DECL_USER_ALIGN (decl))
> (Which could be done more precisely.  Consider the case where a user
> specified alignment is smaller than the decl's actual alignment on the
> stack.)
> 
> Without it less efficient code can be generated because higher
> alignments are not propagated.  At least, that's the behaviour on 4.1.1.

Here is the new patch which fixes that problem and we get the get the 
correct alignment on the variables now that we were getting a smaller 
alignment on with my older version of the patch.  It adds a testcase which 
verifies this is the case too (I only check on ia32/x86_64, powerpc* and 
spu as those are the only targets which I know the normal alignment of 
stack is and the only targets I could test the testcase on).

OK? Bootstrapped and tested on i686-linux-gnu with no regressions.

Thanks,
Andrew Pinski

ChangeLog:

        * expr.h  (allocate_dynamic_stack_space_1): New function 
prototype.
        * functionc.c (assign_temp): Take into account the alignment
        of the temp if it is greater than the target's preferred 
alignment.
        * cfgexpand.c: Include optabs.h.
        (get_decl_align_unit): Update comment and don't lower the 
alignment
        if it is greater than the target's preferred alignment.
        (alloc_stack_frame_space): Take an unsigned HOST_WIDE_INT for 
align.
        Take into account the variable's alignment if it is greater than
        the target's preferred alignment.
        (expand_one_stack_var_at):  Likewise.  Only reset the alignment of 
the decl to given
        alignment if the alignment is less than the target's preferred 
alignment.
        * explow.c (allocate_dynamic_stack_space_1): Split out from
        allocate_dynamic_stack_space and take into acount the required
        alignment.  Use AND opcode instead of shifting left and shifting
        back right.
        (allocate_dynamic_stack_space): Call 
allocate_dynamic_stack_space_1.
        * Makefile.in (cfgexpand.o): Update dependecies.
        * stmt.c (expand_decl):  Take into account the alignment
        of the variable if it is greater than the target's preferred
        alignment.
        * config/i386/i386.c (ix86_compute_frame_layout): Don't assert
        that preferred alignment is greater than the normal preferred
        alignment.  Don't assert that the stack alignment needed is 
greater
        than the normal preferred alignment. 

        * gcc.c-torture/execute/pr16660-1.c: New testcase.
        * gcc.c-torture/execute/pr16660-2.c: New testcase.
        * gcc.dg/pr16660-1.c: New testcase.



[-- Attachment #2: fixpr16660.diff.txt --]
[-- Type: text/plain, Size: 13936 bytes --]

Index: testsuite/gcc.c-torture/execute/pr16660-1.c
===================================================================
--- testsuite/gcc.c-torture/execute/pr16660-1.c	(revision 0)
+++ testsuite/gcc.c-torture/execute/pr16660-1.c	(revision 0)
@@ -0,0 +1,11 @@
+typedef __SIZE_TYPE__ size_t;
+#define ALIGNMENT 256
+int main(void)
+{
+  int a[ALIGNMENT/sizeof(int)] __attribute__((aligned(ALIGNMENT)));
+  size_t b = (size_t)&a;
+  if (b&(ALIGNMENT-1))
+    return 1;
+  return 0;
+}
+
Index: testsuite/gcc.c-torture/execute/pr16660-2.c
===================================================================
--- testsuite/gcc.c-torture/execute/pr16660-2.c	(revision 0)
+++ testsuite/gcc.c-torture/execute/pr16660-2.c	(revision 0)
@@ -0,0 +1,19 @@
+typedef __SIZE_TYPE__ size_t;
+#define ALIGNMENT 256
+int main(void)
+{
+  {
+    int a[ALIGNMENT/sizeof(int)] __attribute__((aligned(ALIGNMENT)));
+    size_t b = (size_t)&a;
+    if (b&(ALIGNMENT-1))
+      return 1;
+  }
+  {
+    int a1[ALIGNMENT/sizeof(int)] __attribute__((aligned(ALIGNMENT)));
+    size_t b = (size_t)&a1;
+    if (b&(ALIGNMENT-1))
+      return 1;
+  }
+  return 0;
+}
+
Index: testsuite/gcc.dg/pr16660-1.c
===================================================================
--- testsuite/gcc.dg/pr16660-1.c	(revision 0)
+++ testsuite/gcc.dg/pr16660-1.c	(revision 0)
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-rtl-expand" } */
+int g(int *);
+int f(void)
+{
+ int i __attribute__((aligned(1) ));
+ int *a = &i;
+ g(a);
+ return *a;
+}
+/* Even though the user supplied an alignment of 1, the memory
+  location for i should have the natural alignment of the stack.
+  Test only on x86, spu and powerpc for now.  */
+
+/* { dg-final { scan-rtl-dump-not "2 i\\\+0 S4 A8" "expand" } } */
+/* { dg-final { scan-rtl-dump "2 i\\\+0 S4 A32" "expand" { target i?86-*-* x86_64-*-* } } } */
+/* { dg-final { scan-rtl-dump "2 i\\\+0 S4 A64" "expand" { target { { rs6000-*-* powerpc*-*-* } && ilp32 } } } } */
+/* { dg-final { scan-rtl-dump "2 i\\\+0 S4 A128" "expand" { target { spu-*-* } || { { rs6000-*-* powerpc*-*-* } && lp64 } } } } */
+/* { dg-final { cleanup-rtl-dump "expand" } } */
Index: expr.h
===================================================================
--- expr.h	(revision 124657)
+++ expr.h	(working copy)
@@ -712,6 +712,7 @@ extern void update_nonlocal_goto_save_ar
 /* Allocate some space on the stack dynamically and return its address.  An rtx
    says how many bytes.  */
 extern rtx allocate_dynamic_stack_space (rtx, rtx, int);
+extern rtx allocate_dynamic_stack_space_1 (rtx, rtx, int, int);
 
 /* Probe a range of stack addresses from FIRST to FIRST+SIZE, inclusive.
    FIRST is a constant and size is a Pmode RTX.  These are offsets from the
Index: function.c
===================================================================
--- function.c	(revision 124657)
+++ function.c	(working copy)
@@ -813,6 +813,7 @@ assign_temp (tree type_or_decl, int keep
   if (mode == BLKmode || memory_required)
     {
       HOST_WIDE_INT size = int_size_in_bytes (type);
+      unsigned HOST_WIDE_INT align;
       rtx tmp;
 
       /* Zero sized arrays are GNU C extension.  Set size to 1 to avoid
@@ -837,7 +838,37 @@ assign_temp (tree type_or_decl, int keep
 	  size = 1;
 	}
 
-      tmp = assign_stack_temp_for_type (mode, size, keep, type);
+      align = TYPE_ALIGN (type);
+      if (decl && DECL_ALIGN (decl) > align)
+	align = DECL_ALIGN (decl);
+
+      /* When the user specifies an alignment larger than
+         BIGGEST_ALIGNMENT we need to allocate extra space so we can
+         adjust this decl's address.  */
+      if (align > PREFERRED_STACK_BOUNDARY)
+	{
+	  rtx addr;
+	  HOST_WIDE_INT pad = (align - PREFERRED_STACK_BOUNDARY) / BITS_PER_UNIT;
+	  size += pad;
+	  tmp = assign_stack_temp_for_type (mode, size, keep, type);
+	  addr = XEXP (tmp, 0);
+	  /* CEIL_DIV_EXPR needs to worry about the addition overflowing,
+	     but we know it can't.  So add ourselves and then do
+	     TRUNC_DIV_EXPR.  */
+	  addr = expand_binop (Pmode, add_optab, addr,
+			       GEN_INT (align / BITS_PER_UNIT - 1),
+			       NULL_RTX, 1, OPTAB_LIB_WIDEN);
+	  addr = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, addr,
+			        GEN_INT (align / BITS_PER_UNIT),
+			        NULL_RTX, 1);
+	  addr = expand_mult (Pmode, addr,
+			      GEN_INT (align / BITS_PER_UNIT),
+			      NULL_RTX, 1);
+	  tmp = change_address (tmp, VOIDmode, addr);
+	}
+      else
+	tmp = assign_stack_temp_for_type (mode, size, keep, type);
+
       return tmp;
     }
 
Index: cfgexpand.c
===================================================================
--- cfgexpand.c	(revision 124657)
+++ cfgexpand.c	(working copy)
@@ -41,6 +41,7 @@ Boston, MA 02110-1301, USA.  */
 #include "params.h"
 #include "tree-inline.h"
 #include "value-prof.h"
+#include "optabs.h"
 
 /* Verify that there is exactly single jump instruction since last and attach
    REG_BR_PROB note specifying probability.
@@ -151,8 +152,7 @@ static bool has_protected_decls;
    smaller than our cutoff threshold.  Used for -Wstack-protector.  */
 static bool has_short_buffer;
 
-/* Discover the byte alignment to use for DECL.  Ignore alignment
-   we can't do with expected alignment of the stack boundary.  */
+/* Discover the byte alignment to use for DECL.  */
 
 static unsigned int
 get_decl_align_unit (tree decl)
@@ -161,8 +161,6 @@ get_decl_align_unit (tree decl)
 
   align = DECL_ALIGN (decl);
   align = LOCAL_ALIGNMENT (TREE_TYPE (decl), align);
-  if (align > PREFERRED_STACK_BOUNDARY)
-    align = PREFERRED_STACK_BOUNDARY;
   if (cfun->stack_alignment_needed < align)
     cfun->stack_alignment_needed = align;
 
@@ -173,11 +171,21 @@ get_decl_align_unit (tree decl)
    Return the frame offset.  */
 
 static HOST_WIDE_INT
-alloc_stack_frame_space (HOST_WIDE_INT size, HOST_WIDE_INT align)
+alloc_stack_frame_space (HOST_WIDE_INT size, unsigned HOST_WIDE_INT align)
 {
   HOST_WIDE_INT offset, new_frame_offset;
 
   new_frame_offset = frame_offset;
+
+  /* Allocate extra space if the alignment required is greater than the
+     stack boundary and then assume the RTL expansion of the variable, gets
+     the correct alignment.  */
+  if (align > PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT)
+    {
+      size += align;
+      align = PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT;
+    }
+
   if (FRAME_GROWS_DOWNWARD)
     {
       new_frame_offset -= size + frame_phase;
@@ -523,23 +531,38 @@ dump_stack_var_partition (void)
 static void
 expand_one_stack_var_at (tree decl, HOST_WIDE_INT offset)
 {
-  HOST_WIDE_INT align;
+  unsigned HOST_WIDE_INT align;
   rtx x;
 
   /* If this fails, we've overflowed the stack frame.  Error nicely?  */
   gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
 
   x = plus_constant (virtual_stack_vars_rtx, offset);
-  x = gen_rtx_MEM (DECL_MODE (decl), x);
+  align = get_decl_align_unit (decl);
 
-  /* Set alignment we actually gave this decl.  */
-  offset -= frame_phase;
-  align = offset & -offset;
-  align *= BITS_PER_UNIT;
-  if (align > STACK_BOUNDARY || align == 0)
-    align = STACK_BOUNDARY;
-  DECL_ALIGN (decl) = align;
-  DECL_USER_ALIGN (decl) = 0;
+   if (align > PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT)
+    {
+      rtx addr = x;
+      addr = expand_binop (Pmode, add_optab, addr,
+			   GEN_INT (align - 1),
+			   NULL_RTX, 1, OPTAB_LIB_WIDEN);
+      addr = expand_and (Pmode, addr, GEN_INT (-align),
+			    NULL_RTX);
+      x = addr;
+    } 
+  else
+    {
+      /* Set alignment we actually gave this decl.  */
+      offset -= frame_phase;
+      align = offset & -offset;
+      align *= BITS_PER_UNIT;
+      if (align > STACK_BOUNDARY || align == 0)
+        align = STACK_BOUNDARY;
+      DECL_ALIGN (decl) = align;
+      DECL_USER_ALIGN (decl) = 0;
+    }
+
+  x = gen_rtx_MEM (DECL_MODE (decl), x);
 
   set_mem_attributes (x, decl, true);
   SET_DECL_RTL (decl, x);
Index: explow.c
===================================================================
--- explow.c	(revision 124657)
+++ explow.c	(working copy)
@@ -1093,16 +1093,28 @@ update_nonlocal_goto_save_area (void)
    TARGET is a place in which the address can be placed.
 
    KNOWN_ALIGN is the alignment (in bits) that we know SIZE has.  */
-
 rtx
 allocate_dynamic_stack_space (rtx size, rtx target, int known_align)
 {
+  return allocate_dynamic_stack_space_1 (size, target, known_align,
+				       BIGGEST_ALIGNMENT);
+}
+
+/* Like allocate_dynamic_stack_space expect provide a way, REQUIRED_ALGIN,
+   to say the required alignment of the address.  */
+rtx
+allocate_dynamic_stack_space_1 (rtx size, rtx target, int known_align,
+				int required_align)
+{
   /* If we're asking for zero bytes, it doesn't matter what we point
      to since we can't dereference it.  But return a reasonable
      address anyway.  */
   if (size == const0_rtx)
     return virtual_stack_dynamic_rtx;
 
+  if (required_align < BIGGEST_ALIGNMENT)
+    required_align = BIGGEST_ALIGNMENT;
+
   /* Otherwise, show we're calling alloca or equivalent.  */
   current_function_calls_alloca = 1;
 
@@ -1131,13 +1143,13 @@ allocate_dynamic_stack_space (rtx size, 
 #if defined (STACK_DYNAMIC_OFFSET) || defined (STACK_POINTER_OFFSET)
 #define MUST_ALIGN 1
 #else
-#define MUST_ALIGN (PREFERRED_STACK_BOUNDARY < BIGGEST_ALIGNMENT)
+#define MUST_ALIGN (PREFERRED_STACK_BOUNDARY < required_align)
 #endif
 
   if (MUST_ALIGN)
     size
       = force_operand (plus_constant (size,
-				      BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1),
+				      required_align / BITS_PER_UNIT - 1),
 		       NULL_RTX);
 
 #ifdef SETJMP_VIA_SAVE_AREA
@@ -1293,18 +1305,12 @@ allocate_dynamic_stack_space (rtx size, 
 
   if (MUST_ALIGN)
     {
-      /* CEIL_DIV_EXPR needs to worry about the addition overflowing,
-	 but we know it can't.  So add ourselves and then do
-	 TRUNC_DIV_EXPR.  */
       target = expand_binop (Pmode, add_optab, target,
-			     GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1),
+			     GEN_INT (required_align / BITS_PER_UNIT - 1),
 			     NULL_RTX, 1, OPTAB_LIB_WIDEN);
-      target = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, target,
-			      GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT),
-			      NULL_RTX, 1);
-      target = expand_mult (Pmode, target,
-			    GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT),
-			    NULL_RTX, 1);
+      target = expand_and (Pmode, target,
+			   GEN_INT (-required_align / BITS_PER_UNIT),
+			   NULL_RTX);
     }
 
   /* Record the new stack level for nonlocal gotos.  */
Index: Makefile.in
===================================================================
--- Makefile.in	(revision 124657)
+++ Makefile.in	(working copy)
@@ -2553,7 +2553,7 @@ cfgexpand.o : cfgexpand.c $(TREE_FLOW_H)
    $(RTL_H) $(TREE_H) $(TM_P_H) $(EXPR_H) $(FUNCTION_H) $(TIMEVAR_H) $(TM_H) \
    coretypes.h $(TREE_DUMP_H) except.h langhooks.h tree-pass.h $(RTL_H) \
    $(DIAGNOSTIC_H) toplev.h $(BASIC_BLOCK_H) $(FLAGS_H) debug.h $(PARAMS_H) \
-   value-prof.h
+   value-prof.h $(OPTABS_H)
 cfgrtl.o : cfgrtl.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
    $(FLAGS_H) insn-config.h $(BASIC_BLOCK_H) $(REGS_H) hard-reg-set.h \
    output.h toplev.h $(FUNCTION_H) except.h $(TM_P_H) insn-config.h $(EXPR_H) \
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 124657)
+++ config/i386/i386.c	(working copy)
@@ -5610,9 +5610,6 @@ ix86_compute_frame_layout (struct ix86_f
 
   gcc_assert (!size || stack_alignment_needed);
   gcc_assert (preferred_alignment >= STACK_BOUNDARY / BITS_PER_UNIT);
-  gcc_assert (preferred_alignment <= PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT);
-  gcc_assert (stack_alignment_needed
-	      <= PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT);
 
   if (stack_alignment_needed < STACK_BOUNDARY / BITS_PER_UNIT)
     stack_alignment_needed = STACK_BOUNDARY / BITS_PER_UNIT;
Index: stmt.c
===================================================================
--- stmt.c	(revision 124657)
+++ stmt.c	(working copy)
@@ -1850,6 +1850,7 @@ void
 expand_decl (tree decl)
 {
   tree type;
+  unsigned HOST_WIDE_INT align;
 
   type = TREE_TYPE (decl);
 
@@ -1874,6 +1875,13 @@ expand_decl (tree decl)
   if (TREE_STATIC (decl) || DECL_EXTERNAL (decl))
     return;
 
+  if (DECL_USER_ALIGN (decl))
+    align = DECL_ALIGN (decl);
+  else if (DECL_MODE (decl) == BLKmode)
+    align = BIGGEST_ALIGNMENT;
+  else
+    align = TYPE_ALIGN (type);
+
   /* Create the RTL representation for the variable.  */
 
   if (type == error_mark_node)
@@ -1941,10 +1949,16 @@ expand_decl (tree decl)
 	  oldaddr = XEXP (DECL_RTL (decl), 0);
 	}
 
-      /* Set alignment we actually gave this decl.  */
-      DECL_ALIGN (decl) = (DECL_MODE (decl) == BLKmode ? BIGGEST_ALIGNMENT
-			   : GET_MODE_BITSIZE (DECL_MODE (decl)));
-      DECL_USER_ALIGN (decl) = 0;
+      /* Set alignment we actually gave this decl if we don't have
+	 an user specific alignment and the alignment is less than
+	 the biggest alignment.  */
+      if (! DECL_USER_ALIGN (decl)
+	  || DECL_ALIGN (decl) <= BIGGEST_ALIGNMENT)
+        {
+	  DECL_ALIGN (decl) = (DECL_MODE (decl) == BLKmode ? BIGGEST_ALIGNMENT
+			       : GET_MODE_BITSIZE (DECL_MODE (decl)));
+	  DECL_USER_ALIGN (decl) = 0;
+	}
 
       x = assign_temp (decl, 1, 1, 1);
       set_mem_attributes (x, decl, 1);
@@ -1975,8 +1989,10 @@ expand_decl (tree decl)
 	 DECL_ALIGN says how the variable is to be aligned and we
 	 cannot use it to conclude anything about the alignment of
 	 the size.  */
-      address = allocate_dynamic_stack_space (size, NULL_RTX,
-					      TYPE_ALIGN (TREE_TYPE (decl)));
+      address = allocate_dynamic_stack_space_1 (size, NULL_RTX,
+						TYPE_ALIGN (TREE_TYPE (decl)),
+						MAX (TYPE_ALIGN (TREE_TYPE (decl)),
+						align));
 
       /* Reference the variable indirect through that rtx.  */
       x = gen_rtx_MEM (DECL_MODE (decl), address);

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for variables on the stack for greater than required alignment
  2007-05-18 17:42       ` Andrew_Pinski
@ 2007-07-02 23:56         ` Andrew Pinski
  2007-08-04 11:46           ` Andrew Pinski
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew Pinski @ 2007-07-02 23:56 UTC (permalink / raw)
  To: Andrew_Pinski
  Cc: trevor_smigiel, gcc-patches, Richard Henderson, Russell_Olsen

[-- Attachment #1: Type: text/plain, Size: 3350 bytes --]

Ping?

On 5/18/07, Andrew_Pinski@playstation.sony.com
<Andrew_Pinski@playstation.sony.com> wrote:
> Trevor Smigiel/R&D/SCEA@Playstation wrote on 05/17/2007 04:49:53 PM:
>
> > Andrew,
> >
> > It seems you missed merging a change that I made in our local tree.
> >
> > In cfgexpand.c, you removed this code:
> >
> > -  /* Set alignment we actually gave this decl.  */
> > -  offset -= frame_phase;
> > -  align = offset & -offset;
> > -  align *= BITS_PER_UNIT;
> > -  if (align > STACK_BOUNDARY || align == 0)
> > -    align = STACK_BOUNDARY;
> > -  DECL_ALIGN (decl) = align;
> > -  DECL_USER_ALIGN (decl) = 0;
> >
> > In our local tree that code is conditioned on
> >   if (!DECL_USER_ALIGN (decl))
> > (Which could be done more precisely.  Consider the case where a user
> > specified alignment is smaller than the decl's actual alignment on the
> > stack.)
> >
> > Without it less efficient code can be generated because higher
> > alignments are not propagated.  At least, that's the behaviour on 4.1.1.
>
> Here is the new patch which fixes that problem and we get the get the
> correct alignment on the variables now that we were getting a smaller
> alignment on with my older version of the patch.  It adds a testcase which
> verifies this is the case too (I only check on ia32/x86_64, powerpc* and
> spu as those are the only targets which I know the normal alignment of
> stack is and the only targets I could test the testcase on).
>
> OK? Bootstrapped and tested on i686-linux-gnu with no regressions.
>
> Thanks,
> Andrew Pinski
>
> ChangeLog:
>
>         * expr.h  (allocate_dynamic_stack_space_1): New function
> prototype.
>         * functionc.c (assign_temp): Take into account the alignment
>         of the temp if it is greater than the target's preferred
> alignment.
>         * cfgexpand.c: Include optabs.h.
>         (get_decl_align_unit): Update comment and don't lower the
> alignment
>         if it is greater than the target's preferred alignment.
>         (alloc_stack_frame_space): Take an unsigned HOST_WIDE_INT for
> align.
>         Take into account the variable's alignment if it is greater than
>         the target's preferred alignment.
>         (expand_one_stack_var_at):  Likewise.  Only reset the alignment of
> the decl to given
>         alignment if the alignment is less than the target's preferred
> alignment.
>         * explow.c (allocate_dynamic_stack_space_1): Split out from
>         allocate_dynamic_stack_space and take into acount the required
>         alignment.  Use AND opcode instead of shifting left and shifting
>         back right.
>         (allocate_dynamic_stack_space): Call
> allocate_dynamic_stack_space_1.
>         * Makefile.in (cfgexpand.o): Update dependecies.
>         * stmt.c (expand_decl):  Take into account the alignment
>         of the variable if it is greater than the target's preferred
>         alignment.
>         * config/i386/i386.c (ix86_compute_frame_layout): Don't assert
>         that preferred alignment is greater than the normal preferred
>         alignment.  Don't assert that the stack alignment needed is
> greater
>         than the normal preferred alignment.
>
>         * gcc.c-torture/execute/pr16660-1.c: New testcase.
>         * gcc.c-torture/execute/pr16660-2.c: New testcase.
>         * gcc.dg/pr16660-1.c: New testcase.
>
>
>
>

[-- Attachment #2: fixpr16660.diff.txt --]
[-- Type: text/plain, Size: 13936 bytes --]

Index: testsuite/gcc.c-torture/execute/pr16660-1.c
===================================================================
--- testsuite/gcc.c-torture/execute/pr16660-1.c	(revision 0)
+++ testsuite/gcc.c-torture/execute/pr16660-1.c	(revision 0)
@@ -0,0 +1,11 @@
+typedef __SIZE_TYPE__ size_t;
+#define ALIGNMENT 256
+int main(void)
+{
+  int a[ALIGNMENT/sizeof(int)] __attribute__((aligned(ALIGNMENT)));
+  size_t b = (size_t)&a;
+  if (b&(ALIGNMENT-1))
+    return 1;
+  return 0;
+}
+
Index: testsuite/gcc.c-torture/execute/pr16660-2.c
===================================================================
--- testsuite/gcc.c-torture/execute/pr16660-2.c	(revision 0)
+++ testsuite/gcc.c-torture/execute/pr16660-2.c	(revision 0)
@@ -0,0 +1,19 @@
+typedef __SIZE_TYPE__ size_t;
+#define ALIGNMENT 256
+int main(void)
+{
+  {
+    int a[ALIGNMENT/sizeof(int)] __attribute__((aligned(ALIGNMENT)));
+    size_t b = (size_t)&a;
+    if (b&(ALIGNMENT-1))
+      return 1;
+  }
+  {
+    int a1[ALIGNMENT/sizeof(int)] __attribute__((aligned(ALIGNMENT)));
+    size_t b = (size_t)&a1;
+    if (b&(ALIGNMENT-1))
+      return 1;
+  }
+  return 0;
+}
+
Index: testsuite/gcc.dg/pr16660-1.c
===================================================================
--- testsuite/gcc.dg/pr16660-1.c	(revision 0)
+++ testsuite/gcc.dg/pr16660-1.c	(revision 0)
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-rtl-expand" } */
+int g(int *);
+int f(void)
+{
+ int i __attribute__((aligned(1) ));
+ int *a = &i;
+ g(a);
+ return *a;
+}
+/* Even though the user supplied an alignment of 1, the memory
+  location for i should have the natural alignment of the stack.
+  Test only on x86, spu and powerpc for now.  */
+
+/* { dg-final { scan-rtl-dump-not "2 i\\\+0 S4 A8" "expand" } } */
+/* { dg-final { scan-rtl-dump "2 i\\\+0 S4 A32" "expand" { target i?86-*-* x86_64-*-* } } } */
+/* { dg-final { scan-rtl-dump "2 i\\\+0 S4 A64" "expand" { target { { rs6000-*-* powerpc*-*-* } && ilp32 } } } } */
+/* { dg-final { scan-rtl-dump "2 i\\\+0 S4 A128" "expand" { target { spu-*-* } || { { rs6000-*-* powerpc*-*-* } && lp64 } } } } */
+/* { dg-final { cleanup-rtl-dump "expand" } } */
Index: expr.h
===================================================================
--- expr.h	(revision 124657)
+++ expr.h	(working copy)
@@ -712,6 +712,7 @@ extern void update_nonlocal_goto_save_ar
 /* Allocate some space on the stack dynamically and return its address.  An rtx
    says how many bytes.  */
 extern rtx allocate_dynamic_stack_space (rtx, rtx, int);
+extern rtx allocate_dynamic_stack_space_1 (rtx, rtx, int, int);
 
 /* Probe a range of stack addresses from FIRST to FIRST+SIZE, inclusive.
    FIRST is a constant and size is a Pmode RTX.  These are offsets from the
Index: function.c
===================================================================
--- function.c	(revision 124657)
+++ function.c	(working copy)
@@ -813,6 +813,7 @@ assign_temp (tree type_or_decl, int keep
   if (mode == BLKmode || memory_required)
     {
       HOST_WIDE_INT size = int_size_in_bytes (type);
+      unsigned HOST_WIDE_INT align;
       rtx tmp;
 
       /* Zero sized arrays are GNU C extension.  Set size to 1 to avoid
@@ -837,7 +838,37 @@ assign_temp (tree type_or_decl, int keep
 	  size = 1;
 	}
 
-      tmp = assign_stack_temp_for_type (mode, size, keep, type);
+      align = TYPE_ALIGN (type);
+      if (decl && DECL_ALIGN (decl) > align)
+	align = DECL_ALIGN (decl);
+
+      /* When the user specifies an alignment larger than
+         BIGGEST_ALIGNMENT we need to allocate extra space so we can
+         adjust this decl's address.  */
+      if (align > PREFERRED_STACK_BOUNDARY)
+	{
+	  rtx addr;
+	  HOST_WIDE_INT pad = (align - PREFERRED_STACK_BOUNDARY) / BITS_PER_UNIT;
+	  size += pad;
+	  tmp = assign_stack_temp_for_type (mode, size, keep, type);
+	  addr = XEXP (tmp, 0);
+	  /* CEIL_DIV_EXPR needs to worry about the addition overflowing,
+	     but we know it can't.  So add ourselves and then do
+	     TRUNC_DIV_EXPR.  */
+	  addr = expand_binop (Pmode, add_optab, addr,
+			       GEN_INT (align / BITS_PER_UNIT - 1),
+			       NULL_RTX, 1, OPTAB_LIB_WIDEN);
+	  addr = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, addr,
+			        GEN_INT (align / BITS_PER_UNIT),
+			        NULL_RTX, 1);
+	  addr = expand_mult (Pmode, addr,
+			      GEN_INT (align / BITS_PER_UNIT),
+			      NULL_RTX, 1);
+	  tmp = change_address (tmp, VOIDmode, addr);
+	}
+      else
+	tmp = assign_stack_temp_for_type (mode, size, keep, type);
+
       return tmp;
     }
 
Index: cfgexpand.c
===================================================================
--- cfgexpand.c	(revision 124657)
+++ cfgexpand.c	(working copy)
@@ -41,6 +41,7 @@ Boston, MA 02110-1301, USA.  */
 #include "params.h"
 #include "tree-inline.h"
 #include "value-prof.h"
+#include "optabs.h"
 
 /* Verify that there is exactly single jump instruction since last and attach
    REG_BR_PROB note specifying probability.
@@ -151,8 +152,7 @@ static bool has_protected_decls;
    smaller than our cutoff threshold.  Used for -Wstack-protector.  */
 static bool has_short_buffer;
 
-/* Discover the byte alignment to use for DECL.  Ignore alignment
-   we can't do with expected alignment of the stack boundary.  */
+/* Discover the byte alignment to use for DECL.  */
 
 static unsigned int
 get_decl_align_unit (tree decl)
@@ -161,8 +161,6 @@ get_decl_align_unit (tree decl)
 
   align = DECL_ALIGN (decl);
   align = LOCAL_ALIGNMENT (TREE_TYPE (decl), align);
-  if (align > PREFERRED_STACK_BOUNDARY)
-    align = PREFERRED_STACK_BOUNDARY;
   if (cfun->stack_alignment_needed < align)
     cfun->stack_alignment_needed = align;
 
@@ -173,11 +171,21 @@ get_decl_align_unit (tree decl)
    Return the frame offset.  */
 
 static HOST_WIDE_INT
-alloc_stack_frame_space (HOST_WIDE_INT size, HOST_WIDE_INT align)
+alloc_stack_frame_space (HOST_WIDE_INT size, unsigned HOST_WIDE_INT align)
 {
   HOST_WIDE_INT offset, new_frame_offset;
 
   new_frame_offset = frame_offset;
+
+  /* Allocate extra space if the alignment required is greater than the
+     stack boundary and then assume the RTL expansion of the variable, gets
+     the correct alignment.  */
+  if (align > PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT)
+    {
+      size += align;
+      align = PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT;
+    }
+
   if (FRAME_GROWS_DOWNWARD)
     {
       new_frame_offset -= size + frame_phase;
@@ -523,23 +531,38 @@ dump_stack_var_partition (void)
 static void
 expand_one_stack_var_at (tree decl, HOST_WIDE_INT offset)
 {
-  HOST_WIDE_INT align;
+  unsigned HOST_WIDE_INT align;
   rtx x;
 
   /* If this fails, we've overflowed the stack frame.  Error nicely?  */
   gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
 
   x = plus_constant (virtual_stack_vars_rtx, offset);
-  x = gen_rtx_MEM (DECL_MODE (decl), x);
+  align = get_decl_align_unit (decl);
 
-  /* Set alignment we actually gave this decl.  */
-  offset -= frame_phase;
-  align = offset & -offset;
-  align *= BITS_PER_UNIT;
-  if (align > STACK_BOUNDARY || align == 0)
-    align = STACK_BOUNDARY;
-  DECL_ALIGN (decl) = align;
-  DECL_USER_ALIGN (decl) = 0;
+   if (align > PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT)
+    {
+      rtx addr = x;
+      addr = expand_binop (Pmode, add_optab, addr,
+			   GEN_INT (align - 1),
+			   NULL_RTX, 1, OPTAB_LIB_WIDEN);
+      addr = expand_and (Pmode, addr, GEN_INT (-align),
+			    NULL_RTX);
+      x = addr;
+    } 
+  else
+    {
+      /* Set alignment we actually gave this decl.  */
+      offset -= frame_phase;
+      align = offset & -offset;
+      align *= BITS_PER_UNIT;
+      if (align > STACK_BOUNDARY || align == 0)
+        align = STACK_BOUNDARY;
+      DECL_ALIGN (decl) = align;
+      DECL_USER_ALIGN (decl) = 0;
+    }
+
+  x = gen_rtx_MEM (DECL_MODE (decl), x);
 
   set_mem_attributes (x, decl, true);
   SET_DECL_RTL (decl, x);
Index: explow.c
===================================================================
--- explow.c	(revision 124657)
+++ explow.c	(working copy)
@@ -1093,16 +1093,28 @@ update_nonlocal_goto_save_area (void)
    TARGET is a place in which the address can be placed.
 
    KNOWN_ALIGN is the alignment (in bits) that we know SIZE has.  */
-
 rtx
 allocate_dynamic_stack_space (rtx size, rtx target, int known_align)
 {
+  return allocate_dynamic_stack_space_1 (size, target, known_align,
+				       BIGGEST_ALIGNMENT);
+}
+
+/* Like allocate_dynamic_stack_space expect provide a way, REQUIRED_ALGIN,
+   to say the required alignment of the address.  */
+rtx
+allocate_dynamic_stack_space_1 (rtx size, rtx target, int known_align,
+				int required_align)
+{
   /* If we're asking for zero bytes, it doesn't matter what we point
      to since we can't dereference it.  But return a reasonable
      address anyway.  */
   if (size == const0_rtx)
     return virtual_stack_dynamic_rtx;
 
+  if (required_align < BIGGEST_ALIGNMENT)
+    required_align = BIGGEST_ALIGNMENT;
+
   /* Otherwise, show we're calling alloca or equivalent.  */
   current_function_calls_alloca = 1;
 
@@ -1131,13 +1143,13 @@ allocate_dynamic_stack_space (rtx size, 
 #if defined (STACK_DYNAMIC_OFFSET) || defined (STACK_POINTER_OFFSET)
 #define MUST_ALIGN 1
 #else
-#define MUST_ALIGN (PREFERRED_STACK_BOUNDARY < BIGGEST_ALIGNMENT)
+#define MUST_ALIGN (PREFERRED_STACK_BOUNDARY < required_align)
 #endif
 
   if (MUST_ALIGN)
     size
       = force_operand (plus_constant (size,
-				      BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1),
+				      required_align / BITS_PER_UNIT - 1),
 		       NULL_RTX);
 
 #ifdef SETJMP_VIA_SAVE_AREA
@@ -1293,18 +1305,12 @@ allocate_dynamic_stack_space (rtx size, 
 
   if (MUST_ALIGN)
     {
-      /* CEIL_DIV_EXPR needs to worry about the addition overflowing,
-	 but we know it can't.  So add ourselves and then do
-	 TRUNC_DIV_EXPR.  */
       target = expand_binop (Pmode, add_optab, target,
-			     GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1),
+			     GEN_INT (required_align / BITS_PER_UNIT - 1),
 			     NULL_RTX, 1, OPTAB_LIB_WIDEN);
-      target = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, target,
-			      GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT),
-			      NULL_RTX, 1);
-      target = expand_mult (Pmode, target,
-			    GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT),
-			    NULL_RTX, 1);
+      target = expand_and (Pmode, target,
+			   GEN_INT (-required_align / BITS_PER_UNIT),
+			   NULL_RTX);
     }
 
   /* Record the new stack level for nonlocal gotos.  */
Index: Makefile.in
===================================================================
--- Makefile.in	(revision 124657)
+++ Makefile.in	(working copy)
@@ -2553,7 +2553,7 @@ cfgexpand.o : cfgexpand.c $(TREE_FLOW_H)
    $(RTL_H) $(TREE_H) $(TM_P_H) $(EXPR_H) $(FUNCTION_H) $(TIMEVAR_H) $(TM_H) \
    coretypes.h $(TREE_DUMP_H) except.h langhooks.h tree-pass.h $(RTL_H) \
    $(DIAGNOSTIC_H) toplev.h $(BASIC_BLOCK_H) $(FLAGS_H) debug.h $(PARAMS_H) \
-   value-prof.h
+   value-prof.h $(OPTABS_H)
 cfgrtl.o : cfgrtl.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
    $(FLAGS_H) insn-config.h $(BASIC_BLOCK_H) $(REGS_H) hard-reg-set.h \
    output.h toplev.h $(FUNCTION_H) except.h $(TM_P_H) insn-config.h $(EXPR_H) \
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 124657)
+++ config/i386/i386.c	(working copy)
@@ -5610,9 +5610,6 @@ ix86_compute_frame_layout (struct ix86_f
 
   gcc_assert (!size || stack_alignment_needed);
   gcc_assert (preferred_alignment >= STACK_BOUNDARY / BITS_PER_UNIT);
-  gcc_assert (preferred_alignment <= PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT);
-  gcc_assert (stack_alignment_needed
-	      <= PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT);
 
   if (stack_alignment_needed < STACK_BOUNDARY / BITS_PER_UNIT)
     stack_alignment_needed = STACK_BOUNDARY / BITS_PER_UNIT;
Index: stmt.c
===================================================================
--- stmt.c	(revision 124657)
+++ stmt.c	(working copy)
@@ -1850,6 +1850,7 @@ void
 expand_decl (tree decl)
 {
   tree type;
+  unsigned HOST_WIDE_INT align;
 
   type = TREE_TYPE (decl);
 
@@ -1874,6 +1875,13 @@ expand_decl (tree decl)
   if (TREE_STATIC (decl) || DECL_EXTERNAL (decl))
     return;
 
+  if (DECL_USER_ALIGN (decl))
+    align = DECL_ALIGN (decl);
+  else if (DECL_MODE (decl) == BLKmode)
+    align = BIGGEST_ALIGNMENT;
+  else
+    align = TYPE_ALIGN (type);
+
   /* Create the RTL representation for the variable.  */
 
   if (type == error_mark_node)
@@ -1941,10 +1949,16 @@ expand_decl (tree decl)
 	  oldaddr = XEXP (DECL_RTL (decl), 0);
 	}
 
-      /* Set alignment we actually gave this decl.  */
-      DECL_ALIGN (decl) = (DECL_MODE (decl) == BLKmode ? BIGGEST_ALIGNMENT
-			   : GET_MODE_BITSIZE (DECL_MODE (decl)));
-      DECL_USER_ALIGN (decl) = 0;
+      /* Set alignment we actually gave this decl if we don't have
+	 an user specific alignment and the alignment is less than
+	 the biggest alignment.  */
+      if (! DECL_USER_ALIGN (decl)
+	  || DECL_ALIGN (decl) <= BIGGEST_ALIGNMENT)
+        {
+	  DECL_ALIGN (decl) = (DECL_MODE (decl) == BLKmode ? BIGGEST_ALIGNMENT
+			       : GET_MODE_BITSIZE (DECL_MODE (decl)));
+	  DECL_USER_ALIGN (decl) = 0;
+	}
 
       x = assign_temp (decl, 1, 1, 1);
       set_mem_attributes (x, decl, 1);
@@ -1975,8 +1989,10 @@ expand_decl (tree decl)
 	 DECL_ALIGN says how the variable is to be aligned and we
 	 cannot use it to conclude anything about the alignment of
 	 the size.  */
-      address = allocate_dynamic_stack_space (size, NULL_RTX,
-					      TYPE_ALIGN (TREE_TYPE (decl)));
+      address = allocate_dynamic_stack_space_1 (size, NULL_RTX,
+						TYPE_ALIGN (TREE_TYPE (decl)),
+						MAX (TYPE_ALIGN (TREE_TYPE (decl)),
+						align));
 
       /* Reference the variable indirect through that rtx.  */
       x = gen_rtx_MEM (DECL_MODE (decl), address);

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for variables on the stack for greater than required alignment
  2007-07-02 23:56         ` Andrew Pinski
@ 2007-08-04 11:46           ` Andrew Pinski
  2007-08-28 20:08             ` Andrew Pinski
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew Pinski @ 2007-08-04 11:46 UTC (permalink / raw)
  To: Andrew_Pinski
  Cc: trevor_smigiel, gcc-patches, Richard Henderson, Russell_Olsen

[-- Attachment #1: Type: text/plain, Size: 3580 bytes --]

Ping?

On 7/2/07, Andrew Pinski <pinskia@gmail.com> wrote:
> Ping?
>
> On 5/18/07, Andrew_Pinski@playstation.sony.com
> <Andrew_Pinski@playstation.sony.com> wrote:
> > Trevor Smigiel/R&D/SCEA@Playstation wrote on 05/17/2007 04:49:53 PM:
> >
> > > Andrew,
> > >
> > > It seems you missed merging a change that I made in our local tree.
> > >
> > > In cfgexpand.c, you removed this code:
> > >
> > > -  /* Set alignment we actually gave this decl.  */
> > > -  offset -= frame_phase;
> > > -  align = offset & -offset;
> > > -  align *= BITS_PER_UNIT;
> > > -  if (align > STACK_BOUNDARY || align == 0)
> > > -    align = STACK_BOUNDARY;
> > > -  DECL_ALIGN (decl) = align;
> > > -  DECL_USER_ALIGN (decl) = 0;
> > >
> > > In our local tree that code is conditioned on
> > >   if (!DECL_USER_ALIGN (decl))
> > > (Which could be done more precisely.  Consider the case where a user
> > > specified alignment is smaller than the decl's actual alignment on the
> > > stack.)
> > >
> > > Without it less efficient code can be generated because higher
> > > alignments are not propagated.  At least, that's the behaviour on 4.1.1.
> >
> > Here is the new patch which fixes that problem and we get the get the
> > correct alignment on the variables now that we were getting a smaller
> > alignment on with my older version of the patch.  It adds a testcase which
> > verifies this is the case too (I only check on ia32/x86_64, powerpc* and
> > spu as those are the only targets which I know the normal alignment of
> > stack is and the only targets I could test the testcase on).
> >
> > OK? Bootstrapped and tested on i686-linux-gnu with no regressions.
> >
> > Thanks,
> > Andrew Pinski
> >
> > ChangeLog:
> >
> >         * expr.h  (allocate_dynamic_stack_space_1): New function
> > prototype.
> >         * functionc.c (assign_temp): Take into account the alignment
> >         of the temp if it is greater than the target's preferred
> > alignment.
> >         * cfgexpand.c: Include optabs.h.
> >         (get_decl_align_unit): Update comment and don't lower the
> > alignment
> >         if it is greater than the target's preferred alignment.
> >         (alloc_stack_frame_space): Take an unsigned HOST_WIDE_INT for
> > align.
> >         Take into account the variable's alignment if it is greater than
> >         the target's preferred alignment.
> >         (expand_one_stack_var_at):  Likewise.  Only reset the alignment of
> > the decl to given
> >         alignment if the alignment is less than the target's preferred
> > alignment.
> >         * explow.c (allocate_dynamic_stack_space_1): Split out from
> >         allocate_dynamic_stack_space and take into acount the required
> >         alignment.  Use AND opcode instead of shifting left and shifting
> >         back right.
> >         (allocate_dynamic_stack_space): Call
> > allocate_dynamic_stack_space_1.
> >         * Makefile.in (cfgexpand.o): Update dependecies.
> >         * stmt.c (expand_decl):  Take into account the alignment
> >         of the variable if it is greater than the target's preferred
> >         alignment.
> >         * config/i386/i386.c (ix86_compute_frame_layout): Don't assert
> >         that preferred alignment is greater than the normal preferred
> >         alignment.  Don't assert that the stack alignment needed is
> > greater
> >         than the normal preferred alignment.
> >
> >         * gcc.c-torture/execute/pr16660-1.c: New testcase.
> >         * gcc.c-torture/execute/pr16660-2.c: New testcase.
> >         * gcc.dg/pr16660-1.c: New testcase.
> >
> >
> >
> >
>
>

[-- Attachment #2: fixpr16660.diff.txt --]
[-- Type: text/plain, Size: 13936 bytes --]

Index: testsuite/gcc.c-torture/execute/pr16660-1.c
===================================================================
--- testsuite/gcc.c-torture/execute/pr16660-1.c	(revision 0)
+++ testsuite/gcc.c-torture/execute/pr16660-1.c	(revision 0)
@@ -0,0 +1,11 @@
+typedef __SIZE_TYPE__ size_t;
+#define ALIGNMENT 256
+int main(void)
+{
+  int a[ALIGNMENT/sizeof(int)] __attribute__((aligned(ALIGNMENT)));
+  size_t b = (size_t)&a;
+  if (b&(ALIGNMENT-1))
+    return 1;
+  return 0;
+}
+
Index: testsuite/gcc.c-torture/execute/pr16660-2.c
===================================================================
--- testsuite/gcc.c-torture/execute/pr16660-2.c	(revision 0)
+++ testsuite/gcc.c-torture/execute/pr16660-2.c	(revision 0)
@@ -0,0 +1,19 @@
+typedef __SIZE_TYPE__ size_t;
+#define ALIGNMENT 256
+int main(void)
+{
+  {
+    int a[ALIGNMENT/sizeof(int)] __attribute__((aligned(ALIGNMENT)));
+    size_t b = (size_t)&a;
+    if (b&(ALIGNMENT-1))
+      return 1;
+  }
+  {
+    int a1[ALIGNMENT/sizeof(int)] __attribute__((aligned(ALIGNMENT)));
+    size_t b = (size_t)&a1;
+    if (b&(ALIGNMENT-1))
+      return 1;
+  }
+  return 0;
+}
+
Index: testsuite/gcc.dg/pr16660-1.c
===================================================================
--- testsuite/gcc.dg/pr16660-1.c	(revision 0)
+++ testsuite/gcc.dg/pr16660-1.c	(revision 0)
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-rtl-expand" } */
+int g(int *);
+int f(void)
+{
+ int i __attribute__((aligned(1) ));
+ int *a = &i;
+ g(a);
+ return *a;
+}
+/* Even though the user supplied an alignment of 1, the memory
+  location for i should have the natural alignment of the stack.
+  Test only on x86, spu and powerpc for now.  */
+
+/* { dg-final { scan-rtl-dump-not "2 i\\\+0 S4 A8" "expand" } } */
+/* { dg-final { scan-rtl-dump "2 i\\\+0 S4 A32" "expand" { target i?86-*-* x86_64-*-* } } } */
+/* { dg-final { scan-rtl-dump "2 i\\\+0 S4 A64" "expand" { target { { rs6000-*-* powerpc*-*-* } && ilp32 } } } } */
+/* { dg-final { scan-rtl-dump "2 i\\\+0 S4 A128" "expand" { target { spu-*-* } || { { rs6000-*-* powerpc*-*-* } && lp64 } } } } */
+/* { dg-final { cleanup-rtl-dump "expand" } } */
Index: expr.h
===================================================================
--- expr.h	(revision 124657)
+++ expr.h	(working copy)
@@ -712,6 +712,7 @@ extern void update_nonlocal_goto_save_ar
 /* Allocate some space on the stack dynamically and return its address.  An rtx
    says how many bytes.  */
 extern rtx allocate_dynamic_stack_space (rtx, rtx, int);
+extern rtx allocate_dynamic_stack_space_1 (rtx, rtx, int, int);
 
 /* Probe a range of stack addresses from FIRST to FIRST+SIZE, inclusive.
    FIRST is a constant and size is a Pmode RTX.  These are offsets from the
Index: function.c
===================================================================
--- function.c	(revision 124657)
+++ function.c	(working copy)
@@ -813,6 +813,7 @@ assign_temp (tree type_or_decl, int keep
   if (mode == BLKmode || memory_required)
     {
       HOST_WIDE_INT size = int_size_in_bytes (type);
+      unsigned HOST_WIDE_INT align;
       rtx tmp;
 
       /* Zero sized arrays are GNU C extension.  Set size to 1 to avoid
@@ -837,7 +838,37 @@ assign_temp (tree type_or_decl, int keep
 	  size = 1;
 	}
 
-      tmp = assign_stack_temp_for_type (mode, size, keep, type);
+      align = TYPE_ALIGN (type);
+      if (decl && DECL_ALIGN (decl) > align)
+	align = DECL_ALIGN (decl);
+
+      /* When the user specifies an alignment larger than
+         BIGGEST_ALIGNMENT we need to allocate extra space so we can
+         adjust this decl's address.  */
+      if (align > PREFERRED_STACK_BOUNDARY)
+	{
+	  rtx addr;
+	  HOST_WIDE_INT pad = (align - PREFERRED_STACK_BOUNDARY) / BITS_PER_UNIT;
+	  size += pad;
+	  tmp = assign_stack_temp_for_type (mode, size, keep, type);
+	  addr = XEXP (tmp, 0);
+	  /* CEIL_DIV_EXPR needs to worry about the addition overflowing,
+	     but we know it can't.  So add ourselves and then do
+	     TRUNC_DIV_EXPR.  */
+	  addr = expand_binop (Pmode, add_optab, addr,
+			       GEN_INT (align / BITS_PER_UNIT - 1),
+			       NULL_RTX, 1, OPTAB_LIB_WIDEN);
+	  addr = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, addr,
+			        GEN_INT (align / BITS_PER_UNIT),
+			        NULL_RTX, 1);
+	  addr = expand_mult (Pmode, addr,
+			      GEN_INT (align / BITS_PER_UNIT),
+			      NULL_RTX, 1);
+	  tmp = change_address (tmp, VOIDmode, addr);
+	}
+      else
+	tmp = assign_stack_temp_for_type (mode, size, keep, type);
+
       return tmp;
     }
 
Index: cfgexpand.c
===================================================================
--- cfgexpand.c	(revision 124657)
+++ cfgexpand.c	(working copy)
@@ -41,6 +41,7 @@ Boston, MA 02110-1301, USA.  */
 #include "params.h"
 #include "tree-inline.h"
 #include "value-prof.h"
+#include "optabs.h"
 
 /* Verify that there is exactly single jump instruction since last and attach
    REG_BR_PROB note specifying probability.
@@ -151,8 +152,7 @@ static bool has_protected_decls;
    smaller than our cutoff threshold.  Used for -Wstack-protector.  */
 static bool has_short_buffer;
 
-/* Discover the byte alignment to use for DECL.  Ignore alignment
-   we can't do with expected alignment of the stack boundary.  */
+/* Discover the byte alignment to use for DECL.  */
 
 static unsigned int
 get_decl_align_unit (tree decl)
@@ -161,8 +161,6 @@ get_decl_align_unit (tree decl)
 
   align = DECL_ALIGN (decl);
   align = LOCAL_ALIGNMENT (TREE_TYPE (decl), align);
-  if (align > PREFERRED_STACK_BOUNDARY)
-    align = PREFERRED_STACK_BOUNDARY;
   if (cfun->stack_alignment_needed < align)
     cfun->stack_alignment_needed = align;
 
@@ -173,11 +171,21 @@ get_decl_align_unit (tree decl)
    Return the frame offset.  */
 
 static HOST_WIDE_INT
-alloc_stack_frame_space (HOST_WIDE_INT size, HOST_WIDE_INT align)
+alloc_stack_frame_space (HOST_WIDE_INT size, unsigned HOST_WIDE_INT align)
 {
   HOST_WIDE_INT offset, new_frame_offset;
 
   new_frame_offset = frame_offset;
+
+  /* Allocate extra space if the alignment required is greater than the
+     stack boundary and then assume the RTL expansion of the variable, gets
+     the correct alignment.  */
+  if (align > PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT)
+    {
+      size += align;
+      align = PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT;
+    }
+
   if (FRAME_GROWS_DOWNWARD)
     {
       new_frame_offset -= size + frame_phase;
@@ -523,23 +531,38 @@ dump_stack_var_partition (void)
 static void
 expand_one_stack_var_at (tree decl, HOST_WIDE_INT offset)
 {
-  HOST_WIDE_INT align;
+  unsigned HOST_WIDE_INT align;
   rtx x;
 
   /* If this fails, we've overflowed the stack frame.  Error nicely?  */
   gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
 
   x = plus_constant (virtual_stack_vars_rtx, offset);
-  x = gen_rtx_MEM (DECL_MODE (decl), x);
+  align = get_decl_align_unit (decl);
 
-  /* Set alignment we actually gave this decl.  */
-  offset -= frame_phase;
-  align = offset & -offset;
-  align *= BITS_PER_UNIT;
-  if (align > STACK_BOUNDARY || align == 0)
-    align = STACK_BOUNDARY;
-  DECL_ALIGN (decl) = align;
-  DECL_USER_ALIGN (decl) = 0;
+   if (align > PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT)
+    {
+      rtx addr = x;
+      addr = expand_binop (Pmode, add_optab, addr,
+			   GEN_INT (align - 1),
+			   NULL_RTX, 1, OPTAB_LIB_WIDEN);
+      addr = expand_and (Pmode, addr, GEN_INT (-align),
+			    NULL_RTX);
+      x = addr;
+    } 
+  else
+    {
+      /* Set alignment we actually gave this decl.  */
+      offset -= frame_phase;
+      align = offset & -offset;
+      align *= BITS_PER_UNIT;
+      if (align > STACK_BOUNDARY || align == 0)
+        align = STACK_BOUNDARY;
+      DECL_ALIGN (decl) = align;
+      DECL_USER_ALIGN (decl) = 0;
+    }
+
+  x = gen_rtx_MEM (DECL_MODE (decl), x);
 
   set_mem_attributes (x, decl, true);
   SET_DECL_RTL (decl, x);
Index: explow.c
===================================================================
--- explow.c	(revision 124657)
+++ explow.c	(working copy)
@@ -1093,16 +1093,28 @@ update_nonlocal_goto_save_area (void)
    TARGET is a place in which the address can be placed.
 
    KNOWN_ALIGN is the alignment (in bits) that we know SIZE has.  */
-
 rtx
 allocate_dynamic_stack_space (rtx size, rtx target, int known_align)
 {
+  return allocate_dynamic_stack_space_1 (size, target, known_align,
+				       BIGGEST_ALIGNMENT);
+}
+
+/* Like allocate_dynamic_stack_space expect provide a way, REQUIRED_ALGIN,
+   to say the required alignment of the address.  */
+rtx
+allocate_dynamic_stack_space_1 (rtx size, rtx target, int known_align,
+				int required_align)
+{
   /* If we're asking for zero bytes, it doesn't matter what we point
      to since we can't dereference it.  But return a reasonable
      address anyway.  */
   if (size == const0_rtx)
     return virtual_stack_dynamic_rtx;
 
+  if (required_align < BIGGEST_ALIGNMENT)
+    required_align = BIGGEST_ALIGNMENT;
+
   /* Otherwise, show we're calling alloca or equivalent.  */
   current_function_calls_alloca = 1;
 
@@ -1131,13 +1143,13 @@ allocate_dynamic_stack_space (rtx size, 
 #if defined (STACK_DYNAMIC_OFFSET) || defined (STACK_POINTER_OFFSET)
 #define MUST_ALIGN 1
 #else
-#define MUST_ALIGN (PREFERRED_STACK_BOUNDARY < BIGGEST_ALIGNMENT)
+#define MUST_ALIGN (PREFERRED_STACK_BOUNDARY < required_align)
 #endif
 
   if (MUST_ALIGN)
     size
       = force_operand (plus_constant (size,
-				      BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1),
+				      required_align / BITS_PER_UNIT - 1),
 		       NULL_RTX);
 
 #ifdef SETJMP_VIA_SAVE_AREA
@@ -1293,18 +1305,12 @@ allocate_dynamic_stack_space (rtx size, 
 
   if (MUST_ALIGN)
     {
-      /* CEIL_DIV_EXPR needs to worry about the addition overflowing,
-	 but we know it can't.  So add ourselves and then do
-	 TRUNC_DIV_EXPR.  */
       target = expand_binop (Pmode, add_optab, target,
-			     GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1),
+			     GEN_INT (required_align / BITS_PER_UNIT - 1),
 			     NULL_RTX, 1, OPTAB_LIB_WIDEN);
-      target = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, target,
-			      GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT),
-			      NULL_RTX, 1);
-      target = expand_mult (Pmode, target,
-			    GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT),
-			    NULL_RTX, 1);
+      target = expand_and (Pmode, target,
+			   GEN_INT (-required_align / BITS_PER_UNIT),
+			   NULL_RTX);
     }
 
   /* Record the new stack level for nonlocal gotos.  */
Index: Makefile.in
===================================================================
--- Makefile.in	(revision 124657)
+++ Makefile.in	(working copy)
@@ -2553,7 +2553,7 @@ cfgexpand.o : cfgexpand.c $(TREE_FLOW_H)
    $(RTL_H) $(TREE_H) $(TM_P_H) $(EXPR_H) $(FUNCTION_H) $(TIMEVAR_H) $(TM_H) \
    coretypes.h $(TREE_DUMP_H) except.h langhooks.h tree-pass.h $(RTL_H) \
    $(DIAGNOSTIC_H) toplev.h $(BASIC_BLOCK_H) $(FLAGS_H) debug.h $(PARAMS_H) \
-   value-prof.h
+   value-prof.h $(OPTABS_H)
 cfgrtl.o : cfgrtl.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
    $(FLAGS_H) insn-config.h $(BASIC_BLOCK_H) $(REGS_H) hard-reg-set.h \
    output.h toplev.h $(FUNCTION_H) except.h $(TM_P_H) insn-config.h $(EXPR_H) \
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 124657)
+++ config/i386/i386.c	(working copy)
@@ -5610,9 +5610,6 @@ ix86_compute_frame_layout (struct ix86_f
 
   gcc_assert (!size || stack_alignment_needed);
   gcc_assert (preferred_alignment >= STACK_BOUNDARY / BITS_PER_UNIT);
-  gcc_assert (preferred_alignment <= PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT);
-  gcc_assert (stack_alignment_needed
-	      <= PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT);
 
   if (stack_alignment_needed < STACK_BOUNDARY / BITS_PER_UNIT)
     stack_alignment_needed = STACK_BOUNDARY / BITS_PER_UNIT;
Index: stmt.c
===================================================================
--- stmt.c	(revision 124657)
+++ stmt.c	(working copy)
@@ -1850,6 +1850,7 @@ void
 expand_decl (tree decl)
 {
   tree type;
+  unsigned HOST_WIDE_INT align;
 
   type = TREE_TYPE (decl);
 
@@ -1874,6 +1875,13 @@ expand_decl (tree decl)
   if (TREE_STATIC (decl) || DECL_EXTERNAL (decl))
     return;
 
+  if (DECL_USER_ALIGN (decl))
+    align = DECL_ALIGN (decl);
+  else if (DECL_MODE (decl) == BLKmode)
+    align = BIGGEST_ALIGNMENT;
+  else
+    align = TYPE_ALIGN (type);
+
   /* Create the RTL representation for the variable.  */
 
   if (type == error_mark_node)
@@ -1941,10 +1949,16 @@ expand_decl (tree decl)
 	  oldaddr = XEXP (DECL_RTL (decl), 0);
 	}
 
-      /* Set alignment we actually gave this decl.  */
-      DECL_ALIGN (decl) = (DECL_MODE (decl) == BLKmode ? BIGGEST_ALIGNMENT
-			   : GET_MODE_BITSIZE (DECL_MODE (decl)));
-      DECL_USER_ALIGN (decl) = 0;
+      /* Set alignment we actually gave this decl if we don't have
+	 an user specific alignment and the alignment is less than
+	 the biggest alignment.  */
+      if (! DECL_USER_ALIGN (decl)
+	  || DECL_ALIGN (decl) <= BIGGEST_ALIGNMENT)
+        {
+	  DECL_ALIGN (decl) = (DECL_MODE (decl) == BLKmode ? BIGGEST_ALIGNMENT
+			       : GET_MODE_BITSIZE (DECL_MODE (decl)));
+	  DECL_USER_ALIGN (decl) = 0;
+	}
 
       x = assign_temp (decl, 1, 1, 1);
       set_mem_attributes (x, decl, 1);
@@ -1975,8 +1989,10 @@ expand_decl (tree decl)
 	 DECL_ALIGN says how the variable is to be aligned and we
 	 cannot use it to conclude anything about the alignment of
 	 the size.  */
-      address = allocate_dynamic_stack_space (size, NULL_RTX,
-					      TYPE_ALIGN (TREE_TYPE (decl)));
+      address = allocate_dynamic_stack_space_1 (size, NULL_RTX,
+						TYPE_ALIGN (TREE_TYPE (decl)),
+						MAX (TYPE_ALIGN (TREE_TYPE (decl)),
+						align));
 
       /* Reference the variable indirect through that rtx.  */
       x = gen_rtx_MEM (DECL_MODE (decl), address);

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for variables on the stack for greater than required alignment
  2007-08-04 11:46           ` Andrew Pinski
@ 2007-08-28 20:08             ` Andrew Pinski
  2007-09-17 22:22               ` Andrew Pinski
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew Pinski @ 2007-08-28 20:08 UTC (permalink / raw)
  To: Andrew_Pinski
  Cc: trevor_smigiel, gcc-patches, Richard Henderson, Russell_Olsen

[-- Attachment #1: Type: text/plain, Size: 3820 bytes --]

Ping?

On 8/4/07, Andrew Pinski <pinskia@gmail.com> wrote:
> Ping?
>
> On 7/2/07, Andrew Pinski <pinskia@gmail.com> wrote:
> > Ping?
> >
> > On 5/18/07, Andrew_Pinski@playstation.sony.com
> > <Andrew_Pinski@playstation.sony.com> wrote:
> > > Trevor Smigiel/R&D/SCEA@Playstation wrote on 05/17/2007 04:49:53 PM:
> > >
> > > > Andrew,
> > > >
> > > > It seems you missed merging a change that I made in our local tree.
> > > >
> > > > In cfgexpand.c, you removed this code:
> > > >
> > > > -  /* Set alignment we actually gave this decl.  */
> > > > -  offset -= frame_phase;
> > > > -  align = offset & -offset;
> > > > -  align *= BITS_PER_UNIT;
> > > > -  if (align > STACK_BOUNDARY || align == 0)
> > > > -    align = STACK_BOUNDARY;
> > > > -  DECL_ALIGN (decl) = align;
> > > > -  DECL_USER_ALIGN (decl) = 0;
> > > >
> > > > In our local tree that code is conditioned on
> > > >   if (!DECL_USER_ALIGN (decl))
> > > > (Which could be done more precisely.  Consider the case where a user
> > > > specified alignment is smaller than the decl's actual alignment on the
> > > > stack.)
> > > >
> > > > Without it less efficient code can be generated because higher
> > > > alignments are not propagated.  At least, that's the behaviour on 4.1.1.
> > >
> > > Here is the new patch which fixes that problem and we get the get the
> > > correct alignment on the variables now that we were getting a smaller
> > > alignment on with my older version of the patch.  It adds a testcase which
> > > verifies this is the case too (I only check on ia32/x86_64, powerpc* and
> > > spu as those are the only targets which I know the normal alignment of
> > > stack is and the only targets I could test the testcase on).
> > >
> > > OK? Bootstrapped and tested on i686-linux-gnu with no regressions.
> > >
> > > Thanks,
> > > Andrew Pinski
> > >
> > > ChangeLog:
> > >
> > >         * expr.h  (allocate_dynamic_stack_space_1): New function
> > > prototype.
> > >         * functionc.c (assign_temp): Take into account the alignment
> > >         of the temp if it is greater than the target's preferred
> > > alignment.
> > >         * cfgexpand.c: Include optabs.h.
> > >         (get_decl_align_unit): Update comment and don't lower the
> > > alignment
> > >         if it is greater than the target's preferred alignment.
> > >         (alloc_stack_frame_space): Take an unsigned HOST_WIDE_INT for
> > > align.
> > >         Take into account the variable's alignment if it is greater than
> > >         the target's preferred alignment.
> > >         (expand_one_stack_var_at):  Likewise.  Only reset the alignment of
> > > the decl to given
> > >         alignment if the alignment is less than the target's preferred
> > > alignment.
> > >         * explow.c (allocate_dynamic_stack_space_1): Split out from
> > >         allocate_dynamic_stack_space and take into acount the required
> > >         alignment.  Use AND opcode instead of shifting left and shifting
> > >         back right.
> > >         (allocate_dynamic_stack_space): Call
> > > allocate_dynamic_stack_space_1.
> > >         * Makefile.in (cfgexpand.o): Update dependecies.
> > >         * stmt.c (expand_decl):  Take into account the alignment
> > >         of the variable if it is greater than the target's preferred
> > >         alignment.
> > >         * config/i386/i386.c (ix86_compute_frame_layout): Don't assert
> > >         that preferred alignment is greater than the normal preferred
> > >         alignment.  Don't assert that the stack alignment needed is
> > > greater
> > >         than the normal preferred alignment.
> > >
> > >         * gcc.c-torture/execute/pr16660-1.c: New testcase.
> > >         * gcc.c-torture/execute/pr16660-2.c: New testcase.
> > >         * gcc.dg/pr16660-1.c: New testcase.
> > >
> > >
> > >
> > >
> >
> >
>
>

[-- Attachment #2: fixpr16660.diff.txt --]
[-- Type: text/plain, Size: 13936 bytes --]

Index: testsuite/gcc.c-torture/execute/pr16660-1.c
===================================================================
--- testsuite/gcc.c-torture/execute/pr16660-1.c	(revision 0)
+++ testsuite/gcc.c-torture/execute/pr16660-1.c	(revision 0)
@@ -0,0 +1,11 @@
+typedef __SIZE_TYPE__ size_t;
+#define ALIGNMENT 256
+int main(void)
+{
+  int a[ALIGNMENT/sizeof(int)] __attribute__((aligned(ALIGNMENT)));
+  size_t b = (size_t)&a;
+  if (b&(ALIGNMENT-1))
+    return 1;
+  return 0;
+}
+
Index: testsuite/gcc.c-torture/execute/pr16660-2.c
===================================================================
--- testsuite/gcc.c-torture/execute/pr16660-2.c	(revision 0)
+++ testsuite/gcc.c-torture/execute/pr16660-2.c	(revision 0)
@@ -0,0 +1,19 @@
+typedef __SIZE_TYPE__ size_t;
+#define ALIGNMENT 256
+int main(void)
+{
+  {
+    int a[ALIGNMENT/sizeof(int)] __attribute__((aligned(ALIGNMENT)));
+    size_t b = (size_t)&a;
+    if (b&(ALIGNMENT-1))
+      return 1;
+  }
+  {
+    int a1[ALIGNMENT/sizeof(int)] __attribute__((aligned(ALIGNMENT)));
+    size_t b = (size_t)&a1;
+    if (b&(ALIGNMENT-1))
+      return 1;
+  }
+  return 0;
+}
+
Index: testsuite/gcc.dg/pr16660-1.c
===================================================================
--- testsuite/gcc.dg/pr16660-1.c	(revision 0)
+++ testsuite/gcc.dg/pr16660-1.c	(revision 0)
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-rtl-expand" } */
+int g(int *);
+int f(void)
+{
+ int i __attribute__((aligned(1) ));
+ int *a = &i;
+ g(a);
+ return *a;
+}
+/* Even though the user supplied an alignment of 1, the memory
+  location for i should have the natural alignment of the stack.
+  Test only on x86, spu and powerpc for now.  */
+
+/* { dg-final { scan-rtl-dump-not "2 i\\\+0 S4 A8" "expand" } } */
+/* { dg-final { scan-rtl-dump "2 i\\\+0 S4 A32" "expand" { target i?86-*-* x86_64-*-* } } } */
+/* { dg-final { scan-rtl-dump "2 i\\\+0 S4 A64" "expand" { target { { rs6000-*-* powerpc*-*-* } && ilp32 } } } } */
+/* { dg-final { scan-rtl-dump "2 i\\\+0 S4 A128" "expand" { target { spu-*-* } || { { rs6000-*-* powerpc*-*-* } && lp64 } } } } */
+/* { dg-final { cleanup-rtl-dump "expand" } } */
Index: expr.h
===================================================================
--- expr.h	(revision 124657)
+++ expr.h	(working copy)
@@ -712,6 +712,7 @@ extern void update_nonlocal_goto_save_ar
 /* Allocate some space on the stack dynamically and return its address.  An rtx
    says how many bytes.  */
 extern rtx allocate_dynamic_stack_space (rtx, rtx, int);
+extern rtx allocate_dynamic_stack_space_1 (rtx, rtx, int, int);
 
 /* Probe a range of stack addresses from FIRST to FIRST+SIZE, inclusive.
    FIRST is a constant and size is a Pmode RTX.  These are offsets from the
Index: function.c
===================================================================
--- function.c	(revision 124657)
+++ function.c	(working copy)
@@ -813,6 +813,7 @@ assign_temp (tree type_or_decl, int keep
   if (mode == BLKmode || memory_required)
     {
       HOST_WIDE_INT size = int_size_in_bytes (type);
+      unsigned HOST_WIDE_INT align;
       rtx tmp;
 
       /* Zero sized arrays are GNU C extension.  Set size to 1 to avoid
@@ -837,7 +838,37 @@ assign_temp (tree type_or_decl, int keep
 	  size = 1;
 	}
 
-      tmp = assign_stack_temp_for_type (mode, size, keep, type);
+      align = TYPE_ALIGN (type);
+      if (decl && DECL_ALIGN (decl) > align)
+	align = DECL_ALIGN (decl);
+
+      /* When the user specifies an alignment larger than
+         BIGGEST_ALIGNMENT we need to allocate extra space so we can
+         adjust this decl's address.  */
+      if (align > PREFERRED_STACK_BOUNDARY)
+	{
+	  rtx addr;
+	  HOST_WIDE_INT pad = (align - PREFERRED_STACK_BOUNDARY) / BITS_PER_UNIT;
+	  size += pad;
+	  tmp = assign_stack_temp_for_type (mode, size, keep, type);
+	  addr = XEXP (tmp, 0);
+	  /* CEIL_DIV_EXPR needs to worry about the addition overflowing,
+	     but we know it can't.  So add ourselves and then do
+	     TRUNC_DIV_EXPR.  */
+	  addr = expand_binop (Pmode, add_optab, addr,
+			       GEN_INT (align / BITS_PER_UNIT - 1),
+			       NULL_RTX, 1, OPTAB_LIB_WIDEN);
+	  addr = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, addr,
+			        GEN_INT (align / BITS_PER_UNIT),
+			        NULL_RTX, 1);
+	  addr = expand_mult (Pmode, addr,
+			      GEN_INT (align / BITS_PER_UNIT),
+			      NULL_RTX, 1);
+	  tmp = change_address (tmp, VOIDmode, addr);
+	}
+      else
+	tmp = assign_stack_temp_for_type (mode, size, keep, type);
+
       return tmp;
     }
 
Index: cfgexpand.c
===================================================================
--- cfgexpand.c	(revision 124657)
+++ cfgexpand.c	(working copy)
@@ -41,6 +41,7 @@ Boston, MA 02110-1301, USA.  */
 #include "params.h"
 #include "tree-inline.h"
 #include "value-prof.h"
+#include "optabs.h"
 
 /* Verify that there is exactly single jump instruction since last and attach
    REG_BR_PROB note specifying probability.
@@ -151,8 +152,7 @@ static bool has_protected_decls;
    smaller than our cutoff threshold.  Used for -Wstack-protector.  */
 static bool has_short_buffer;
 
-/* Discover the byte alignment to use for DECL.  Ignore alignment
-   we can't do with expected alignment of the stack boundary.  */
+/* Discover the byte alignment to use for DECL.  */
 
 static unsigned int
 get_decl_align_unit (tree decl)
@@ -161,8 +161,6 @@ get_decl_align_unit (tree decl)
 
   align = DECL_ALIGN (decl);
   align = LOCAL_ALIGNMENT (TREE_TYPE (decl), align);
-  if (align > PREFERRED_STACK_BOUNDARY)
-    align = PREFERRED_STACK_BOUNDARY;
   if (cfun->stack_alignment_needed < align)
     cfun->stack_alignment_needed = align;
 
@@ -173,11 +171,21 @@ get_decl_align_unit (tree decl)
    Return the frame offset.  */
 
 static HOST_WIDE_INT
-alloc_stack_frame_space (HOST_WIDE_INT size, HOST_WIDE_INT align)
+alloc_stack_frame_space (HOST_WIDE_INT size, unsigned HOST_WIDE_INT align)
 {
   HOST_WIDE_INT offset, new_frame_offset;
 
   new_frame_offset = frame_offset;
+
+  /* Allocate extra space if the alignment required is greater than the
+     stack boundary and then assume the RTL expansion of the variable, gets
+     the correct alignment.  */
+  if (align > PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT)
+    {
+      size += align;
+      align = PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT;
+    }
+
   if (FRAME_GROWS_DOWNWARD)
     {
       new_frame_offset -= size + frame_phase;
@@ -523,23 +531,38 @@ dump_stack_var_partition (void)
 static void
 expand_one_stack_var_at (tree decl, HOST_WIDE_INT offset)
 {
-  HOST_WIDE_INT align;
+  unsigned HOST_WIDE_INT align;
   rtx x;
 
   /* If this fails, we've overflowed the stack frame.  Error nicely?  */
   gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
 
   x = plus_constant (virtual_stack_vars_rtx, offset);
-  x = gen_rtx_MEM (DECL_MODE (decl), x);
+  align = get_decl_align_unit (decl);
 
-  /* Set alignment we actually gave this decl.  */
-  offset -= frame_phase;
-  align = offset & -offset;
-  align *= BITS_PER_UNIT;
-  if (align > STACK_BOUNDARY || align == 0)
-    align = STACK_BOUNDARY;
-  DECL_ALIGN (decl) = align;
-  DECL_USER_ALIGN (decl) = 0;
+   if (align > PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT)
+    {
+      rtx addr = x;
+      addr = expand_binop (Pmode, add_optab, addr,
+			   GEN_INT (align - 1),
+			   NULL_RTX, 1, OPTAB_LIB_WIDEN);
+      addr = expand_and (Pmode, addr, GEN_INT (-align),
+			    NULL_RTX);
+      x = addr;
+    } 
+  else
+    {
+      /* Set alignment we actually gave this decl.  */
+      offset -= frame_phase;
+      align = offset & -offset;
+      align *= BITS_PER_UNIT;
+      if (align > STACK_BOUNDARY || align == 0)
+        align = STACK_BOUNDARY;
+      DECL_ALIGN (decl) = align;
+      DECL_USER_ALIGN (decl) = 0;
+    }
+
+  x = gen_rtx_MEM (DECL_MODE (decl), x);
 
   set_mem_attributes (x, decl, true);
   SET_DECL_RTL (decl, x);
Index: explow.c
===================================================================
--- explow.c	(revision 124657)
+++ explow.c	(working copy)
@@ -1093,16 +1093,28 @@ update_nonlocal_goto_save_area (void)
    TARGET is a place in which the address can be placed.
 
    KNOWN_ALIGN is the alignment (in bits) that we know SIZE has.  */
-
 rtx
 allocate_dynamic_stack_space (rtx size, rtx target, int known_align)
 {
+  return allocate_dynamic_stack_space_1 (size, target, known_align,
+				       BIGGEST_ALIGNMENT);
+}
+
+/* Like allocate_dynamic_stack_space expect provide a way, REQUIRED_ALGIN,
+   to say the required alignment of the address.  */
+rtx
+allocate_dynamic_stack_space_1 (rtx size, rtx target, int known_align,
+				int required_align)
+{
   /* If we're asking for zero bytes, it doesn't matter what we point
      to since we can't dereference it.  But return a reasonable
      address anyway.  */
   if (size == const0_rtx)
     return virtual_stack_dynamic_rtx;
 
+  if (required_align < BIGGEST_ALIGNMENT)
+    required_align = BIGGEST_ALIGNMENT;
+
   /* Otherwise, show we're calling alloca or equivalent.  */
   current_function_calls_alloca = 1;
 
@@ -1131,13 +1143,13 @@ allocate_dynamic_stack_space (rtx size, 
 #if defined (STACK_DYNAMIC_OFFSET) || defined (STACK_POINTER_OFFSET)
 #define MUST_ALIGN 1
 #else
-#define MUST_ALIGN (PREFERRED_STACK_BOUNDARY < BIGGEST_ALIGNMENT)
+#define MUST_ALIGN (PREFERRED_STACK_BOUNDARY < required_align)
 #endif
 
   if (MUST_ALIGN)
     size
       = force_operand (plus_constant (size,
-				      BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1),
+				      required_align / BITS_PER_UNIT - 1),
 		       NULL_RTX);
 
 #ifdef SETJMP_VIA_SAVE_AREA
@@ -1293,18 +1305,12 @@ allocate_dynamic_stack_space (rtx size, 
 
   if (MUST_ALIGN)
     {
-      /* CEIL_DIV_EXPR needs to worry about the addition overflowing,
-	 but we know it can't.  So add ourselves and then do
-	 TRUNC_DIV_EXPR.  */
       target = expand_binop (Pmode, add_optab, target,
-			     GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1),
+			     GEN_INT (required_align / BITS_PER_UNIT - 1),
 			     NULL_RTX, 1, OPTAB_LIB_WIDEN);
-      target = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, target,
-			      GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT),
-			      NULL_RTX, 1);
-      target = expand_mult (Pmode, target,
-			    GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT),
-			    NULL_RTX, 1);
+      target = expand_and (Pmode, target,
+			   GEN_INT (-required_align / BITS_PER_UNIT),
+			   NULL_RTX);
     }
 
   /* Record the new stack level for nonlocal gotos.  */
Index: Makefile.in
===================================================================
--- Makefile.in	(revision 124657)
+++ Makefile.in	(working copy)
@@ -2553,7 +2553,7 @@ cfgexpand.o : cfgexpand.c $(TREE_FLOW_H)
    $(RTL_H) $(TREE_H) $(TM_P_H) $(EXPR_H) $(FUNCTION_H) $(TIMEVAR_H) $(TM_H) \
    coretypes.h $(TREE_DUMP_H) except.h langhooks.h tree-pass.h $(RTL_H) \
    $(DIAGNOSTIC_H) toplev.h $(BASIC_BLOCK_H) $(FLAGS_H) debug.h $(PARAMS_H) \
-   value-prof.h
+   value-prof.h $(OPTABS_H)
 cfgrtl.o : cfgrtl.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
    $(FLAGS_H) insn-config.h $(BASIC_BLOCK_H) $(REGS_H) hard-reg-set.h \
    output.h toplev.h $(FUNCTION_H) except.h $(TM_P_H) insn-config.h $(EXPR_H) \
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 124657)
+++ config/i386/i386.c	(working copy)
@@ -5610,9 +5610,6 @@ ix86_compute_frame_layout (struct ix86_f
 
   gcc_assert (!size || stack_alignment_needed);
   gcc_assert (preferred_alignment >= STACK_BOUNDARY / BITS_PER_UNIT);
-  gcc_assert (preferred_alignment <= PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT);
-  gcc_assert (stack_alignment_needed
-	      <= PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT);
 
   if (stack_alignment_needed < STACK_BOUNDARY / BITS_PER_UNIT)
     stack_alignment_needed = STACK_BOUNDARY / BITS_PER_UNIT;
Index: stmt.c
===================================================================
--- stmt.c	(revision 124657)
+++ stmt.c	(working copy)
@@ -1850,6 +1850,7 @@ void
 expand_decl (tree decl)
 {
   tree type;
+  unsigned HOST_WIDE_INT align;
 
   type = TREE_TYPE (decl);
 
@@ -1874,6 +1875,13 @@ expand_decl (tree decl)
   if (TREE_STATIC (decl) || DECL_EXTERNAL (decl))
     return;
 
+  if (DECL_USER_ALIGN (decl))
+    align = DECL_ALIGN (decl);
+  else if (DECL_MODE (decl) == BLKmode)
+    align = BIGGEST_ALIGNMENT;
+  else
+    align = TYPE_ALIGN (type);
+
   /* Create the RTL representation for the variable.  */
 
   if (type == error_mark_node)
@@ -1941,10 +1949,16 @@ expand_decl (tree decl)
 	  oldaddr = XEXP (DECL_RTL (decl), 0);
 	}
 
-      /* Set alignment we actually gave this decl.  */
-      DECL_ALIGN (decl) = (DECL_MODE (decl) == BLKmode ? BIGGEST_ALIGNMENT
-			   : GET_MODE_BITSIZE (DECL_MODE (decl)));
-      DECL_USER_ALIGN (decl) = 0;
+      /* Set alignment we actually gave this decl if we don't have
+	 an user specific alignment and the alignment is less than
+	 the biggest alignment.  */
+      if (! DECL_USER_ALIGN (decl)
+	  || DECL_ALIGN (decl) <= BIGGEST_ALIGNMENT)
+        {
+	  DECL_ALIGN (decl) = (DECL_MODE (decl) == BLKmode ? BIGGEST_ALIGNMENT
+			       : GET_MODE_BITSIZE (DECL_MODE (decl)));
+	  DECL_USER_ALIGN (decl) = 0;
+	}
 
       x = assign_temp (decl, 1, 1, 1);
       set_mem_attributes (x, decl, 1);
@@ -1975,8 +1989,10 @@ expand_decl (tree decl)
 	 DECL_ALIGN says how the variable is to be aligned and we
 	 cannot use it to conclude anything about the alignment of
 	 the size.  */
-      address = allocate_dynamic_stack_space (size, NULL_RTX,
-					      TYPE_ALIGN (TREE_TYPE (decl)));
+      address = allocate_dynamic_stack_space_1 (size, NULL_RTX,
+						TYPE_ALIGN (TREE_TYPE (decl)),
+						MAX (TYPE_ALIGN (TREE_TYPE (decl)),
+						align));
 
       /* Reference the variable indirect through that rtx.  */
       x = gen_rtx_MEM (DECL_MODE (decl), address);

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for variables on the stack for greater than required alignment
  2007-08-28 20:08             ` Andrew Pinski
@ 2007-09-17 22:22               ` Andrew Pinski
  2007-10-09 21:20                 ` Andrew Pinski
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew Pinski @ 2007-09-17 22:22 UTC (permalink / raw)
  To: Andrew_Pinski
  Cc: trevor_smigiel, gcc-patches, Richard Henderson, Russell_Olsen

[-- Attachment #1: Type: text/plain, Size: 4071 bytes --]

Ping?

On 8/28/07, Andrew Pinski <pinskia@gmail.com> wrote:
> Ping?
>
> On 8/4/07, Andrew Pinski <pinskia@gmail.com> wrote:
> > Ping?
> >
> > On 7/2/07, Andrew Pinski <pinskia@gmail.com> wrote:
> > > Ping?
> > >
> > > On 5/18/07, Andrew_Pinski@playstation.sony.com
> > > <Andrew_Pinski@playstation.sony.com> wrote:
> > > > Trevor Smigiel/R&D/SCEA@Playstation wrote on 05/17/2007 04:49:53 PM:
> > > >
> > > > > Andrew,
> > > > >
> > > > > It seems you missed merging a change that I made in our local tree.
> > > > >
> > > > > In cfgexpand.c, you removed this code:
> > > > >
> > > > > -  /* Set alignment we actually gave this decl.  */
> > > > > -  offset -= frame_phase;
> > > > > -  align = offset & -offset;
> > > > > -  align *= BITS_PER_UNIT;
> > > > > -  if (align > STACK_BOUNDARY || align == 0)
> > > > > -    align = STACK_BOUNDARY;
> > > > > -  DECL_ALIGN (decl) = align;
> > > > > -  DECL_USER_ALIGN (decl) = 0;
> > > > >
> > > > > In our local tree that code is conditioned on
> > > > >   if (!DECL_USER_ALIGN (decl))
> > > > > (Which could be done more precisely.  Consider the case where a user
> > > > > specified alignment is smaller than the decl's actual alignment on the
> > > > > stack.)
> > > > >
> > > > > Without it less efficient code can be generated because higher
> > > > > alignments are not propagated.  At least, that's the behaviour on 4.1.1.
> > > >
> > > > Here is the new patch which fixes that problem and we get the get the
> > > > correct alignment on the variables now that we were getting a smaller
> > > > alignment on with my older version of the patch.  It adds a testcase which
> > > > verifies this is the case too (I only check on ia32/x86_64, powerpc* and
> > > > spu as those are the only targets which I know the normal alignment of
> > > > stack is and the only targets I could test the testcase on).
> > > >
> > > > OK? Bootstrapped and tested on i686-linux-gnu with no regressions.
> > > >
> > > > Thanks,
> > > > Andrew Pinski
> > > >
> > > > ChangeLog:
> > > >
> > > >         * expr.h  (allocate_dynamic_stack_space_1): New function
> > > > prototype.
> > > >         * functionc.c (assign_temp): Take into account the alignment
> > > >         of the temp if it is greater than the target's preferred
> > > > alignment.
> > > >         * cfgexpand.c: Include optabs.h.
> > > >         (get_decl_align_unit): Update comment and don't lower the
> > > > alignment
> > > >         if it is greater than the target's preferred alignment.
> > > >         (alloc_stack_frame_space): Take an unsigned HOST_WIDE_INT for
> > > > align.
> > > >         Take into account the variable's alignment if it is greater than
> > > >         the target's preferred alignment.
> > > >         (expand_one_stack_var_at):  Likewise.  Only reset the alignment of
> > > > the decl to given
> > > >         alignment if the alignment is less than the target's preferred
> > > > alignment.
> > > >         * explow.c (allocate_dynamic_stack_space_1): Split out from
> > > >         allocate_dynamic_stack_space and take into acount the required
> > > >         alignment.  Use AND opcode instead of shifting left and shifting
> > > >         back right.
> > > >         (allocate_dynamic_stack_space): Call
> > > > allocate_dynamic_stack_space_1.
> > > >         * Makefile.in (cfgexpand.o): Update dependecies.
> > > >         * stmt.c (expand_decl):  Take into account the alignment
> > > >         of the variable if it is greater than the target's preferred
> > > >         alignment.
> > > >         * config/i386/i386.c (ix86_compute_frame_layout): Don't assert
> > > >         that preferred alignment is greater than the normal preferred
> > > >         alignment.  Don't assert that the stack alignment needed is
> > > > greater
> > > >         than the normal preferred alignment.
> > > >
> > > >         * gcc.c-torture/execute/pr16660-1.c: New testcase.
> > > >         * gcc.c-torture/execute/pr16660-2.c: New testcase.
> > > >         * gcc.dg/pr16660-1.c: New testcase.
> > > >
> > > >
> > > >
> > > >
> > >
> > >
> >
> >
>
>

[-- Attachment #2: fixpr16660.diff.txt --]
[-- Type: text/plain, Size: 13936 bytes --]

Index: testsuite/gcc.c-torture/execute/pr16660-1.c
===================================================================
--- testsuite/gcc.c-torture/execute/pr16660-1.c	(revision 0)
+++ testsuite/gcc.c-torture/execute/pr16660-1.c	(revision 0)
@@ -0,0 +1,11 @@
+typedef __SIZE_TYPE__ size_t;
+#define ALIGNMENT 256
+int main(void)
+{
+  int a[ALIGNMENT/sizeof(int)] __attribute__((aligned(ALIGNMENT)));
+  size_t b = (size_t)&a;
+  if (b&(ALIGNMENT-1))
+    return 1;
+  return 0;
+}
+
Index: testsuite/gcc.c-torture/execute/pr16660-2.c
===================================================================
--- testsuite/gcc.c-torture/execute/pr16660-2.c	(revision 0)
+++ testsuite/gcc.c-torture/execute/pr16660-2.c	(revision 0)
@@ -0,0 +1,19 @@
+typedef __SIZE_TYPE__ size_t;
+#define ALIGNMENT 256
+int main(void)
+{
+  {
+    int a[ALIGNMENT/sizeof(int)] __attribute__((aligned(ALIGNMENT)));
+    size_t b = (size_t)&a;
+    if (b&(ALIGNMENT-1))
+      return 1;
+  }
+  {
+    int a1[ALIGNMENT/sizeof(int)] __attribute__((aligned(ALIGNMENT)));
+    size_t b = (size_t)&a1;
+    if (b&(ALIGNMENT-1))
+      return 1;
+  }
+  return 0;
+}
+
Index: testsuite/gcc.dg/pr16660-1.c
===================================================================
--- testsuite/gcc.dg/pr16660-1.c	(revision 0)
+++ testsuite/gcc.dg/pr16660-1.c	(revision 0)
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-rtl-expand" } */
+int g(int *);
+int f(void)
+{
+ int i __attribute__((aligned(1) ));
+ int *a = &i;
+ g(a);
+ return *a;
+}
+/* Even though the user supplied an alignment of 1, the memory
+  location for i should have the natural alignment of the stack.
+  Test only on x86, spu and powerpc for now.  */
+
+/* { dg-final { scan-rtl-dump-not "2 i\\\+0 S4 A8" "expand" } } */
+/* { dg-final { scan-rtl-dump "2 i\\\+0 S4 A32" "expand" { target i?86-*-* x86_64-*-* } } } */
+/* { dg-final { scan-rtl-dump "2 i\\\+0 S4 A64" "expand" { target { { rs6000-*-* powerpc*-*-* } && ilp32 } } } } */
+/* { dg-final { scan-rtl-dump "2 i\\\+0 S4 A128" "expand" { target { spu-*-* } || { { rs6000-*-* powerpc*-*-* } && lp64 } } } } */
+/* { dg-final { cleanup-rtl-dump "expand" } } */
Index: expr.h
===================================================================
--- expr.h	(revision 124657)
+++ expr.h	(working copy)
@@ -712,6 +712,7 @@ extern void update_nonlocal_goto_save_ar
 /* Allocate some space on the stack dynamically and return its address.  An rtx
    says how many bytes.  */
 extern rtx allocate_dynamic_stack_space (rtx, rtx, int);
+extern rtx allocate_dynamic_stack_space_1 (rtx, rtx, int, int);
 
 /* Probe a range of stack addresses from FIRST to FIRST+SIZE, inclusive.
    FIRST is a constant and size is a Pmode RTX.  These are offsets from the
Index: function.c
===================================================================
--- function.c	(revision 124657)
+++ function.c	(working copy)
@@ -813,6 +813,7 @@ assign_temp (tree type_or_decl, int keep
   if (mode == BLKmode || memory_required)
     {
       HOST_WIDE_INT size = int_size_in_bytes (type);
+      unsigned HOST_WIDE_INT align;
       rtx tmp;
 
       /* Zero sized arrays are GNU C extension.  Set size to 1 to avoid
@@ -837,7 +838,37 @@ assign_temp (tree type_or_decl, int keep
 	  size = 1;
 	}
 
-      tmp = assign_stack_temp_for_type (mode, size, keep, type);
+      align = TYPE_ALIGN (type);
+      if (decl && DECL_ALIGN (decl) > align)
+	align = DECL_ALIGN (decl);
+
+      /* When the user specifies an alignment larger than
+         BIGGEST_ALIGNMENT we need to allocate extra space so we can
+         adjust this decl's address.  */
+      if (align > PREFERRED_STACK_BOUNDARY)
+	{
+	  rtx addr;
+	  HOST_WIDE_INT pad = (align - PREFERRED_STACK_BOUNDARY) / BITS_PER_UNIT;
+	  size += pad;
+	  tmp = assign_stack_temp_for_type (mode, size, keep, type);
+	  addr = XEXP (tmp, 0);
+	  /* CEIL_DIV_EXPR needs to worry about the addition overflowing,
+	     but we know it can't.  So add ourselves and then do
+	     TRUNC_DIV_EXPR.  */
+	  addr = expand_binop (Pmode, add_optab, addr,
+			       GEN_INT (align / BITS_PER_UNIT - 1),
+			       NULL_RTX, 1, OPTAB_LIB_WIDEN);
+	  addr = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, addr,
+			        GEN_INT (align / BITS_PER_UNIT),
+			        NULL_RTX, 1);
+	  addr = expand_mult (Pmode, addr,
+			      GEN_INT (align / BITS_PER_UNIT),
+			      NULL_RTX, 1);
+	  tmp = change_address (tmp, VOIDmode, addr);
+	}
+      else
+	tmp = assign_stack_temp_for_type (mode, size, keep, type);
+
       return tmp;
     }
 
Index: cfgexpand.c
===================================================================
--- cfgexpand.c	(revision 124657)
+++ cfgexpand.c	(working copy)
@@ -41,6 +41,7 @@ Boston, MA 02110-1301, USA.  */
 #include "params.h"
 #include "tree-inline.h"
 #include "value-prof.h"
+#include "optabs.h"
 
 /* Verify that there is exactly single jump instruction since last and attach
    REG_BR_PROB note specifying probability.
@@ -151,8 +152,7 @@ static bool has_protected_decls;
    smaller than our cutoff threshold.  Used for -Wstack-protector.  */
 static bool has_short_buffer;
 
-/* Discover the byte alignment to use for DECL.  Ignore alignment
-   we can't do with expected alignment of the stack boundary.  */
+/* Discover the byte alignment to use for DECL.  */
 
 static unsigned int
 get_decl_align_unit (tree decl)
@@ -161,8 +161,6 @@ get_decl_align_unit (tree decl)
 
   align = DECL_ALIGN (decl);
   align = LOCAL_ALIGNMENT (TREE_TYPE (decl), align);
-  if (align > PREFERRED_STACK_BOUNDARY)
-    align = PREFERRED_STACK_BOUNDARY;
   if (cfun->stack_alignment_needed < align)
     cfun->stack_alignment_needed = align;
 
@@ -173,11 +171,21 @@ get_decl_align_unit (tree decl)
    Return the frame offset.  */
 
 static HOST_WIDE_INT
-alloc_stack_frame_space (HOST_WIDE_INT size, HOST_WIDE_INT align)
+alloc_stack_frame_space (HOST_WIDE_INT size, unsigned HOST_WIDE_INT align)
 {
   HOST_WIDE_INT offset, new_frame_offset;
 
   new_frame_offset = frame_offset;
+
+  /* Allocate extra space if the alignment required is greater than the
+     stack boundary and then assume the RTL expansion of the variable, gets
+     the correct alignment.  */
+  if (align > PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT)
+    {
+      size += align;
+      align = PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT;
+    }
+
   if (FRAME_GROWS_DOWNWARD)
     {
       new_frame_offset -= size + frame_phase;
@@ -523,23 +531,38 @@ dump_stack_var_partition (void)
 static void
 expand_one_stack_var_at (tree decl, HOST_WIDE_INT offset)
 {
-  HOST_WIDE_INT align;
+  unsigned HOST_WIDE_INT align;
   rtx x;
 
   /* If this fails, we've overflowed the stack frame.  Error nicely?  */
   gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
 
   x = plus_constant (virtual_stack_vars_rtx, offset);
-  x = gen_rtx_MEM (DECL_MODE (decl), x);
+  align = get_decl_align_unit (decl);
 
-  /* Set alignment we actually gave this decl.  */
-  offset -= frame_phase;
-  align = offset & -offset;
-  align *= BITS_PER_UNIT;
-  if (align > STACK_BOUNDARY || align == 0)
-    align = STACK_BOUNDARY;
-  DECL_ALIGN (decl) = align;
-  DECL_USER_ALIGN (decl) = 0;
+   if (align > PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT)
+    {
+      rtx addr = x;
+      addr = expand_binop (Pmode, add_optab, addr,
+			   GEN_INT (align - 1),
+			   NULL_RTX, 1, OPTAB_LIB_WIDEN);
+      addr = expand_and (Pmode, addr, GEN_INT (-align),
+			    NULL_RTX);
+      x = addr;
+    } 
+  else
+    {
+      /* Set alignment we actually gave this decl.  */
+      offset -= frame_phase;
+      align = offset & -offset;
+      align *= BITS_PER_UNIT;
+      if (align > STACK_BOUNDARY || align == 0)
+        align = STACK_BOUNDARY;
+      DECL_ALIGN (decl) = align;
+      DECL_USER_ALIGN (decl) = 0;
+    }
+
+  x = gen_rtx_MEM (DECL_MODE (decl), x);
 
   set_mem_attributes (x, decl, true);
   SET_DECL_RTL (decl, x);
Index: explow.c
===================================================================
--- explow.c	(revision 124657)
+++ explow.c	(working copy)
@@ -1093,16 +1093,28 @@ update_nonlocal_goto_save_area (void)
    TARGET is a place in which the address can be placed.
 
    KNOWN_ALIGN is the alignment (in bits) that we know SIZE has.  */
-
 rtx
 allocate_dynamic_stack_space (rtx size, rtx target, int known_align)
 {
+  return allocate_dynamic_stack_space_1 (size, target, known_align,
+				       BIGGEST_ALIGNMENT);
+}
+
+/* Like allocate_dynamic_stack_space expect provide a way, REQUIRED_ALGIN,
+   to say the required alignment of the address.  */
+rtx
+allocate_dynamic_stack_space_1 (rtx size, rtx target, int known_align,
+				int required_align)
+{
   /* If we're asking for zero bytes, it doesn't matter what we point
      to since we can't dereference it.  But return a reasonable
      address anyway.  */
   if (size == const0_rtx)
     return virtual_stack_dynamic_rtx;
 
+  if (required_align < BIGGEST_ALIGNMENT)
+    required_align = BIGGEST_ALIGNMENT;
+
   /* Otherwise, show we're calling alloca or equivalent.  */
   current_function_calls_alloca = 1;
 
@@ -1131,13 +1143,13 @@ allocate_dynamic_stack_space (rtx size, 
 #if defined (STACK_DYNAMIC_OFFSET) || defined (STACK_POINTER_OFFSET)
 #define MUST_ALIGN 1
 #else
-#define MUST_ALIGN (PREFERRED_STACK_BOUNDARY < BIGGEST_ALIGNMENT)
+#define MUST_ALIGN (PREFERRED_STACK_BOUNDARY < required_align)
 #endif
 
   if (MUST_ALIGN)
     size
       = force_operand (plus_constant (size,
-				      BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1),
+				      required_align / BITS_PER_UNIT - 1),
 		       NULL_RTX);
 
 #ifdef SETJMP_VIA_SAVE_AREA
@@ -1293,18 +1305,12 @@ allocate_dynamic_stack_space (rtx size, 
 
   if (MUST_ALIGN)
     {
-      /* CEIL_DIV_EXPR needs to worry about the addition overflowing,
-	 but we know it can't.  So add ourselves and then do
-	 TRUNC_DIV_EXPR.  */
       target = expand_binop (Pmode, add_optab, target,
-			     GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1),
+			     GEN_INT (required_align / BITS_PER_UNIT - 1),
 			     NULL_RTX, 1, OPTAB_LIB_WIDEN);
-      target = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, target,
-			      GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT),
-			      NULL_RTX, 1);
-      target = expand_mult (Pmode, target,
-			    GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT),
-			    NULL_RTX, 1);
+      target = expand_and (Pmode, target,
+			   GEN_INT (-required_align / BITS_PER_UNIT),
+			   NULL_RTX);
     }
 
   /* Record the new stack level for nonlocal gotos.  */
Index: Makefile.in
===================================================================
--- Makefile.in	(revision 124657)
+++ Makefile.in	(working copy)
@@ -2553,7 +2553,7 @@ cfgexpand.o : cfgexpand.c $(TREE_FLOW_H)
    $(RTL_H) $(TREE_H) $(TM_P_H) $(EXPR_H) $(FUNCTION_H) $(TIMEVAR_H) $(TM_H) \
    coretypes.h $(TREE_DUMP_H) except.h langhooks.h tree-pass.h $(RTL_H) \
    $(DIAGNOSTIC_H) toplev.h $(BASIC_BLOCK_H) $(FLAGS_H) debug.h $(PARAMS_H) \
-   value-prof.h
+   value-prof.h $(OPTABS_H)
 cfgrtl.o : cfgrtl.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
    $(FLAGS_H) insn-config.h $(BASIC_BLOCK_H) $(REGS_H) hard-reg-set.h \
    output.h toplev.h $(FUNCTION_H) except.h $(TM_P_H) insn-config.h $(EXPR_H) \
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 124657)
+++ config/i386/i386.c	(working copy)
@@ -5610,9 +5610,6 @@ ix86_compute_frame_layout (struct ix86_f
 
   gcc_assert (!size || stack_alignment_needed);
   gcc_assert (preferred_alignment >= STACK_BOUNDARY / BITS_PER_UNIT);
-  gcc_assert (preferred_alignment <= PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT);
-  gcc_assert (stack_alignment_needed
-	      <= PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT);
 
   if (stack_alignment_needed < STACK_BOUNDARY / BITS_PER_UNIT)
     stack_alignment_needed = STACK_BOUNDARY / BITS_PER_UNIT;
Index: stmt.c
===================================================================
--- stmt.c	(revision 124657)
+++ stmt.c	(working copy)
@@ -1850,6 +1850,7 @@ void
 expand_decl (tree decl)
 {
   tree type;
+  unsigned HOST_WIDE_INT align;
 
   type = TREE_TYPE (decl);
 
@@ -1874,6 +1875,13 @@ expand_decl (tree decl)
   if (TREE_STATIC (decl) || DECL_EXTERNAL (decl))
     return;
 
+  if (DECL_USER_ALIGN (decl))
+    align = DECL_ALIGN (decl);
+  else if (DECL_MODE (decl) == BLKmode)
+    align = BIGGEST_ALIGNMENT;
+  else
+    align = TYPE_ALIGN (type);
+
   /* Create the RTL representation for the variable.  */
 
   if (type == error_mark_node)
@@ -1941,10 +1949,16 @@ expand_decl (tree decl)
 	  oldaddr = XEXP (DECL_RTL (decl), 0);
 	}
 
-      /* Set alignment we actually gave this decl.  */
-      DECL_ALIGN (decl) = (DECL_MODE (decl) == BLKmode ? BIGGEST_ALIGNMENT
-			   : GET_MODE_BITSIZE (DECL_MODE (decl)));
-      DECL_USER_ALIGN (decl) = 0;
+      /* Set alignment we actually gave this decl if we don't have
+	 an user specific alignment and the alignment is less than
+	 the biggest alignment.  */
+      if (! DECL_USER_ALIGN (decl)
+	  || DECL_ALIGN (decl) <= BIGGEST_ALIGNMENT)
+        {
+	  DECL_ALIGN (decl) = (DECL_MODE (decl) == BLKmode ? BIGGEST_ALIGNMENT
+			       : GET_MODE_BITSIZE (DECL_MODE (decl)));
+	  DECL_USER_ALIGN (decl) = 0;
+	}
 
       x = assign_temp (decl, 1, 1, 1);
       set_mem_attributes (x, decl, 1);
@@ -1975,8 +1989,10 @@ expand_decl (tree decl)
 	 DECL_ALIGN says how the variable is to be aligned and we
 	 cannot use it to conclude anything about the alignment of
 	 the size.  */
-      address = allocate_dynamic_stack_space (size, NULL_RTX,
-					      TYPE_ALIGN (TREE_TYPE (decl)));
+      address = allocate_dynamic_stack_space_1 (size, NULL_RTX,
+						TYPE_ALIGN (TREE_TYPE (decl)),
+						MAX (TYPE_ALIGN (TREE_TYPE (decl)),
+						align));
 
       /* Reference the variable indirect through that rtx.  */
       x = gen_rtx_MEM (DECL_MODE (decl), address);

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for variables on the stack for greater than required alignment
  2007-09-17 22:22               ` Andrew Pinski
@ 2007-10-09 21:20                 ` Andrew Pinski
  2007-10-10  0:31                   ` H.J. Lu
                                     ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Andrew Pinski @ 2007-10-09 21:20 UTC (permalink / raw)
  To: Andrew_Pinski
  Cc: trevor_smigiel, gcc-patches, Richard Henderson, Russell_Olsen

[-- Attachment #1: Type: text/plain, Size: 4332 bytes --]

Ping?

On 9/17/07, Andrew Pinski <pinskia@gmail.com> wrote:
> Ping?
>
> On 8/28/07, Andrew Pinski <pinskia@gmail.com> wrote:
> > Ping?
> >
> > On 8/4/07, Andrew Pinski <pinskia@gmail.com> wrote:
> > > Ping?
> > >
> > > On 7/2/07, Andrew Pinski <pinskia@gmail.com> wrote:
> > > > Ping?
> > > >
> > > > On 5/18/07, Andrew_Pinski@playstation.sony.com
> > > > <Andrew_Pinski@playstation.sony.com> wrote:
> > > > > Trevor Smigiel/R&D/SCEA@Playstation wrote on 05/17/2007 04:49:53 PM:
> > > > >
> > > > > > Andrew,
> > > > > >
> > > > > > It seems you missed merging a change that I made in our local tree.
> > > > > >
> > > > > > In cfgexpand.c, you removed this code:
> > > > > >
> > > > > > -  /* Set alignment we actually gave this decl.  */
> > > > > > -  offset -= frame_phase;
> > > > > > -  align = offset & -offset;
> > > > > > -  align *= BITS_PER_UNIT;
> > > > > > -  if (align > STACK_BOUNDARY || align == 0)
> > > > > > -    align = STACK_BOUNDARY;
> > > > > > -  DECL_ALIGN (decl) = align;
> > > > > > -  DECL_USER_ALIGN (decl) = 0;
> > > > > >
> > > > > > In our local tree that code is conditioned on
> > > > > >   if (!DECL_USER_ALIGN (decl))
> > > > > > (Which could be done more precisely.  Consider the case where a user
> > > > > > specified alignment is smaller than the decl's actual alignment on the
> > > > > > stack.)
> > > > > >
> > > > > > Without it less efficient code can be generated because higher
> > > > > > alignments are not propagated.  At least, that's the behaviour on 4.1.1.
> > > > >
> > > > > Here is the new patch which fixes that problem and we get the get the
> > > > > correct alignment on the variables now that we were getting a smaller
> > > > > alignment on with my older version of the patch.  It adds a testcase which
> > > > > verifies this is the case too (I only check on ia32/x86_64, powerpc* and
> > > > > spu as those are the only targets which I know the normal alignment of
> > > > > stack is and the only targets I could test the testcase on).
> > > > >
> > > > > OK? Bootstrapped and tested on i686-linux-gnu with no regressions.
> > > > >
> > > > > Thanks,
> > > > > Andrew Pinski
> > > > >
> > > > > ChangeLog:
> > > > >
> > > > >         * expr.h  (allocate_dynamic_stack_space_1): New function
> > > > > prototype.
> > > > >         * functionc.c (assign_temp): Take into account the alignment
> > > > >         of the temp if it is greater than the target's preferred
> > > > > alignment.
> > > > >         * cfgexpand.c: Include optabs.h.
> > > > >         (get_decl_align_unit): Update comment and don't lower the
> > > > > alignment
> > > > >         if it is greater than the target's preferred alignment.
> > > > >         (alloc_stack_frame_space): Take an unsigned HOST_WIDE_INT for
> > > > > align.
> > > > >         Take into account the variable's alignment if it is greater than
> > > > >         the target's preferred alignment.
> > > > >         (expand_one_stack_var_at):  Likewise.  Only reset the alignment of
> > > > > the decl to given
> > > > >         alignment if the alignment is less than the target's preferred
> > > > > alignment.
> > > > >         * explow.c (allocate_dynamic_stack_space_1): Split out from
> > > > >         allocate_dynamic_stack_space and take into acount the required
> > > > >         alignment.  Use AND opcode instead of shifting left and shifting
> > > > >         back right.
> > > > >         (allocate_dynamic_stack_space): Call
> > > > > allocate_dynamic_stack_space_1.
> > > > >         * Makefile.in (cfgexpand.o): Update dependecies.
> > > > >         * stmt.c (expand_decl):  Take into account the alignment
> > > > >         of the variable if it is greater than the target's preferred
> > > > >         alignment.
> > > > >         * config/i386/i386.c (ix86_compute_frame_layout): Don't assert
> > > > >         that preferred alignment is greater than the normal preferred
> > > > >         alignment.  Don't assert that the stack alignment needed is
> > > > > greater
> > > > >         than the normal preferred alignment.
> > > > >
> > > > >         * gcc.c-torture/execute/pr16660-1.c: New testcase.
> > > > >         * gcc.c-torture/execute/pr16660-2.c: New testcase.
> > > > >         * gcc.dg/pr16660-1.c: New testcase.
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > >
> > >
> >
> >
>
>

[-- Attachment #2: fixpr16660.diff.txt --]
[-- Type: text/plain, Size: 13936 bytes --]

Index: testsuite/gcc.c-torture/execute/pr16660-1.c
===================================================================
--- testsuite/gcc.c-torture/execute/pr16660-1.c	(revision 0)
+++ testsuite/gcc.c-torture/execute/pr16660-1.c	(revision 0)
@@ -0,0 +1,11 @@
+typedef __SIZE_TYPE__ size_t;
+#define ALIGNMENT 256
+int main(void)
+{
+  int a[ALIGNMENT/sizeof(int)] __attribute__((aligned(ALIGNMENT)));
+  size_t b = (size_t)&a;
+  if (b&(ALIGNMENT-1))
+    return 1;
+  return 0;
+}
+
Index: testsuite/gcc.c-torture/execute/pr16660-2.c
===================================================================
--- testsuite/gcc.c-torture/execute/pr16660-2.c	(revision 0)
+++ testsuite/gcc.c-torture/execute/pr16660-2.c	(revision 0)
@@ -0,0 +1,19 @@
+typedef __SIZE_TYPE__ size_t;
+#define ALIGNMENT 256
+int main(void)
+{
+  {
+    int a[ALIGNMENT/sizeof(int)] __attribute__((aligned(ALIGNMENT)));
+    size_t b = (size_t)&a;
+    if (b&(ALIGNMENT-1))
+      return 1;
+  }
+  {
+    int a1[ALIGNMENT/sizeof(int)] __attribute__((aligned(ALIGNMENT)));
+    size_t b = (size_t)&a1;
+    if (b&(ALIGNMENT-1))
+      return 1;
+  }
+  return 0;
+}
+
Index: testsuite/gcc.dg/pr16660-1.c
===================================================================
--- testsuite/gcc.dg/pr16660-1.c	(revision 0)
+++ testsuite/gcc.dg/pr16660-1.c	(revision 0)
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-rtl-expand" } */
+int g(int *);
+int f(void)
+{
+ int i __attribute__((aligned(1) ));
+ int *a = &i;
+ g(a);
+ return *a;
+}
+/* Even though the user supplied an alignment of 1, the memory
+  location for i should have the natural alignment of the stack.
+  Test only on x86, spu and powerpc for now.  */
+
+/* { dg-final { scan-rtl-dump-not "2 i\\\+0 S4 A8" "expand" } } */
+/* { dg-final { scan-rtl-dump "2 i\\\+0 S4 A32" "expand" { target i?86-*-* x86_64-*-* } } } */
+/* { dg-final { scan-rtl-dump "2 i\\\+0 S4 A64" "expand" { target { { rs6000-*-* powerpc*-*-* } && ilp32 } } } } */
+/* { dg-final { scan-rtl-dump "2 i\\\+0 S4 A128" "expand" { target { spu-*-* } || { { rs6000-*-* powerpc*-*-* } && lp64 } } } } */
+/* { dg-final { cleanup-rtl-dump "expand" } } */
Index: expr.h
===================================================================
--- expr.h	(revision 124657)
+++ expr.h	(working copy)
@@ -712,6 +712,7 @@ extern void update_nonlocal_goto_save_ar
 /* Allocate some space on the stack dynamically and return its address.  An rtx
    says how many bytes.  */
 extern rtx allocate_dynamic_stack_space (rtx, rtx, int);
+extern rtx allocate_dynamic_stack_space_1 (rtx, rtx, int, int);
 
 /* Probe a range of stack addresses from FIRST to FIRST+SIZE, inclusive.
    FIRST is a constant and size is a Pmode RTX.  These are offsets from the
Index: function.c
===================================================================
--- function.c	(revision 124657)
+++ function.c	(working copy)
@@ -813,6 +813,7 @@ assign_temp (tree type_or_decl, int keep
   if (mode == BLKmode || memory_required)
     {
       HOST_WIDE_INT size = int_size_in_bytes (type);
+      unsigned HOST_WIDE_INT align;
       rtx tmp;
 
       /* Zero sized arrays are GNU C extension.  Set size to 1 to avoid
@@ -837,7 +838,37 @@ assign_temp (tree type_or_decl, int keep
 	  size = 1;
 	}
 
-      tmp = assign_stack_temp_for_type (mode, size, keep, type);
+      align = TYPE_ALIGN (type);
+      if (decl && DECL_ALIGN (decl) > align)
+	align = DECL_ALIGN (decl);
+
+      /* When the user specifies an alignment larger than
+         BIGGEST_ALIGNMENT we need to allocate extra space so we can
+         adjust this decl's address.  */
+      if (align > PREFERRED_STACK_BOUNDARY)
+	{
+	  rtx addr;
+	  HOST_WIDE_INT pad = (align - PREFERRED_STACK_BOUNDARY) / BITS_PER_UNIT;
+	  size += pad;
+	  tmp = assign_stack_temp_for_type (mode, size, keep, type);
+	  addr = XEXP (tmp, 0);
+	  /* CEIL_DIV_EXPR needs to worry about the addition overflowing,
+	     but we know it can't.  So add ourselves and then do
+	     TRUNC_DIV_EXPR.  */
+	  addr = expand_binop (Pmode, add_optab, addr,
+			       GEN_INT (align / BITS_PER_UNIT - 1),
+			       NULL_RTX, 1, OPTAB_LIB_WIDEN);
+	  addr = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, addr,
+			        GEN_INT (align / BITS_PER_UNIT),
+			        NULL_RTX, 1);
+	  addr = expand_mult (Pmode, addr,
+			      GEN_INT (align / BITS_PER_UNIT),
+			      NULL_RTX, 1);
+	  tmp = change_address (tmp, VOIDmode, addr);
+	}
+      else
+	tmp = assign_stack_temp_for_type (mode, size, keep, type);
+
       return tmp;
     }
 
Index: cfgexpand.c
===================================================================
--- cfgexpand.c	(revision 124657)
+++ cfgexpand.c	(working copy)
@@ -41,6 +41,7 @@ Boston, MA 02110-1301, USA.  */
 #include "params.h"
 #include "tree-inline.h"
 #include "value-prof.h"
+#include "optabs.h"
 
 /* Verify that there is exactly single jump instruction since last and attach
    REG_BR_PROB note specifying probability.
@@ -151,8 +152,7 @@ static bool has_protected_decls;
    smaller than our cutoff threshold.  Used for -Wstack-protector.  */
 static bool has_short_buffer;
 
-/* Discover the byte alignment to use for DECL.  Ignore alignment
-   we can't do with expected alignment of the stack boundary.  */
+/* Discover the byte alignment to use for DECL.  */
 
 static unsigned int
 get_decl_align_unit (tree decl)
@@ -161,8 +161,6 @@ get_decl_align_unit (tree decl)
 
   align = DECL_ALIGN (decl);
   align = LOCAL_ALIGNMENT (TREE_TYPE (decl), align);
-  if (align > PREFERRED_STACK_BOUNDARY)
-    align = PREFERRED_STACK_BOUNDARY;
   if (cfun->stack_alignment_needed < align)
     cfun->stack_alignment_needed = align;
 
@@ -173,11 +171,21 @@ get_decl_align_unit (tree decl)
    Return the frame offset.  */
 
 static HOST_WIDE_INT
-alloc_stack_frame_space (HOST_WIDE_INT size, HOST_WIDE_INT align)
+alloc_stack_frame_space (HOST_WIDE_INT size, unsigned HOST_WIDE_INT align)
 {
   HOST_WIDE_INT offset, new_frame_offset;
 
   new_frame_offset = frame_offset;
+
+  /* Allocate extra space if the alignment required is greater than the
+     stack boundary and then assume the RTL expansion of the variable, gets
+     the correct alignment.  */
+  if (align > PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT)
+    {
+      size += align;
+      align = PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT;
+    }
+
   if (FRAME_GROWS_DOWNWARD)
     {
       new_frame_offset -= size + frame_phase;
@@ -523,23 +531,38 @@ dump_stack_var_partition (void)
 static void
 expand_one_stack_var_at (tree decl, HOST_WIDE_INT offset)
 {
-  HOST_WIDE_INT align;
+  unsigned HOST_WIDE_INT align;
   rtx x;
 
   /* If this fails, we've overflowed the stack frame.  Error nicely?  */
   gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
 
   x = plus_constant (virtual_stack_vars_rtx, offset);
-  x = gen_rtx_MEM (DECL_MODE (decl), x);
+  align = get_decl_align_unit (decl);
 
-  /* Set alignment we actually gave this decl.  */
-  offset -= frame_phase;
-  align = offset & -offset;
-  align *= BITS_PER_UNIT;
-  if (align > STACK_BOUNDARY || align == 0)
-    align = STACK_BOUNDARY;
-  DECL_ALIGN (decl) = align;
-  DECL_USER_ALIGN (decl) = 0;
+   if (align > PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT)
+    {
+      rtx addr = x;
+      addr = expand_binop (Pmode, add_optab, addr,
+			   GEN_INT (align - 1),
+			   NULL_RTX, 1, OPTAB_LIB_WIDEN);
+      addr = expand_and (Pmode, addr, GEN_INT (-align),
+			    NULL_RTX);
+      x = addr;
+    } 
+  else
+    {
+      /* Set alignment we actually gave this decl.  */
+      offset -= frame_phase;
+      align = offset & -offset;
+      align *= BITS_PER_UNIT;
+      if (align > STACK_BOUNDARY || align == 0)
+        align = STACK_BOUNDARY;
+      DECL_ALIGN (decl) = align;
+      DECL_USER_ALIGN (decl) = 0;
+    }
+
+  x = gen_rtx_MEM (DECL_MODE (decl), x);
 
   set_mem_attributes (x, decl, true);
   SET_DECL_RTL (decl, x);
Index: explow.c
===================================================================
--- explow.c	(revision 124657)
+++ explow.c	(working copy)
@@ -1093,16 +1093,28 @@ update_nonlocal_goto_save_area (void)
    TARGET is a place in which the address can be placed.
 
    KNOWN_ALIGN is the alignment (in bits) that we know SIZE has.  */
-
 rtx
 allocate_dynamic_stack_space (rtx size, rtx target, int known_align)
 {
+  return allocate_dynamic_stack_space_1 (size, target, known_align,
+				       BIGGEST_ALIGNMENT);
+}
+
+/* Like allocate_dynamic_stack_space expect provide a way, REQUIRED_ALGIN,
+   to say the required alignment of the address.  */
+rtx
+allocate_dynamic_stack_space_1 (rtx size, rtx target, int known_align,
+				int required_align)
+{
   /* If we're asking for zero bytes, it doesn't matter what we point
      to since we can't dereference it.  But return a reasonable
      address anyway.  */
   if (size == const0_rtx)
     return virtual_stack_dynamic_rtx;
 
+  if (required_align < BIGGEST_ALIGNMENT)
+    required_align = BIGGEST_ALIGNMENT;
+
   /* Otherwise, show we're calling alloca or equivalent.  */
   current_function_calls_alloca = 1;
 
@@ -1131,13 +1143,13 @@ allocate_dynamic_stack_space (rtx size, 
 #if defined (STACK_DYNAMIC_OFFSET) || defined (STACK_POINTER_OFFSET)
 #define MUST_ALIGN 1
 #else
-#define MUST_ALIGN (PREFERRED_STACK_BOUNDARY < BIGGEST_ALIGNMENT)
+#define MUST_ALIGN (PREFERRED_STACK_BOUNDARY < required_align)
 #endif
 
   if (MUST_ALIGN)
     size
       = force_operand (plus_constant (size,
-				      BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1),
+				      required_align / BITS_PER_UNIT - 1),
 		       NULL_RTX);
 
 #ifdef SETJMP_VIA_SAVE_AREA
@@ -1293,18 +1305,12 @@ allocate_dynamic_stack_space (rtx size, 
 
   if (MUST_ALIGN)
     {
-      /* CEIL_DIV_EXPR needs to worry about the addition overflowing,
-	 but we know it can't.  So add ourselves and then do
-	 TRUNC_DIV_EXPR.  */
       target = expand_binop (Pmode, add_optab, target,
-			     GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1),
+			     GEN_INT (required_align / BITS_PER_UNIT - 1),
 			     NULL_RTX, 1, OPTAB_LIB_WIDEN);
-      target = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, target,
-			      GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT),
-			      NULL_RTX, 1);
-      target = expand_mult (Pmode, target,
-			    GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT),
-			    NULL_RTX, 1);
+      target = expand_and (Pmode, target,
+			   GEN_INT (-required_align / BITS_PER_UNIT),
+			   NULL_RTX);
     }
 
   /* Record the new stack level for nonlocal gotos.  */
Index: Makefile.in
===================================================================
--- Makefile.in	(revision 124657)
+++ Makefile.in	(working copy)
@@ -2553,7 +2553,7 @@ cfgexpand.o : cfgexpand.c $(TREE_FLOW_H)
    $(RTL_H) $(TREE_H) $(TM_P_H) $(EXPR_H) $(FUNCTION_H) $(TIMEVAR_H) $(TM_H) \
    coretypes.h $(TREE_DUMP_H) except.h langhooks.h tree-pass.h $(RTL_H) \
    $(DIAGNOSTIC_H) toplev.h $(BASIC_BLOCK_H) $(FLAGS_H) debug.h $(PARAMS_H) \
-   value-prof.h
+   value-prof.h $(OPTABS_H)
 cfgrtl.o : cfgrtl.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
    $(FLAGS_H) insn-config.h $(BASIC_BLOCK_H) $(REGS_H) hard-reg-set.h \
    output.h toplev.h $(FUNCTION_H) except.h $(TM_P_H) insn-config.h $(EXPR_H) \
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 124657)
+++ config/i386/i386.c	(working copy)
@@ -5610,9 +5610,6 @@ ix86_compute_frame_layout (struct ix86_f
 
   gcc_assert (!size || stack_alignment_needed);
   gcc_assert (preferred_alignment >= STACK_BOUNDARY / BITS_PER_UNIT);
-  gcc_assert (preferred_alignment <= PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT);
-  gcc_assert (stack_alignment_needed
-	      <= PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT);
 
   if (stack_alignment_needed < STACK_BOUNDARY / BITS_PER_UNIT)
     stack_alignment_needed = STACK_BOUNDARY / BITS_PER_UNIT;
Index: stmt.c
===================================================================
--- stmt.c	(revision 124657)
+++ stmt.c	(working copy)
@@ -1850,6 +1850,7 @@ void
 expand_decl (tree decl)
 {
   tree type;
+  unsigned HOST_WIDE_INT align;
 
   type = TREE_TYPE (decl);
 
@@ -1874,6 +1875,13 @@ expand_decl (tree decl)
   if (TREE_STATIC (decl) || DECL_EXTERNAL (decl))
     return;
 
+  if (DECL_USER_ALIGN (decl))
+    align = DECL_ALIGN (decl);
+  else if (DECL_MODE (decl) == BLKmode)
+    align = BIGGEST_ALIGNMENT;
+  else
+    align = TYPE_ALIGN (type);
+
   /* Create the RTL representation for the variable.  */
 
   if (type == error_mark_node)
@@ -1941,10 +1949,16 @@ expand_decl (tree decl)
 	  oldaddr = XEXP (DECL_RTL (decl), 0);
 	}
 
-      /* Set alignment we actually gave this decl.  */
-      DECL_ALIGN (decl) = (DECL_MODE (decl) == BLKmode ? BIGGEST_ALIGNMENT
-			   : GET_MODE_BITSIZE (DECL_MODE (decl)));
-      DECL_USER_ALIGN (decl) = 0;
+      /* Set alignment we actually gave this decl if we don't have
+	 an user specific alignment and the alignment is less than
+	 the biggest alignment.  */
+      if (! DECL_USER_ALIGN (decl)
+	  || DECL_ALIGN (decl) <= BIGGEST_ALIGNMENT)
+        {
+	  DECL_ALIGN (decl) = (DECL_MODE (decl) == BLKmode ? BIGGEST_ALIGNMENT
+			       : GET_MODE_BITSIZE (DECL_MODE (decl)));
+	  DECL_USER_ALIGN (decl) = 0;
+	}
 
       x = assign_temp (decl, 1, 1, 1);
       set_mem_attributes (x, decl, 1);
@@ -1975,8 +1989,10 @@ expand_decl (tree decl)
 	 DECL_ALIGN says how the variable is to be aligned and we
 	 cannot use it to conclude anything about the alignment of
 	 the size.  */
-      address = allocate_dynamic_stack_space (size, NULL_RTX,
-					      TYPE_ALIGN (TREE_TYPE (decl)));
+      address = allocate_dynamic_stack_space_1 (size, NULL_RTX,
+						TYPE_ALIGN (TREE_TYPE (decl)),
+						MAX (TYPE_ALIGN (TREE_TYPE (decl)),
+						align));
 
       /* Reference the variable indirect through that rtx.  */
       x = gen_rtx_MEM (DECL_MODE (decl), address);

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for  variables on the stack for greater than required alignment
  2007-10-09 21:20                 ` Andrew Pinski
@ 2007-10-10  0:31                   ` H.J. Lu
  2007-10-10  0:38                     ` Andrew_Pinski
  2007-10-10  1:07                   ` [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for variables on the stack for greater than required alignment Dorit Nuzman
  2008-01-24 14:52                   ` Andrew Pinski
  2 siblings, 1 reply; 27+ messages in thread
From: H.J. Lu @ 2007-10-10  0:31 UTC (permalink / raw)
  To: Andrew Pinski
  Cc: Andrew_Pinski, trevor_smigiel, gcc-patches, Richard Henderson,
	Russell_Olsen

This patch doesn't solve the stack alignment problem on x86 where gcc
assumes the stack is 16byte aligned and ia32 psABI only specifies 4
byte alignment. To make gcc to conform ia32 psABI, we need to
align the stack when >4 byte alignment is needed for local variable
as well as register spill. If we align the stack, this patch isn't
needed for x86.

For some backends like x86, we can use the frame pointer to store the
orginal stack pointer and align the stack pointer when necessary. I'd
like to create a gcc 4.4 project to make gcc to conform ia32 psABI by
aligning the stack pointer only when needed.


H.J.
----
On Tue, Oct 09, 2007 at 02:20:26PM -0700, Andrew Pinski wrote:
> Ping?
> 
> On 9/17/07, Andrew Pinski <pinskia@gmail.com> wrote:
> > Ping?
> >
> > On 8/28/07, Andrew Pinski <pinskia@gmail.com> wrote:
> > > Ping?
> > >
> > > On 8/4/07, Andrew Pinski <pinskia@gmail.com> wrote:
> > > > Ping?
> > > >
> > > > On 7/2/07, Andrew Pinski <pinskia@gmail.com> wrote:
> > > > > Ping?
> > > > >
> > > > > On 5/18/07, Andrew_Pinski@playstation.sony.com
> > > > > <Andrew_Pinski@playstation.sony.com> wrote:
> > > > > > Trevor Smigiel/R&D/SCEA@Playstation wrote on 05/17/2007 04:49:53 PM:
> > > > > >
> > > > > > > Andrew,
> > > > > > >
> > > > > > > It seems you missed merging a change that I made in our local tree.
> > > > > > >
> > > > > > > In cfgexpand.c, you removed this code:
> > > > > > >
> > > > > > > -  /* Set alignment we actually gave this decl.  */
> > > > > > > -  offset -= frame_phase;
> > > > > > > -  align = offset & -offset;
> > > > > > > -  align *= BITS_PER_UNIT;
> > > > > > > -  if (align > STACK_BOUNDARY || align == 0)
> > > > > > > -    align = STACK_BOUNDARY;
> > > > > > > -  DECL_ALIGN (decl) = align;
> > > > > > > -  DECL_USER_ALIGN (decl) = 0;
> > > > > > >
> > > > > > > In our local tree that code is conditioned on
> > > > > > >   if (!DECL_USER_ALIGN (decl))
> > > > > > > (Which could be done more precisely.  Consider the case where a user
> > > > > > > specified alignment is smaller than the decl's actual alignment on the
> > > > > > > stack.)
> > > > > > >
> > > > > > > Without it less efficient code can be generated because higher
> > > > > > > alignments are not propagated.  At least, that's the behaviour on 4.1.1.
> > > > > >
> > > > > > Here is the new patch which fixes that problem and we get the get the
> > > > > > correct alignment on the variables now that we were getting a smaller
> > > > > > alignment on with my older version of the patch.  It adds a testcase which
> > > > > > verifies this is the case too (I only check on ia32/x86_64, powerpc* and
> > > > > > spu as those are the only targets which I know the normal alignment of
> > > > > > stack is and the only targets I could test the testcase on).
> > > > > >
> > > > > > OK? Bootstrapped and tested on i686-linux-gnu with no regressions.
> > > > > >
> > > > > > Thanks,
> > > > > > Andrew Pinski
> > > > > >
> > > > > > ChangeLog:
> > > > > >
> > > > > >         * expr.h  (allocate_dynamic_stack_space_1): New function
> > > > > > prototype.
> > > > > >         * functionc.c (assign_temp): Take into account the alignment
> > > > > >         of the temp if it is greater than the target's preferred
> > > > > > alignment.
> > > > > >         * cfgexpand.c: Include optabs.h.
> > > > > >         (get_decl_align_unit): Update comment and don't lower the
> > > > > > alignment
> > > > > >         if it is greater than the target's preferred alignment.
> > > > > >         (alloc_stack_frame_space): Take an unsigned HOST_WIDE_INT for
> > > > > > align.
> > > > > >         Take into account the variable's alignment if it is greater than
> > > > > >         the target's preferred alignment.
> > > > > >         (expand_one_stack_var_at):  Likewise.  Only reset the alignment of
> > > > > > the decl to given
> > > > > >         alignment if the alignment is less than the target's preferred
> > > > > > alignment.
> > > > > >         * explow.c (allocate_dynamic_stack_space_1): Split out from
> > > > > >         allocate_dynamic_stack_space and take into acount the required
> > > > > >         alignment.  Use AND opcode instead of shifting left and shifting
> > > > > >         back right.
> > > > > >         (allocate_dynamic_stack_space): Call
> > > > > > allocate_dynamic_stack_space_1.
> > > > > >         * Makefile.in (cfgexpand.o): Update dependecies.
> > > > > >         * stmt.c (expand_decl):  Take into account the alignment
> > > > > >         of the variable if it is greater than the target's preferred
> > > > > >         alignment.
> > > > > >         * config/i386/i386.c (ix86_compute_frame_layout): Don't assert
> > > > > >         that preferred alignment is greater than the normal preferred
> > > > > >         alignment.  Don't assert that the stack alignment needed is
> > > > > > greater
> > > > > >         than the normal preferred alignment.
> > > > > >
> > > > > >         * gcc.c-torture/execute/pr16660-1.c: New testcase.
> > > > > >         * gcc.c-torture/execute/pr16660-2.c: New testcase.
> > > > > >         * gcc.dg/pr16660-1.c: New testcase.
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > > >
> > >
> > >
> >
> >

> Index: testsuite/gcc.c-torture/execute/pr16660-1.c
> ===================================================================
> --- testsuite/gcc.c-torture/execute/pr16660-1.c	(revision 0)
> +++ testsuite/gcc.c-torture/execute/pr16660-1.c	(revision 0)
> @@ -0,0 +1,11 @@
> +typedef __SIZE_TYPE__ size_t;
> +#define ALIGNMENT 256
> +int main(void)
> +{
> +  int a[ALIGNMENT/sizeof(int)] __attribute__((aligned(ALIGNMENT)));
> +  size_t b = (size_t)&a;
> +  if (b&(ALIGNMENT-1))
> +    return 1;
> +  return 0;
> +}
> +
> Index: testsuite/gcc.c-torture/execute/pr16660-2.c
> ===================================================================
> --- testsuite/gcc.c-torture/execute/pr16660-2.c	(revision 0)
> +++ testsuite/gcc.c-torture/execute/pr16660-2.c	(revision 0)
> @@ -0,0 +1,19 @@
> +typedef __SIZE_TYPE__ size_t;
> +#define ALIGNMENT 256
> +int main(void)
> +{
> +  {
> +    int a[ALIGNMENT/sizeof(int)] __attribute__((aligned(ALIGNMENT)));
> +    size_t b = (size_t)&a;
> +    if (b&(ALIGNMENT-1))
> +      return 1;
> +  }
> +  {
> +    int a1[ALIGNMENT/sizeof(int)] __attribute__((aligned(ALIGNMENT)));
> +    size_t b = (size_t)&a1;
> +    if (b&(ALIGNMENT-1))
> +      return 1;
> +  }
> +  return 0;
> +}
> +
> Index: testsuite/gcc.dg/pr16660-1.c
> ===================================================================
> --- testsuite/gcc.dg/pr16660-1.c	(revision 0)
> +++ testsuite/gcc.dg/pr16660-1.c	(revision 0)
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-rtl-expand" } */
> +int g(int *);
> +int f(void)
> +{
> + int i __attribute__((aligned(1) ));
> + int *a = &i;
> + g(a);
> + return *a;
> +}
> +/* Even though the user supplied an alignment of 1, the memory
> +  location for i should have the natural alignment of the stack.
> +  Test only on x86, spu and powerpc for now.  */
> +
> +/* { dg-final { scan-rtl-dump-not "2 i\\\+0 S4 A8" "expand" } } */
> +/* { dg-final { scan-rtl-dump "2 i\\\+0 S4 A32" "expand" { target i?86-*-* x86_64-*-* } } } */
> +/* { dg-final { scan-rtl-dump "2 i\\\+0 S4 A64" "expand" { target { { rs6000-*-* powerpc*-*-* } && ilp32 } } } } */
> +/* { dg-final { scan-rtl-dump "2 i\\\+0 S4 A128" "expand" { target { spu-*-* } || { { rs6000-*-* powerpc*-*-* } && lp64 } } } } */
> +/* { dg-final { cleanup-rtl-dump "expand" } } */
> Index: expr.h
> ===================================================================
> --- expr.h	(revision 124657)
> +++ expr.h	(working copy)
> @@ -712,6 +712,7 @@ extern void update_nonlocal_goto_save_ar
>  /* Allocate some space on the stack dynamically and return its address.  An rtx
>     says how many bytes.  */
>  extern rtx allocate_dynamic_stack_space (rtx, rtx, int);
> +extern rtx allocate_dynamic_stack_space_1 (rtx, rtx, int, int);
>  
>  /* Probe a range of stack addresses from FIRST to FIRST+SIZE, inclusive.
>     FIRST is a constant and size is a Pmode RTX.  These are offsets from the
> Index: function.c
> ===================================================================
> --- function.c	(revision 124657)
> +++ function.c	(working copy)
> @@ -813,6 +813,7 @@ assign_temp (tree type_or_decl, int keep
>    if (mode == BLKmode || memory_required)
>      {
>        HOST_WIDE_INT size = int_size_in_bytes (type);
> +      unsigned HOST_WIDE_INT align;
>        rtx tmp;
>  
>        /* Zero sized arrays are GNU C extension.  Set size to 1 to avoid
> @@ -837,7 +838,37 @@ assign_temp (tree type_or_decl, int keep
>  	  size = 1;
>  	}
>  
> -      tmp = assign_stack_temp_for_type (mode, size, keep, type);
> +      align = TYPE_ALIGN (type);
> +      if (decl && DECL_ALIGN (decl) > align)
> +	align = DECL_ALIGN (decl);
> +
> +      /* When the user specifies an alignment larger than
> +         BIGGEST_ALIGNMENT we need to allocate extra space so we can
> +         adjust this decl's address.  */
> +      if (align > PREFERRED_STACK_BOUNDARY)
> +	{
> +	  rtx addr;
> +	  HOST_WIDE_INT pad = (align - PREFERRED_STACK_BOUNDARY) / BITS_PER_UNIT;
> +	  size += pad;
> +	  tmp = assign_stack_temp_for_type (mode, size, keep, type);
> +	  addr = XEXP (tmp, 0);
> +	  /* CEIL_DIV_EXPR needs to worry about the addition overflowing,
> +	     but we know it can't.  So add ourselves and then do
> +	     TRUNC_DIV_EXPR.  */
> +	  addr = expand_binop (Pmode, add_optab, addr,
> +			       GEN_INT (align / BITS_PER_UNIT - 1),
> +			       NULL_RTX, 1, OPTAB_LIB_WIDEN);
> +	  addr = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, addr,
> +			        GEN_INT (align / BITS_PER_UNIT),
> +			        NULL_RTX, 1);
> +	  addr = expand_mult (Pmode, addr,
> +			      GEN_INT (align / BITS_PER_UNIT),
> +			      NULL_RTX, 1);
> +	  tmp = change_address (tmp, VOIDmode, addr);
> +	}
> +      else
> +	tmp = assign_stack_temp_for_type (mode, size, keep, type);
> +
>        return tmp;
>      }
>  
> Index: cfgexpand.c
> ===================================================================
> --- cfgexpand.c	(revision 124657)
> +++ cfgexpand.c	(working copy)
> @@ -41,6 +41,7 @@ Boston, MA 02110-1301, USA.  */
>  #include "params.h"
>  #include "tree-inline.h"
>  #include "value-prof.h"
> +#include "optabs.h"
>  
>  /* Verify that there is exactly single jump instruction since last and attach
>     REG_BR_PROB note specifying probability.
> @@ -151,8 +152,7 @@ static bool has_protected_decls;
>     smaller than our cutoff threshold.  Used for -Wstack-protector.  */
>  static bool has_short_buffer;
>  
> -/* Discover the byte alignment to use for DECL.  Ignore alignment
> -   we can't do with expected alignment of the stack boundary.  */
> +/* Discover the byte alignment to use for DECL.  */
>  
>  static unsigned int
>  get_decl_align_unit (tree decl)
> @@ -161,8 +161,6 @@ get_decl_align_unit (tree decl)
>  
>    align = DECL_ALIGN (decl);
>    align = LOCAL_ALIGNMENT (TREE_TYPE (decl), align);
> -  if (align > PREFERRED_STACK_BOUNDARY)
> -    align = PREFERRED_STACK_BOUNDARY;
>    if (cfun->stack_alignment_needed < align)
>      cfun->stack_alignment_needed = align;
>  
> @@ -173,11 +171,21 @@ get_decl_align_unit (tree decl)
>     Return the frame offset.  */
>  
>  static HOST_WIDE_INT
> -alloc_stack_frame_space (HOST_WIDE_INT size, HOST_WIDE_INT align)
> +alloc_stack_frame_space (HOST_WIDE_INT size, unsigned HOST_WIDE_INT align)
>  {
>    HOST_WIDE_INT offset, new_frame_offset;
>  
>    new_frame_offset = frame_offset;
> +
> +  /* Allocate extra space if the alignment required is greater than the
> +     stack boundary and then assume the RTL expansion of the variable, gets
> +     the correct alignment.  */
> +  if (align > PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT)
> +    {
> +      size += align;
> +      align = PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT;
> +    }
> +
>    if (FRAME_GROWS_DOWNWARD)
>      {
>        new_frame_offset -= size + frame_phase;
> @@ -523,23 +531,38 @@ dump_stack_var_partition (void)
>  static void
>  expand_one_stack_var_at (tree decl, HOST_WIDE_INT offset)
>  {
> -  HOST_WIDE_INT align;
> +  unsigned HOST_WIDE_INT align;
>    rtx x;
>  
>    /* If this fails, we've overflowed the stack frame.  Error nicely?  */
>    gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
>  
>    x = plus_constant (virtual_stack_vars_rtx, offset);
> -  x = gen_rtx_MEM (DECL_MODE (decl), x);
> +  align = get_decl_align_unit (decl);
>  
> -  /* Set alignment we actually gave this decl.  */
> -  offset -= frame_phase;
> -  align = offset & -offset;
> -  align *= BITS_PER_UNIT;
> -  if (align > STACK_BOUNDARY || align == 0)
> -    align = STACK_BOUNDARY;
> -  DECL_ALIGN (decl) = align;
> -  DECL_USER_ALIGN (decl) = 0;
> +   if (align > PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT)
> +    {
> +      rtx addr = x;
> +      addr = expand_binop (Pmode, add_optab, addr,
> +			   GEN_INT (align - 1),
> +			   NULL_RTX, 1, OPTAB_LIB_WIDEN);
> +      addr = expand_and (Pmode, addr, GEN_INT (-align),
> +			    NULL_RTX);
> +      x = addr;
> +    } 
> +  else
> +    {
> +      /* Set alignment we actually gave this decl.  */
> +      offset -= frame_phase;
> +      align = offset & -offset;
> +      align *= BITS_PER_UNIT;
> +      if (align > STACK_BOUNDARY || align == 0)
> +        align = STACK_BOUNDARY;
> +      DECL_ALIGN (decl) = align;
> +      DECL_USER_ALIGN (decl) = 0;
> +    }
> +
> +  x = gen_rtx_MEM (DECL_MODE (decl), x);
>  
>    set_mem_attributes (x, decl, true);
>    SET_DECL_RTL (decl, x);
> Index: explow.c
> ===================================================================
> --- explow.c	(revision 124657)
> +++ explow.c	(working copy)
> @@ -1093,16 +1093,28 @@ update_nonlocal_goto_save_area (void)
>     TARGET is a place in which the address can be placed.
>  
>     KNOWN_ALIGN is the alignment (in bits) that we know SIZE has.  */
> -
>  rtx
>  allocate_dynamic_stack_space (rtx size, rtx target, int known_align)
>  {
> +  return allocate_dynamic_stack_space_1 (size, target, known_align,
> +				       BIGGEST_ALIGNMENT);
> +}
> +
> +/* Like allocate_dynamic_stack_space expect provide a way, REQUIRED_ALGIN,
> +   to say the required alignment of the address.  */
> +rtx
> +allocate_dynamic_stack_space_1 (rtx size, rtx target, int known_align,
> +				int required_align)
> +{
>    /* If we're asking for zero bytes, it doesn't matter what we point
>       to since we can't dereference it.  But return a reasonable
>       address anyway.  */
>    if (size == const0_rtx)
>      return virtual_stack_dynamic_rtx;
>  
> +  if (required_align < BIGGEST_ALIGNMENT)
> +    required_align = BIGGEST_ALIGNMENT;
> +
>    /* Otherwise, show we're calling alloca or equivalent.  */
>    current_function_calls_alloca = 1;
>  
> @@ -1131,13 +1143,13 @@ allocate_dynamic_stack_space (rtx size, 
>  #if defined (STACK_DYNAMIC_OFFSET) || defined (STACK_POINTER_OFFSET)
>  #define MUST_ALIGN 1
>  #else
> -#define MUST_ALIGN (PREFERRED_STACK_BOUNDARY < BIGGEST_ALIGNMENT)
> +#define MUST_ALIGN (PREFERRED_STACK_BOUNDARY < required_align)
>  #endif
>  
>    if (MUST_ALIGN)
>      size
>        = force_operand (plus_constant (size,
> -				      BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1),
> +				      required_align / BITS_PER_UNIT - 1),
>  		       NULL_RTX);
>  
>  #ifdef SETJMP_VIA_SAVE_AREA
> @@ -1293,18 +1305,12 @@ allocate_dynamic_stack_space (rtx size, 
>  
>    if (MUST_ALIGN)
>      {
> -      /* CEIL_DIV_EXPR needs to worry about the addition overflowing,
> -	 but we know it can't.  So add ourselves and then do
> -	 TRUNC_DIV_EXPR.  */
>        target = expand_binop (Pmode, add_optab, target,
> -			     GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1),
> +			     GEN_INT (required_align / BITS_PER_UNIT - 1),
>  			     NULL_RTX, 1, OPTAB_LIB_WIDEN);
> -      target = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, target,
> -			      GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT),
> -			      NULL_RTX, 1);
> -      target = expand_mult (Pmode, target,
> -			    GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT),
> -			    NULL_RTX, 1);
> +      target = expand_and (Pmode, target,
> +			   GEN_INT (-required_align / BITS_PER_UNIT),
> +			   NULL_RTX);
>      }
>  
>    /* Record the new stack level for nonlocal gotos.  */
> Index: Makefile.in
> ===================================================================
> --- Makefile.in	(revision 124657)
> +++ Makefile.in	(working copy)
> @@ -2553,7 +2553,7 @@ cfgexpand.o : cfgexpand.c $(TREE_FLOW_H)
>     $(RTL_H) $(TREE_H) $(TM_P_H) $(EXPR_H) $(FUNCTION_H) $(TIMEVAR_H) $(TM_H) \
>     coretypes.h $(TREE_DUMP_H) except.h langhooks.h tree-pass.h $(RTL_H) \
>     $(DIAGNOSTIC_H) toplev.h $(BASIC_BLOCK_H) $(FLAGS_H) debug.h $(PARAMS_H) \
> -   value-prof.h
> +   value-prof.h $(OPTABS_H)
>  cfgrtl.o : cfgrtl.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
>     $(FLAGS_H) insn-config.h $(BASIC_BLOCK_H) $(REGS_H) hard-reg-set.h \
>     output.h toplev.h $(FUNCTION_H) except.h $(TM_P_H) insn-config.h $(EXPR_H) \
> Index: config/i386/i386.c
> ===================================================================
> --- config/i386/i386.c	(revision 124657)
> +++ config/i386/i386.c	(working copy)
> @@ -5610,9 +5610,6 @@ ix86_compute_frame_layout (struct ix86_f
>  
>    gcc_assert (!size || stack_alignment_needed);
>    gcc_assert (preferred_alignment >= STACK_BOUNDARY / BITS_PER_UNIT);
> -  gcc_assert (preferred_alignment <= PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT);
> -  gcc_assert (stack_alignment_needed
> -	      <= PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT);
>  
>    if (stack_alignment_needed < STACK_BOUNDARY / BITS_PER_UNIT)
>      stack_alignment_needed = STACK_BOUNDARY / BITS_PER_UNIT;
> Index: stmt.c
> ===================================================================
> --- stmt.c	(revision 124657)
> +++ stmt.c	(working copy)
> @@ -1850,6 +1850,7 @@ void
>  expand_decl (tree decl)
>  {
>    tree type;
> +  unsigned HOST_WIDE_INT align;
>  
>    type = TREE_TYPE (decl);
>  
> @@ -1874,6 +1875,13 @@ expand_decl (tree decl)
>    if (TREE_STATIC (decl) || DECL_EXTERNAL (decl))
>      return;
>  
> +  if (DECL_USER_ALIGN (decl))
> +    align = DECL_ALIGN (decl);
> +  else if (DECL_MODE (decl) == BLKmode)
> +    align = BIGGEST_ALIGNMENT;
> +  else
> +    align = TYPE_ALIGN (type);
> +
>    /* Create the RTL representation for the variable.  */
>  
>    if (type == error_mark_node)
> @@ -1941,10 +1949,16 @@ expand_decl (tree decl)
>  	  oldaddr = XEXP (DECL_RTL (decl), 0);
>  	}
>  
> -      /* Set alignment we actually gave this decl.  */
> -      DECL_ALIGN (decl) = (DECL_MODE (decl) == BLKmode ? BIGGEST_ALIGNMENT
> -			   : GET_MODE_BITSIZE (DECL_MODE (decl)));
> -      DECL_USER_ALIGN (decl) = 0;
> +      /* Set alignment we actually gave this decl if we don't have
> +	 an user specific alignment and the alignment is less than
> +	 the biggest alignment.  */
> +      if (! DECL_USER_ALIGN (decl)
> +	  || DECL_ALIGN (decl) <= BIGGEST_ALIGNMENT)
> +        {
> +	  DECL_ALIGN (decl) = (DECL_MODE (decl) == BLKmode ? BIGGEST_ALIGNMENT
> +			       : GET_MODE_BITSIZE (DECL_MODE (decl)));
> +	  DECL_USER_ALIGN (decl) = 0;
> +	}
>  
>        x = assign_temp (decl, 1, 1, 1);
>        set_mem_attributes (x, decl, 1);
> @@ -1975,8 +1989,10 @@ expand_decl (tree decl)
>  	 DECL_ALIGN says how the variable is to be aligned and we
>  	 cannot use it to conclude anything about the alignment of
>  	 the size.  */
> -      address = allocate_dynamic_stack_space (size, NULL_RTX,
> -					      TYPE_ALIGN (TREE_TYPE (decl)));
> +      address = allocate_dynamic_stack_space_1 (size, NULL_RTX,
> +						TYPE_ALIGN (TREE_TYPE (decl)),
> +						MAX (TYPE_ALIGN (TREE_TYPE (decl)),
> +						align));
>  
>        /* Reference the variable indirect through that rtx.  */
>        x = gen_rtx_MEM (DECL_MODE (decl), address);

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for variables  on the stack for greater than required alignment
  2007-10-10  0:31                   ` H.J. Lu
@ 2007-10-10  0:38                     ` Andrew_Pinski
  2007-10-10  2:17                       ` H.J. Lu
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew_Pinski @ 2007-10-10  0:38 UTC (permalink / raw)
  To: H.J. Lu
  Cc: gcc-patches, Andrew Pinski, Richard Henderson, Russell_Olsen,
	trevor_smigiel

"H.J. Lu" <hjl@lucon.org> wrote on 10/09/2007 05:31:30 PM:

> This patch doesn't solve the stack alignment problem on x86 where gcc
> assumes the stack is 16byte aligned and ia32 psABI only specifies 4
> byte alignment. To make gcc to conform ia32 psABI, we need to
> align the stack when >4 byte alignment is needed for local variable
> as well as register spill. If we align the stack, this patch isn't
> needed for x86.

Again how many times do I have to mention, the proposal above does not 
help out when you need a bigger alignment than what is required than 
16byte alignment.  If you take a look at the testcases which are added, 
you will notice that I am testing for 256byte alignment.  For the Cell, we 
need 128byte alignment for the DMA transfers between the PowerPC and the 
SPUs.  This patch is designed to fix that case and not really any other 
case.  If you want to fix a different case, then go ahead fix that case a 
different way, this patch will be still needed to fix the case where you 
need a "big" alignment.

Thanks,
Andrew Pinski

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for variables  on the stack for greater than required alignment
  2007-10-09 21:20                 ` Andrew Pinski
  2007-10-10  0:31                   ` H.J. Lu
@ 2007-10-10  1:07                   ` Dorit Nuzman
  2008-01-24 14:52                   ` Andrew Pinski
  2 siblings, 0 replies; 27+ messages in thread
From: Dorit Nuzman @ 2007-10-10  1:07 UTC (permalink / raw)
  To: Andrew Pinski
  Cc: Andrew_Pinski, gcc-patches, Richard Henderson, Russell_Olsen,
	trevor_smigiel

Andrew,
Will this patch solve PR32893? (please see
http://gcc.gnu.org/ml/gcc-patches/2007-10/msg00202.html)

thanks,
dorit

> Ping?
>
> On 9/17/07, Andrew Pinski <pinskia@gmail.com> wrote:
> > Ping?
> >
> > On 8/28/07, Andrew Pinski <pinskia@gmail.com> wrote:
> > > Ping?
> > >
> > > On 8/4/07, Andrew Pinski <pinskia@gmail.com> wrote:
> > > > Ping?
> > > >
> > > > On 7/2/07, Andrew Pinski <pinskia@gmail.com> wrote:
> > > > > Ping?
> > > > >
> > > > > On 5/18/07, Andrew_Pinski@playstation.sony.com
> > > > > <Andrew_Pinski@playstation.sony.com> wrote:
> > > > > > Trevor Smigiel/R&D/SCEA@Playstation wrote on 05/17/2007
04:49:53 PM:
> > > > > >
> > > > > > > Andrew,
> > > > > > >
> > > > > > > It seems you missed merging a change that I made in our
> local tree.
> > > > > > >
> > > > > > > In cfgexpand.c, you removed this code:
> > > > > > >
> > > > > > > -  /* Set alignment we actually gave this decl.  */
> > > > > > > -  offset -= frame_phase;
> > > > > > > -  align = offset & -offset;
> > > > > > > -  align *= BITS_PER_UNIT;
> > > > > > > -  if (align > STACK_BOUNDARY || align == 0)
> > > > > > > -    align = STACK_BOUNDARY;
> > > > > > > -  DECL_ALIGN (decl) = align;
> > > > > > > -  DECL_USER_ALIGN (decl) = 0;
> > > > > > >
> > > > > > > In our local tree that code is conditioned on
> > > > > > >   if (!DECL_USER_ALIGN (decl))
> > > > > > > (Which could be done more precisely.  Consider the case
> where a user
> > > > > > > specified alignment is smaller than the decl's actual
> alignment on the
> > > > > > > stack.)
> > > > > > >
> > > > > > > Without it less efficient code can be generated because
higher
> > > > > > > alignments are not propagated.  At least, that's the
> behaviour on 4.1.1.
> > > > > >
> > > > > > Here is the new patch which fixes that problem and we get
> the get the
> > > > > > correct alignment on the variables now that we were
> getting a smaller
> > > > > > alignment on with my older version of the patch.  It adds
> a testcase which
> > > > > > verifies this is the case too (I only check on
> ia32/x86_64, powerpc* and
> > > > > > spu as those are the only targets which I know the normal
> alignment of
> > > > > > stack is and the only targets I could test the testcase on).
> > > > > >
> > > > > > OK? Bootstrapped and tested on i686-linux-gnu with no
regressions.
> > > > > >
> > > > > > Thanks,
> > > > > > Andrew Pinski
> > > > > >
> > > > > > ChangeLog:
> > > > > >
> > > > > >         * expr.h  (allocate_dynamic_stack_space_1): New
function
> > > > > > prototype.
> > > > > >         * functionc.c (assign_temp): Take into account
thealignment
> > > > > >         of the temp if it is greater than the target's
preferred
> > > > > > alignment.
> > > > > >         * cfgexpand.c: Include optabs.h.
> > > > > >         (get_decl_align_unit): Update comment and don't lower
the
> > > > > > alignment
> > > > > >         if it is greater than the target's preferred alignment.
> > > > > >         (alloc_stack_frame_space): Take an unsigned
> HOST_WIDE_INT for
> > > > > > align.
> > > > > >         Take into account the variable's alignment if it
> is greater than
> > > > > >         the target's preferred alignment.
> > > > > >         (expand_one_stack_var_at):  Likewise.  Only reset
> the alignment of
> > > > > > the decl to given
> > > > > >         alignment if the alignment is less than the
> target's preferred
> > > > > > alignment.
> > > > > >         * explow.c (allocate_dynamic_stack_space_1): Split out
from
> > > > > >         allocate_dynamic_stack_space and take into acount
> the required
> > > > > >         alignment.  Use AND opcode instead of shifting
> left and shifting
> > > > > >         back right.
> > > > > >         (allocate_dynamic_stack_space): Call
> > > > > > allocate_dynamic_stack_space_1.
> > > > > >         * Makefile.in (cfgexpand.o): Update dependecies.
> > > > > >         * stmt.c (expand_decl):  Take into account the
alignment
> > > > > >         of the variable if it is greater than the
target'spreferred
> > > > > >         alignment.
> > > > > >         * config/i386/i386.c (ix86_compute_frame_layout):
> Don't assert
> > > > > >         that preferred alignment is greater than the
> normal preferred
> > > > > >         alignment.  Don't assert that the stack alignment
needed is
> > > > > > greater
> > > > > >         than the normal preferred alignment.
> > > > > >
> > > > > >         * gcc.c-torture/execute/pr16660-1.c: New testcase.
> > > > > >         * gcc.c-torture/execute/pr16660-2.c: New testcase.
> > > > > >         * gcc.dg/pr16660-1.c: New testcase.
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > > >
> > >
> > >
> >
> >
> [attachment "fixpr16660.diff.txt" deleted by Dorit Nuzman/Haifa/IBM]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for  variables  on the stack for greater than required alignment
  2007-10-10  0:38                     ` Andrew_Pinski
@ 2007-10-10  2:17                       ` H.J. Lu
  2007-10-10  3:01                         ` Andrew Pinski
  0 siblings, 1 reply; 27+ messages in thread
From: H.J. Lu @ 2007-10-10  2:17 UTC (permalink / raw)
  To: Andrew_Pinski
  Cc: gcc-patches, Andrew Pinski, Richard Henderson, Russell_Olsen,
	trevor_smigiel

On Tue, Oct 09, 2007 at 05:37:52PM -0700, Andrew_Pinski@PlayStation.Sony.Com wrote:
> "H.J. Lu" <hjl@lucon.org> wrote on 10/09/2007 05:31:30 PM:
> 
> > This patch doesn't solve the stack alignment problem on x86 where gcc
> > assumes the stack is 16byte aligned and ia32 psABI only specifies 4
> > byte alignment. To make gcc to conform ia32 psABI, we need to
> > align the stack when >4 byte alignment is needed for local variable
> > as well as register spill. If we align the stack, this patch isn't
> > needed for x86.
> 
> Again how many times do I have to mention, the proposal above does not 
> help out when you need a bigger alignment than what is required than 
> 16byte alignment.  If you take a look at the testcases which are added, 

What did you mean by "what is required than 16byte alignment"?
The alignment reqiurement of a local variable is the alignment
reqiurement of its type which can be a structure, a scalar or a
vector type.  My proposal treats the alignment reqiurement of
a structure the same as the alignment reqiurement of a scalar/vector
type. Why can't my proposal align stack to 256 byte? 



H.J.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for variables on the stack for greater than required alignment
  2007-10-10  2:17                       ` H.J. Lu
@ 2007-10-10  3:01                         ` Andrew Pinski
  2007-10-10  3:44                           ` How to make gcc psABI conformant H.J. Lu
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew Pinski @ 2007-10-10  3:01 UTC (permalink / raw)
  To: H.J. Lu
  Cc: Andrew_Pinski, gcc-patches, Richard Henderson, Russell_Olsen,
	trevor_smigiel

On 10/9/07, H.J. Lu <hjl@lucon.org> wrote:
> What did you mean by "what is required than 16byte alignment"?
> The alignment reqiurement of a local variable is the alignment
> reqiurement of its type which can be a structure, a scalar or a
> vector type.  My proposal treats the alignment reqiurement of
> a structure the same as the alignment reqiurement of a scalar/vector
> type. Why can't my proposal align stack to 256 byte?

If you want to implement something, that is ok with me but I have this
patch now and have it on the table since May of this year and nobody
complained about the patch for 5 months which seems like a long time.
I also pinged this patch ~ once a month and nobody reviewed it.  This
is getting annoying.  I had been asked to ping this patch again but
some internal developers to be able to get this working for 4.3.  Your
proposal has the ability to work but it is not implemented yet.  My
works now and is really needed for Cell development.  If you want to
go out and implement your proposal that is ok with me but the
implementation I have should go in as a stop gap while yours is being
worked on.  I am getting tried of pinging this patch anyways.  I have
better things to do, like fix some CSE load/store missed optimizations
(which is needed for some C++ cases).

Thanks,
Andrew Pinski

^ permalink raw reply	[flat|nested] 27+ messages in thread

* How to make gcc psABI conformant
  2007-10-10  3:01                         ` Andrew Pinski
@ 2007-10-10  3:44                           ` H.J. Lu
  2007-10-10 11:43                             ` Daniel Jacobowitz
  0 siblings, 1 reply; 27+ messages in thread
From: H.J. Lu @ 2007-10-10  3:44 UTC (permalink / raw)
  To: Andrew Pinski
  Cc: Andrew_Pinski, gcc-patches, Richard Henderson, Russell_Olsen,
	trevor_smigiel

On Tue, Oct 09, 2007 at 08:00:52PM -0700, Andrew Pinski wrote:
> On 10/9/07, H.J. Lu <hjl@lucon.org> wrote:
> > What did you mean by "what is required than 16byte alignment"?
> > The alignment reqiurement of a local variable is the alignment
> > reqiurement of its type which can be a structure, a scalar or a
> > vector type.  My proposal treats the alignment reqiurement of
> > a structure the same as the alignment reqiurement of a scalar/vector
> > type. Why can't my proposal align stack to 256 byte?
> 
> If you want to implement something, that is ok with me but I have this
> patch now and have it on the table since May of this year and nobody
> complained about the patch for 5 months which seems like a long time.
> I also pinged this patch ~ once a month and nobody reviewed it.  This
> is getting annoying.  I had been asked to ping this patch again but
> some internal developers to be able to get this working for 4.3.  Your
> proposal has the ability to work but it is not implemented yet.  My
> works now and is really needed for Cell development.  If you want to
> go out and implement your proposal that is ok with me but the
> implementation I have should go in as a stop gap while yours is being
> worked on.  I am getting tried of pinging this patch anyways.  I have
> better things to do, like fix some CSE load/store missed optimizations
> (which is needed for some C++ cases).
> 

I am not against your patch per se.  Stack variable alignment is a
much larger issue. Your patch only fixes a small subset and a proper
fix may make your patch unnecessary.

There are STACK_BOUNDARY and PREFERRED_STACK_BOUNDARY. But they
don't provide psABI conformance. A psABI specifies a minimum
alignment for incoming stack. Unless a function is local to
translation unit, we can only assume the minimum alignment for
incoming stack. I think we need a new macro, INCOMING_STACK_BOUNDARY.
which can be PREFERRED_STACK_BOUNDARY only if all callers of the
function is local to translation unit.


H.J.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: How to make gcc psABI conformant
  2007-10-10  3:44                           ` How to make gcc psABI conformant H.J. Lu
@ 2007-10-10 11:43                             ` Daniel Jacobowitz
  2007-10-10 13:05                               ` H.J. Lu
  0 siblings, 1 reply; 27+ messages in thread
From: Daniel Jacobowitz @ 2007-10-10 11:43 UTC (permalink / raw)
  To: gcc-patches

On Tue, Oct 09, 2007 at 08:44:52PM -0700, H.J. Lu wrote:
> There are STACK_BOUNDARY and PREFERRED_STACK_BOUNDARY. But they
> don't provide psABI conformance. A psABI specifies a minimum
> alignment for incoming stack. Unless a function is local to
> translation unit, we can only assume the minimum alignment for
> incoming stack. I think we need a new macro, INCOMING_STACK_BOUNDARY.
> which can be PREFERRED_STACK_BOUNDARY only if all callers of the
> function is local to translation unit.

Isn't that STACK_BOUNDARY?  If not, since you seem to understand the
difference between these variables, could you improve the
documentation?

I'd like to reuse whatever solution you find for this for
powerpc-linux -maltivec.

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: How to make gcc psABI conformant
  2007-10-10 11:43                             ` Daniel Jacobowitz
@ 2007-10-10 13:05                               ` H.J. Lu
  2007-10-10 13:10                                 ` Daniel Jacobowitz
  0 siblings, 1 reply; 27+ messages in thread
From: H.J. Lu @ 2007-10-10 13:05 UTC (permalink / raw)
  To: gcc-patches

On Wed, Oct 10, 2007 at 07:42:49AM -0400, Daniel Jacobowitz wrote:
> On Tue, Oct 09, 2007 at 08:44:52PM -0700, H.J. Lu wrote:
> > There are STACK_BOUNDARY and PREFERRED_STACK_BOUNDARY. But they
> > don't provide psABI conformance. A psABI specifies a minimum
> > alignment for incoming stack. Unless a function is local to
> > translation unit, we can only assume the minimum alignment for
> > incoming stack. I think we need a new macro, INCOMING_STACK_BOUNDARY.
> > which can be PREFERRED_STACK_BOUNDARY only if all callers of the
> > function is local to translation unit.
> 
> Isn't that STACK_BOUNDARY?  If not, since you seem to understand the
> difference between these variables, could you improve the
> documentation?

STACK_BOUNDARY is enforced by hardware. INCOMING_STACK_BOUNDARY is
specified by psABI. INCOMING_STACK_BOUNDARY >= STACK_BOUNDARY.

> 
> I'd like to reuse whatever solution you find for this for
> powerpc-linux -maltivec.

The generated code should look similar to what icc generates:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28074

That is we save the original stack pointer in the frame pointer and
align the stack in prologue. It will touch frondend, middleend and
backend. Will this scheme work for PowerPC?

I opened a mete bug:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33721


H.J.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: How to make gcc psABI conformant
  2007-10-10 13:05                               ` H.J. Lu
@ 2007-10-10 13:10                                 ` Daniel Jacobowitz
  2007-10-10 13:28                                   ` H.J. Lu
  0 siblings, 1 reply; 27+ messages in thread
From: Daniel Jacobowitz @ 2007-10-10 13:10 UTC (permalink / raw)
  To: gcc-patches

On Wed, Oct 10, 2007 at 06:05:01AM -0700, H.J. Lu wrote:
> > Isn't that STACK_BOUNDARY?  If not, since you seem to understand the
> > difference between these variables, could you improve the
> > documentation?
> 
> STACK_BOUNDARY is enforced by hardware. INCOMING_STACK_BOUNDARY is
> specified by psABI. INCOMING_STACK_BOUNDARY >= STACK_BOUNDARY.

Great, a third one...

> That is we save the original stack pointer in the frame pointer and
> align the stack in prologue. It will touch frondend, middleend and
> backend. Will this scheme work for PowerPC?

Probably not.  I don't think we have the freedom to abuse the frame
pointer that way.  But the PowerPC case really ought to be simple,
modulo correct debug info.

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: How to make gcc psABI conformant
  2007-10-10 13:10                                 ` Daniel Jacobowitz
@ 2007-10-10 13:28                                   ` H.J. Lu
  2007-10-10 13:35                                     ` H.J. Lu
  0 siblings, 1 reply; 27+ messages in thread
From: H.J. Lu @ 2007-10-10 13:28 UTC (permalink / raw)
  To: gcc-patches

On Wed, Oct 10, 2007 at 09:10:06AM -0400, Daniel Jacobowitz wrote:
> On Wed, Oct 10, 2007 at 06:05:01AM -0700, H.J. Lu wrote:
> > > Isn't that STACK_BOUNDARY?  If not, since you seem to understand the
> > > difference between these variables, could you improve the
> > > documentation?
> > 
> > STACK_BOUNDARY is enforced by hardware. INCOMING_STACK_BOUNDARY is
> > specified by psABI. INCOMING_STACK_BOUNDARY >= STACK_BOUNDARY.
> 
> Great, a third one...

For the backends which support INCOMING_STACK_BOUNDARY, we may
undefine PREFERRED_STACK_BOUNDARY since we can align variables
on stack properly.

> 
> > That is we save the original stack pointer in the frame pointer and
> > align the stack in prologue. It will touch frondend, middleend and
> > backend. Will this scheme work for PowerPC?
> 
> Probably not.  I don't think we have the freedom to abuse the frame
> pointer that way.  But the PowerPC case really ought to be simple,
> modulo correct debug info.
> 

Each backend needs to specify a hard register for the original stack
pointer.  Using the frame pointer on x86 since it is less useful than
others.  Will align stack cause problem for PowerPC?


H.J.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: How to make gcc psABI conformant
  2007-10-10 13:28                                   ` H.J. Lu
@ 2007-10-10 13:35                                     ` H.J. Lu
  0 siblings, 0 replies; 27+ messages in thread
From: H.J. Lu @ 2007-10-10 13:35 UTC (permalink / raw)
  To: gcc-patches

On Wed, Oct 10, 2007 at 06:28:09AM -0700, H.J. Lu wrote:
> On Wed, Oct 10, 2007 at 09:10:06AM -0400, Daniel Jacobowitz wrote:
> > On Wed, Oct 10, 2007 at 06:05:01AM -0700, H.J. Lu wrote:
> > > > Isn't that STACK_BOUNDARY?  If not, since you seem to understand the
> > > > difference between these variables, could you improve the
> > > > documentation?
> > > 
> > > STACK_BOUNDARY is enforced by hardware. INCOMING_STACK_BOUNDARY is
> > > specified by psABI. INCOMING_STACK_BOUNDARY >= STACK_BOUNDARY.
> > 
> > Great, a third one...
> 
> For the backends which support INCOMING_STACK_BOUNDARY, we may
> undefine PREFERRED_STACK_BOUNDARY since we can align variables
> on stack properly.
> 
> > 
> > > That is we save the original stack pointer in the frame pointer and
> > > align the stack in prologue. It will touch frondend, middleend and
> > > backend. Will this scheme work for PowerPC?
> > 
> > Probably not.  I don't think we have the freedom to abuse the frame
> > pointer that way.  But the PowerPC case really ought to be simple,
> > modulo correct debug info.
> > 
> 
> Each backend needs to specify a hard register for the original stack
> pointer.  Using the frame pointer on x86 since it is less useful than
> others.  Will align stack cause problem for PowerPC?
> 

One more thing, when stack is aligned in prologue, the frame pointer
can no longer be used to access stack variables. That is another reason
to save the original stack pointer in it.


H.J.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for variables on the stack for greater than required alignment
  2007-10-09 21:20                 ` Andrew Pinski
  2007-10-10  0:31                   ` H.J. Lu
  2007-10-10  1:07                   ` [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for variables on the stack for greater than required alignment Dorit Nuzman
@ 2008-01-24 14:52                   ` Andrew Pinski
  2008-01-24 18:07                     ` H.J. Lu
  2 siblings, 1 reply; 27+ messages in thread
From: Andrew Pinski @ 2008-01-24 14:52 UTC (permalink / raw)
  To: Andrew_Pinski
  Cc: trevor_smigiel, gcc-patches, Richard Henderson, Russell_Olsen

Ping?  I am pinging this after 3 months of no activity.

On Oct 9, 2007 1:20 PM, Andrew Pinski <pinskia@gmail.com> wrote:
> Ping?
>
>
> On 9/17/07, Andrew Pinski <pinskia@gmail.com> wrote:
> > Ping?
> >
> > On 8/28/07, Andrew Pinski <pinskia@gmail.com> wrote:
> > > Ping?
> > >
> > > On 8/4/07, Andrew Pinski <pinskia@gmail.com> wrote:
> > > > Ping?
> > > >
> > > > On 7/2/07, Andrew Pinski <pinskia@gmail.com> wrote:
> > > > > Ping?
> > > > >
> > > > > On 5/18/07, Andrew_Pinski@playstation.sony.com
> > > > > <Andrew_Pinski@playstation.sony.com> wrote:
> > > > > > Trevor Smigiel/R&D/SCEA@Playstation wrote on 05/17/2007 04:49:53 PM:
> > > > > >
> > > > > > > Andrew,
> > > > > > >
> > > > > > > It seems you missed merging a change that I made in our local tree.
> > > > > > >
> > > > > > > In cfgexpand.c, you removed this code:
> > > > > > >
> > > > > > > -  /* Set alignment we actually gave this decl.  */
> > > > > > > -  offset -= frame_phase;
> > > > > > > -  align = offset & -offset;
> > > > > > > -  align *= BITS_PER_UNIT;
> > > > > > > -  if (align > STACK_BOUNDARY || align == 0)
> > > > > > > -    align = STACK_BOUNDARY;
> > > > > > > -  DECL_ALIGN (decl) = align;
> > > > > > > -  DECL_USER_ALIGN (decl) = 0;
> > > > > > >
> > > > > > > In our local tree that code is conditioned on
> > > > > > >   if (!DECL_USER_ALIGN (decl))
> > > > > > > (Which could be done more precisely.  Consider the case where a user
> > > > > > > specified alignment is smaller than the decl's actual alignment on the
> > > > > > > stack.)
> > > > > > >
> > > > > > > Without it less efficient code can be generated because higher
> > > > > > > alignments are not propagated.  At least, that's the behaviour on 4.1.1.
> > > > > >
> > > > > > Here is the new patch which fixes that problem and we get the get the
> > > > > > correct alignment on the variables now that we were getting a smaller
> > > > > > alignment on with my older version of the patch.  It adds a testcase which
> > > > > > verifies this is the case too (I only check on ia32/x86_64, powerpc* and
> > > > > > spu as those are the only targets which I know the normal alignment of
> > > > > > stack is and the only targets I could test the testcase on).
> > > > > >
> > > > > > OK? Bootstrapped and tested on i686-linux-gnu with no regressions.
> > > > > >
> > > > > > Thanks,
> > > > > > Andrew Pinski
> > > > > >
> > > > > > ChangeLog:
> > > > > >
> > > > > >         * expr.h  (allocate_dynamic_stack_space_1): New function
> > > > > > prototype.
> > > > > >         * functionc.c (assign_temp): Take into account the alignment
> > > > > >         of the temp if it is greater than the target's preferred
> > > > > > alignment.
> > > > > >         * cfgexpand.c: Include optabs.h.
> > > > > >         (get_decl_align_unit): Update comment and don't lower the
> > > > > > alignment
> > > > > >         if it is greater than the target's preferred alignment.
> > > > > >         (alloc_stack_frame_space): Take an unsigned HOST_WIDE_INT for
> > > > > > align.
> > > > > >         Take into account the variable's alignment if it is greater than
> > > > > >         the target's preferred alignment.
> > > > > >         (expand_one_stack_var_at):  Likewise.  Only reset the alignment of
> > > > > > the decl to given
> > > > > >         alignment if the alignment is less than the target's preferred
> > > > > > alignment.
> > > > > >         * explow.c (allocate_dynamic_stack_space_1): Split out from
> > > > > >         allocate_dynamic_stack_space and take into acount the required
> > > > > >         alignment.  Use AND opcode instead of shifting left and shifting
> > > > > >         back right.
> > > > > >         (allocate_dynamic_stack_space): Call
> > > > > > allocate_dynamic_stack_space_1.
> > > > > >         * Makefile.in (cfgexpand.o): Update dependecies.
> > > > > >         * stmt.c (expand_decl):  Take into account the alignment
> > > > > >         of the variable if it is greater than the target's preferred
> > > > > >         alignment.
> > > > > >         * config/i386/i386.c (ix86_compute_frame_layout): Don't assert
> > > > > >         that preferred alignment is greater than the normal preferred
> > > > > >         alignment.  Don't assert that the stack alignment needed is
> > > > > > greater
> > > > > >         than the normal preferred alignment.
> > > > > >
> > > > > >         * gcc.c-torture/execute/pr16660-1.c: New testcase.
> > > > > >         * gcc.c-torture/execute/pr16660-2.c: New testcase.
> > > > > >         * gcc.dg/pr16660-1.c: New testcase.
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > > >
> > >
> > >
> >
> >
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for variables on the stack for greater than required alignment
  2008-01-24 14:52                   ` Andrew Pinski
@ 2008-01-24 18:07                     ` H.J. Lu
  0 siblings, 0 replies; 27+ messages in thread
From: H.J. Lu @ 2008-01-24 18:07 UTC (permalink / raw)
  To: Andrew Pinski
  Cc: Andrew_Pinski, trevor_smigiel, gcc-patches, Richard Henderson,
	Russell_Olsen

FWIW, this patch isn't needed on stack alignment branch:

http://gcc.gnu.org/viewcvs/branches/stack/

Currently, we are working on x86 backend and we need help on other targets.

H.J.
On Jan 24, 2008 3:13 AM, Andrew Pinski <pinskia@gmail.com> wrote:
> Ping?  I am pinging this after 3 months of no activity.
>
> On Oct 9, 2007 1:20 PM, Andrew Pinski <pinskia@gmail.com> wrote:
> > Ping?
> >
> >
> > On 9/17/07, Andrew Pinski <pinskia@gmail.com> wrote:
> > > Ping?
> > >
> > > On 8/28/07, Andrew Pinski <pinskia@gmail.com> wrote:
> > > > Ping?
> > > >
> > > > On 8/4/07, Andrew Pinski <pinskia@gmail.com> wrote:
> > > > > Ping?
> > > > >
> > > > > On 7/2/07, Andrew Pinski <pinskia@gmail.com> wrote:
> > > > > > Ping?
> > > > > >
> > > > > > On 5/18/07, Andrew_Pinski@playstation.sony.com
> > > > > > <Andrew_Pinski@playstation.sony.com> wrote:
> > > > > > > Trevor Smigiel/R&D/SCEA@Playstation wrote on 05/17/2007 04:49:53 PM:
> > > > > > >
> > > > > > > > Andrew,
> > > > > > > >
> > > > > > > > It seems you missed merging a change that I made in our local tree.
> > > > > > > >
> > > > > > > > In cfgexpand.c, you removed this code:
> > > > > > > >
> > > > > > > > -  /* Set alignment we actually gave this decl.  */
> > > > > > > > -  offset -= frame_phase;
> > > > > > > > -  align = offset & -offset;
> > > > > > > > -  align *= BITS_PER_UNIT;
> > > > > > > > -  if (align > STACK_BOUNDARY || align == 0)
> > > > > > > > -    align = STACK_BOUNDARY;
> > > > > > > > -  DECL_ALIGN (decl) = align;
> > > > > > > > -  DECL_USER_ALIGN (decl) = 0;
> > > > > > > >
> > > > > > > > In our local tree that code is conditioned on
> > > > > > > >   if (!DECL_USER_ALIGN (decl))
> > > > > > > > (Which could be done more precisely.  Consider the case where a user
> > > > > > > > specified alignment is smaller than the decl's actual alignment on the
> > > > > > > > stack.)
> > > > > > > >
> > > > > > > > Without it less efficient code can be generated because higher
> > > > > > > > alignments are not propagated.  At least, that's the behaviour on 4.1.1.
> > > > > > >
> > > > > > > Here is the new patch which fixes that problem and we get the get the
> > > > > > > correct alignment on the variables now that we were getting a smaller
> > > > > > > alignment on with my older version of the patch.  It adds a testcase which
> > > > > > > verifies this is the case too (I only check on ia32/x86_64, powerpc* and
> > > > > > > spu as those are the only targets which I know the normal alignment of
> > > > > > > stack is and the only targets I could test the testcase on).
> > > > > > >
> > > > > > > OK? Bootstrapped and tested on i686-linux-gnu with no regressions.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Andrew Pinski
> > > > > > >
> > > > > > > ChangeLog:
> > > > > > >
> > > > > > >         * expr.h  (allocate_dynamic_stack_space_1): New function
> > > > > > > prototype.
> > > > > > >         * functionc.c (assign_temp): Take into account the alignment
> > > > > > >         of the temp if it is greater than the target's preferred
> > > > > > > alignment.
> > > > > > >         * cfgexpand.c: Include optabs.h.
> > > > > > >         (get_decl_align_unit): Update comment and don't lower the
> > > > > > > alignment
> > > > > > >         if it is greater than the target's preferred alignment.
> > > > > > >         (alloc_stack_frame_space): Take an unsigned HOST_WIDE_INT for
> > > > > > > align.
> > > > > > >         Take into account the variable's alignment if it is greater than
> > > > > > >         the target's preferred alignment.
> > > > > > >         (expand_one_stack_var_at):  Likewise.  Only reset the alignment of
> > > > > > > the decl to given
> > > > > > >         alignment if the alignment is less than the target's preferred
> > > > > > > alignment.
> > > > > > >         * explow.c (allocate_dynamic_stack_space_1): Split out from
> > > > > > >         allocate_dynamic_stack_space and take into acount the required
> > > > > > >         alignment.  Use AND opcode instead of shifting left and shifting
> > > > > > >         back right.
> > > > > > >         (allocate_dynamic_stack_space): Call
> > > > > > > allocate_dynamic_stack_space_1.
> > > > > > >         * Makefile.in (cfgexpand.o): Update dependecies.
> > > > > > >         * stmt.c (expand_decl):  Take into account the alignment
> > > > > > >         of the variable if it is greater than the target's preferred
> > > > > > >         alignment.
> > > > > > >         * config/i386/i386.c (ix86_compute_frame_layout): Don't assert
> > > > > > >         that preferred alignment is greater than the normal preferred
> > > > > > >         alignment.  Don't assert that the stack alignment needed is
> > > > > > > greater
> > > > > > >         than the normal preferred alignment.
> > > > > > >
> > > > > > >         * gcc.c-torture/execute/pr16660-1.c: New testcase.
> > > > > > >         * gcc.c-torture/execute/pr16660-2.c: New testcase.
> > > > > > >         * gcc.dg/pr16660-1.c: New testcase.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > > >
> > >
> > >
> >
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for variables  on  the stack for greater than required alignment
       [not found] <OF15BFC61A.42B05901-ON882572DE.000355EF-882572DE.00775314@LocalDomain>
@ 2007-05-17 21:45 ` Andrew_Pinski
  0 siblings, 0 replies; 27+ messages in thread
From: Andrew_Pinski @ 2007-05-17 21:45 UTC (permalink / raw)
  To: Andrew_Pinski
  Cc: gcc-patches, Richard Henderson, Russell_Olsen, Trevor_Smigiel

Andrew Pinski/R&D/SCEA wrote on 05/17/2007 02:43:20 PM:
> OK? Bootstrapped and tested on i686-linux-gnu with no regressions.

I forgot one thing, this work was originally done by Trevor Smigiel here 
at Sony.

Thanks,
Andrew Pinski

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2008-01-24 16:50 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-01-30  2:42 [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for variables on the stack for greater than required alignment Andrew_Pinski
2007-01-31 15:20 ` Meissner, Michael
2007-02-02 23:49 ` Richard Henderson
2007-05-17 21:43   ` Andrew_Pinski
     [not found]   ` <OF15BFC61A.42B05901-ON882572DE.000355EF-882572DE.007752F8@playstation.sony.com>
2007-05-17 23:50     ` trevor_smigiel
2007-05-18  0:20       ` Andrew_Pinski
2007-05-18 17:42       ` Andrew_Pinski
2007-07-02 23:56         ` Andrew Pinski
2007-08-04 11:46           ` Andrew Pinski
2007-08-28 20:08             ` Andrew Pinski
2007-09-17 22:22               ` Andrew Pinski
2007-10-09 21:20                 ` Andrew Pinski
2007-10-10  0:31                   ` H.J. Lu
2007-10-10  0:38                     ` Andrew_Pinski
2007-10-10  2:17                       ` H.J. Lu
2007-10-10  3:01                         ` Andrew Pinski
2007-10-10  3:44                           ` How to make gcc psABI conformant H.J. Lu
2007-10-10 11:43                             ` Daniel Jacobowitz
2007-10-10 13:05                               ` H.J. Lu
2007-10-10 13:10                                 ` Daniel Jacobowitz
2007-10-10 13:28                                   ` H.J. Lu
2007-10-10 13:35                                     ` H.J. Lu
2007-10-10  1:07                   ` [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for variables on the stack for greater than required alignment Dorit Nuzman
2008-01-24 14:52                   ` Andrew Pinski
2008-01-24 18:07                     ` H.J. Lu
2007-02-05 22:39 ` Dorit Nuzman
     [not found] <OF15BFC61A.42B05901-ON882572DE.000355EF-882572DE.00775314@LocalDomain>
2007-05-17 21:45 ` Andrew_Pinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).