public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
@ 2013-06-07 19:26 Jakub Jelinek
  2013-06-07 20:43 ` Richard Henderson
       [not found] ` <0EFAB2BDD0F67E4FB6CCC8B9F87D75692B5204DB@IRSMSX101.ger.corp.intel.com>
  0 siblings, 2 replies; 35+ messages in thread
From: Jakub Jelinek @ 2013-06-07 19:26 UTC (permalink / raw)
  To: Richard Biener, Jan Hubicka, Richard Henderson; +Cc: gcc-patches

Hi!

This PR is about DATA_ALIGNMENT macro increasing alignment of some decls
for optimization purposes beyond ABI mandated levels.  It is fine to emit
the vars aligned as much as we want for optimization purposes, but if we
can't be sure that references to that decl bind to the definition we
increased the alignment on (e.g. common variables, or -fpic code without
hidden visibility, weak vars etc.), we can't assume that alignment.
As DECL_ALIGN is used for both the alignment emitted for the definitions
and alignment assumed on code referring to it, this patch increases
DECL_ALIGN only on decls where decl_binds_to_current_def_p is true,
and otherwise the optimization part on top of that emits only when
aligning definition.
On x86_64, DATA_ALIGNMENT macro was partly an optimization, partly ABI
mandated alignment increase, so I've introduced a new macro,
DATA_ABI_ALIGNMENT, which is the ABI mandated increase only (on x86-64
I think the only one is that arrays with size 16 bytes or more (and VLAs,
but that is not handled by DATA*ALIGNMENT) are at least 16 byte aligned).

Bootstrapped/regtested on x86_64-linux and i686-linux.  No idea about other
targets, I've kept them all using DATA_ALIGNMENT, which is considered
optimization increase only now, if there is some ABI mandated alignment
increase on other targets, that should be done in DATA_ABI_ALIGNMENT as
well as DATA_ALIGNMENT.  The patch causes some vectorization regressions
(tweaked in the testsuite), especially for common vars where we used to
align say common arrays to 256 bits rather than the ABI mandated 128 bits,
or for -fpic code, but I'm afraid we need to live with that, if you compile
another file with say icc or some other compiler which doesn't increase
alignment beyond ABI mandated level and that other file defines the var say
as non-common, we have wrong-code.

2013-06-07  Jakub Jelinek  <jakub@redhat.com>

	PR target/56564
	* varasm.c (align_variable): Don't use DATA_ALIGNMENT or
	CONSTANT_ALIGNMENT if !decl_binds_to_current_def_p (decl).
	Use DATA_ABI_ALIGNMENT for that case instead if defined.
	(get_variable_align): New function.
	(get_variable_section, emit_bss, emit_common,
	assemble_variable_contents, place_block_symbol): Use
	get_variable_align instead of DECL_ALIGN.
	(assemble_noswitch_variable): Add align argument, use it
	instead of DECL_ALIGN.
	(assemble_variable): Adjust caller.  Use get_variable_align
	instead of DECL_ALIGN.
	* config/i386/i386.h (DATA_ALIGNMENT): Adjust x86_data_alignment
	caller.
	(DATA_ABI_ALIGNMENT): Define.
	* config/i386/i386-protos.h (x86_data_alignment): Adjust prototype.
	* config/i386/i386.c (x86_data_alignment): Add opt argument.  If
	opt is false, only return the psABI mandated alignment increase.
	* doc/tm.texi.in (DATA_ABI_ALIGNMENT): Document.
	* doc/tm.texi: Regenerated.

	* gcc.target/i386/pr56564-1.c: New test.
	* gcc.target/i386/pr56564-2.c: New test.
	* gcc.target/i386/pr56564-3.c: New test.
	* gcc.target/i386/pr56564-4.c: New test.
	* gcc.target/i386/avx256-unaligned-load-4.c: Add -fno-common.
	* gcc.target/i386/avx256-unaligned-store-1.c: Likewise.
	* gcc.target/i386/avx256-unaligned-store-3.c: Likewise.
	* gcc.target/i386/avx256-unaligned-store-4.c: Likewise.
	* gcc.target/i386/vect-sizes-1.c: Likewise.
	* gcc.target/i386/memcpy-1.c: Likewise.
	* gcc.dg/vect/costmodel/i386/costmodel-vect-31.c (tmp): Initialize.
	* gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c (tmp): Likewise.

--- gcc/varasm.c.jj	2013-06-07 13:17:17.000000000 +0200
+++ gcc/varasm.c	2013-06-07 15:38:36.710908852 +0200
@@ -966,8 +966,12 @@ align_variable (tree decl, bool dont_out
       align = MAX_OFILE_ALIGNMENT;
     }
 
-  /* On some machines, it is good to increase alignment sometimes.  */
-  if (! DECL_USER_ALIGN (decl))
+  /* On some machines, it is good to increase alignment sometimes.
+     But as DECL_ALIGN is used both for actually emitting the variable
+     and for code accessing the variable as guaranteed alignment, we
+     can only increase the alignment if it is a performance optimization
+     if the references to it must bind to the current definition.  */
+  if (! DECL_USER_ALIGN (decl) && decl_binds_to_current_def_p (decl))
     {
 #ifdef DATA_ALIGNMENT
       unsigned int data_align = DATA_ALIGNMENT (TREE_TYPE (decl), align);
@@ -988,12 +992,69 @@ align_variable (tree decl, bool dont_out
 	}
 #endif
     }
+#ifdef DATA_ABI_ALIGNMENT
+  else if (! DECL_USER_ALIGN (decl))
+    {
+      unsigned int data_align = DATA_ABI_ALIGNMENT (TREE_TYPE (decl), align);
+      /* For backwards compatibility, don't assume the ABI alignment for
+	 TLS variables.  */
+      if (! DECL_THREAD_LOCAL_P (decl) || data_align <= BITS_PER_WORD)
+	align = data_align;
+    }
+#endif
 
   /* Reset the alignment in case we have made it tighter, so we can benefit
      from it in get_pointer_alignment.  */
   DECL_ALIGN (decl) = align;
 }
 
+/* Return DECL_ALIGN (decl), possibly increased for optimization purposes
+   beyond what align_variable returned.  */
+
+static unsigned int
+get_variable_align (tree decl)
+{
+  unsigned int align = DECL_ALIGN (decl);
+
+  /* For user aligned vars or static vars align_variable already did
+     everything.  */
+  if (DECL_USER_ALIGN (decl) || !TREE_PUBLIC (decl))
+    return align;
+
+  /* For decls that bind to the current definition, align_variable
+     did also everything, except for not assuming ABI required alignment
+     of TLS variables.  For other vars, increase the alignment here
+     as an optimization.  */
+  if (!decl_binds_to_current_def_p (decl))
+    {
+      /* On some machines, it is good to increase alignment sometimes.  */
+#ifdef DATA_ALIGNMENT
+      unsigned int data_align = DATA_ALIGNMENT (TREE_TYPE (decl), align);
+      /* Don't increase alignment too much for TLS variables - TLS space
+	 is too precious.  */
+      if (! DECL_THREAD_LOCAL_P (decl) || data_align <= BITS_PER_WORD)
+	align = data_align;
+#endif
+#ifdef CONSTANT_ALIGNMENT
+      if (DECL_INITIAL (decl) != 0 && DECL_INITIAL (decl) != error_mark_node)
+	{
+	  unsigned int const_align = CONSTANT_ALIGNMENT (DECL_INITIAL (decl),
+							 align);
+	  /* Don't increase alignment too much for TLS variables - TLS space
+	     is too precious.  */
+	  if (! DECL_THREAD_LOCAL_P (decl) || const_align <= BITS_PER_WORD)
+	    align = const_align;
+	}
+    }
+#endif
+
+#ifdef DATA_ABI_ALIGNMENT
+  if (DECL_THREAD_LOCAL_P (decl))
+    return DATA_ABI_ALIGNMENT (TREE_TYPE (decl), align);
+#endif
+  return align;
+}
+
 /* Return the section into which the given VAR_DECL or CONST_DECL
    should be placed.  PREFER_NOSWITCH_P is true if a noswitch
    section should be used wherever possible.  */
@@ -1043,7 +1104,8 @@ get_variable_section (tree decl, bool pr
 	return bss_noswitch_section;
     }
 
-  return targetm.asm_out.select_section (decl, reloc, DECL_ALIGN (decl));
+  return targetm.asm_out.select_section (decl, reloc,
+					 get_variable_align (decl));
 }
 
 /* Return the block into which object_block DECL should be placed.  */
@@ -1780,7 +1842,8 @@ emit_bss (tree decl ATTRIBUTE_UNUSED,
 	  unsigned HOST_WIDE_INT rounded ATTRIBUTE_UNUSED)
 {
 #if defined ASM_OUTPUT_ALIGNED_BSS
-  ASM_OUTPUT_ALIGNED_BSS (asm_out_file, decl, name, size, DECL_ALIGN (decl));
+  ASM_OUTPUT_ALIGNED_BSS (asm_out_file, decl, name, size,
+			  get_variable_align (decl));
   return true;
 #endif
 }
@@ -1796,10 +1859,11 @@ emit_common (tree decl ATTRIBUTE_UNUSED,
 {
 #if defined ASM_OUTPUT_ALIGNED_DECL_COMMON
   ASM_OUTPUT_ALIGNED_DECL_COMMON (asm_out_file, decl, name,
-				  size, DECL_ALIGN (decl));
+				  size, get_variable_align (decl));
   return true;
 #elif defined ASM_OUTPUT_ALIGNED_COMMON
-  ASM_OUTPUT_ALIGNED_COMMON (asm_out_file, name, size, DECL_ALIGN (decl));
+  ASM_OUTPUT_ALIGNED_COMMON (asm_out_file, name, size,
+			     get_variable_align (decl));
   return true;
 #else
   ASM_OUTPUT_COMMON (asm_out_file, name, size, rounded);
@@ -1828,7 +1892,8 @@ emit_tls_common (tree decl ATTRIBUTE_UNU
    NAME is the name of DECL's SYMBOL_REF.  */
 
 static void
-assemble_noswitch_variable (tree decl, const char *name, section *sect)
+assemble_noswitch_variable (tree decl, const char *name, section *sect,
+			    unsigned int align)
 {
   unsigned HOST_WIDE_INT size, rounded;
 
@@ -1850,7 +1915,7 @@ assemble_noswitch_variable (tree decl, c
 	     * (BIGGEST_ALIGNMENT / BITS_PER_UNIT));
 
   if (!sect->noswitch.callback (decl, name, size, rounded)
-      && (unsigned HOST_WIDE_INT) DECL_ALIGN_UNIT (decl) > rounded)
+      && (unsigned HOST_WIDE_INT) (align / BITS_PER_UNIT) > rounded)
     warning (0, "requested alignment for %q+D is greater than "
 	     "implemented alignment of %wu", decl, rounded);
 }
@@ -1880,7 +1945,7 @@ assemble_variable_contents (tree decl, c
 	/* Output the actual data.  */
 	output_constant (DECL_INITIAL (decl),
 			 tree_low_cst (DECL_SIZE_UNIT (decl), 1),
-			 DECL_ALIGN (decl));
+			 get_variable_align (decl));
       else
 	/* Leave space for it.  */
 	assemble_zeros (tree_low_cst (DECL_SIZE_UNIT (decl), 1));
@@ -1904,6 +1969,7 @@ assemble_variable (tree decl, int top_le
   const char *name;
   rtx decl_rtl, symbol;
   section *sect;
+  unsigned int align;
   bool asan_protected = false;
 
   /* This function is supposed to handle VARIABLES.  Ensure we have one.  */
@@ -2003,6 +2069,8 @@ assemble_variable (tree decl, int top_le
 
   set_mem_align (decl_rtl, DECL_ALIGN (decl));
 
+  align = get_variable_align (decl);
+
   if (TREE_PUBLIC (decl))
     maybe_assemble_visibility (decl);
 
@@ -2032,12 +2100,12 @@ assemble_variable (tree decl, int top_le
       place_block_symbol (symbol);
     }
   else if (SECTION_STYLE (sect) == SECTION_NOSWITCH)
-    assemble_noswitch_variable (decl, name, sect);
+    assemble_noswitch_variable (decl, name, sect, align);
   else
     {
       switch_to_section (sect);
-      if (DECL_ALIGN (decl) > BITS_PER_UNIT)
-	ASM_OUTPUT_ALIGN (asm_out_file, floor_log2 (DECL_ALIGN_UNIT (decl)));
+      if (align > BITS_PER_UNIT)
+	ASM_OUTPUT_ALIGN (asm_out_file, floor_log2 (align / BITS_PER_UNIT));
       assemble_variable_contents (decl, name, dont_output_data);
       if (asan_protected)
 	{
@@ -6967,7 +7035,7 @@ place_block_symbol (rtx symbol)
   else
     {
       decl = SYMBOL_REF_DECL (symbol);
-      alignment = DECL_ALIGN (decl);
+      alignment = get_variable_align (decl);
       size = tree_low_cst (DECL_SIZE_UNIT (decl), 1);
       if (flag_asan && asan_protect_global (decl))
 	{
--- gcc/config/i386/i386.h.jj	2013-06-03 19:15:34.000000000 +0200
+++ gcc/config/i386/i386.h	2013-06-07 14:48:36.430589051 +0200
@@ -859,7 +859,19 @@ enum target_cpu_default
    cause character arrays to be word-aligned so that `strcpy' calls
    that copy constants to character arrays can be done inline.  */
 
-#define DATA_ALIGNMENT(TYPE, ALIGN) ix86_data_alignment ((TYPE), (ALIGN))
+#define DATA_ALIGNMENT(TYPE, ALIGN) \
+  ix86_data_alignment ((TYPE), (ALIGN), true)
+
+/* Similar to DATA_ALIGNMENT, but for the cases where the ABI mandates
+   some alignment increase, instead of optimization only purposes.  E.g.
+   AMD x86-64 psABI says that variables with array type larger than 15 bytes
+   must be aligned to 16 byte boundaries.  DATA_ALIGNMENT should always
+   return the same or larger value than DATA_ABI_ALIGNMENT.
+
+   If this macro is not defined, then ALIGN is used.  */
+
+#define DATA_ABI_ALIGNMENT(TYPE, ALIGN) \
+  ix86_data_alignment ((TYPE), (ALIGN), false)
 
 /* If defined, a C expression to compute the alignment for a local
    variable.  TYPE is the data type, and ALIGN is the alignment that
--- gcc/config/i386/i386-protos.h.jj	2013-05-14 21:30:19.000000000 +0200
+++ gcc/config/i386/i386-protos.h	2013-06-07 13:31:21.937823575 +0200
@@ -207,7 +207,7 @@ extern void init_cumulative_args (CUMULA
 #endif	/* RTX_CODE  */
 
 #ifdef TREE_CODE
-extern int ix86_data_alignment (tree, int);
+extern int ix86_data_alignment (tree, int, bool);
 extern unsigned int ix86_local_alignment (tree, enum machine_mode,
 					  unsigned int);
 extern unsigned int ix86_minimum_alignment (tree, enum machine_mode,
--- gcc/config/i386/i386.c.jj	2013-06-07 13:17:17.000000000 +0200
+++ gcc/config/i386/i386.c	2013-06-07 13:37:24.845416361 +0200
@@ -25375,11 +25375,12 @@ ix86_constant_alignment (tree exp, int a
    instead of that alignment to align the object.  */
 
 int
-ix86_data_alignment (tree type, int align)
+ix86_data_alignment (tree type, int align, bool opt)
 {
   int max_align = optimize_size ? BITS_PER_WORD : MIN (256, MAX_OFILE_ALIGNMENT);
 
-  if (AGGREGATE_TYPE_P (type)
+  if (opt
+      && AGGREGATE_TYPE_P (type)
       && TYPE_SIZE (type)
       && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST
       && (TREE_INT_CST_LOW (TYPE_SIZE (type)) >= (unsigned) max_align
@@ -25391,14 +25392,17 @@ ix86_data_alignment (tree type, int alig
      to 16byte boundary.  */
   if (TARGET_64BIT)
     {
-      if (AGGREGATE_TYPE_P (type)
-	   && TYPE_SIZE (type)
-	   && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST
-	   && (TREE_INT_CST_LOW (TYPE_SIZE (type)) >= 128
-	       || TREE_INT_CST_HIGH (TYPE_SIZE (type))) && align < 128)
+      if ((opt ? AGGREGATE_TYPE_P (type) : TREE_CODE (type) == ARRAY_TYPE)
+	  && TYPE_SIZE (type)
+	  && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST
+	  && (TREE_INT_CST_LOW (TYPE_SIZE (type)) >= 128
+	      || TREE_INT_CST_HIGH (TYPE_SIZE (type))) && align < 128)
 	return 128;
     }
 
+  if (!opt)
+    return align;
+
   if (TREE_CODE (type) == ARRAY_TYPE)
     {
       if (TYPE_MODE (TREE_TYPE (type)) == DFmode && align < 64)
--- gcc/doc/tm.texi.in.jj	2013-06-01 14:47:17.000000000 +0200
+++ gcc/doc/tm.texi.in	2013-06-07 14:47:10.192003375 +0200
@@ -1062,6 +1062,16 @@ arrays to be word-aligned so that @code{
 constants to character arrays can be done inline.
 @end defmac
 
+@defmac DATA_ABI_ALIGNMENT (@var{type}, @var{basic-align})
+Similar to @code{DATA_ALIGNMENT}, but for the cases where the ABI mandates
+some alignment increase, instead of optimization only purposes.  E.g.@
+AMD x86-64 psABI says that variables with array type larger than 15 bytes
+must be aligned to 16 byte boundaries.  @code{DATA_ALIGNMENT} should always
+return the same or larger value than @code{DATA_ABI_ALIGNMENT}.
+
+If this macro is not defined, then @var{basic-align} is used.
+@end defmac
+
 @defmac CONSTANT_ALIGNMENT (@var{constant}, @var{basic-align})
 If defined, a C expression to compute the alignment given to a constant
 that is being placed in memory.  @var{constant} is the constant and
--- gcc/doc/tm.texi.jj	2013-06-01 14:47:17.241680273 +0200
+++ gcc/doc/tm.texi	2013-06-07 14:47:29.400694547 +0200
@@ -1078,6 +1078,16 @@ arrays to be word-aligned so that @code{
 constants to character arrays can be done inline.
 @end defmac
 
+@defmac DATA_ABI_ALIGNMENT (@var{type}, @var{basic-align})
+Similar to @code{DATA_ALIGNMENT}, but for the cases where the ABI mandates
+some alignment increase, instead of optimization only purposes.  E.g.@
+AMD x86-64 psABI says that variables with array type larger than 15 bytes
+must be aligned to 16 byte boundaries.  @code{DATA_ALIGNMENT} should always
+return the same or larger value than @code{DATA_ABI_ALIGNMENT}.
+
+If this macro is not defined, then @var{basic-align} is used.
+@end defmac
+
 @defmac CONSTANT_ALIGNMENT (@var{constant}, @var{basic-align})
 If defined, a C expression to compute the alignment given to a constant
 that is being placed in memory.  @var{constant} is the constant and
--- gcc/testsuite/gcc.target/i386/pr56564-1.c.jj	2013-06-07 15:17:15.879403383 +0200
+++ gcc/testsuite/gcc.target/i386/pr56564-1.c	2013-06-07 15:44:18.386232149 +0200
@@ -0,0 +1,25 @@
+/* PR target/56564 */
+/* { dg-do compile { target { fpic && lp64 } } } */
+/* { dg-options "-O3 -fpic -fdump-tree-optimized" } */
+
+struct S { long a, b; } s = { 5, 6 };
+char t[16] = { 7 };
+
+int
+foo (void)
+{
+  return ((__UINTPTR_TYPE__) &s) & 15;
+}
+
+int
+bar (void)
+{
+  return ((__UINTPTR_TYPE__) &t[0]) & 15;
+}
+
+/* { dg-final { scan-tree-dump-times "&s" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "&t" 0 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "return 0" 1 "optimized" } } */
+/* { dg-final { scan-assembler ".align\[ \t]*16\[^:]*\[\n\r]s:" { target { *-*-linux* } } } } */
+/* { dg-final { scan-assembler ".align\[ \t]*16\[^:]*\[\n\r]t:" { target { *-*-linux* } } } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
--- gcc/testsuite/gcc.target/i386/pr56564-2.c.jj	2013-06-07 15:19:28.900986237 +0200
+++ gcc/testsuite/gcc.target/i386/pr56564-2.c	2013-06-07 15:44:25.120129885 +0200
@@ -0,0 +1,25 @@
+/* PR target/56564 */
+/* { dg-do compile { target { *-*-linux* && lp64 } } } */
+/* { dg-options "-O3 -fno-pic -fdump-tree-optimized" } */
+
+struct S { long a, b; } s = { 5, 6 };
+char t[16] = { 7 };
+
+int
+foo (void)
+{
+  return ((__UINTPTR_TYPE__) &s) & 15;
+}
+
+int
+bar (void)
+{
+  return ((__UINTPTR_TYPE__) &t[0]) & 15;
+}
+
+/* { dg-final { scan-tree-dump-times "&s" 0 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "&t" 0 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "return 0" 2 "optimized" } } */
+/* { dg-final { scan-assembler ".align\[ \t]*16\[^:]*\[\n\r]s:" { target { *-*-linux* } } } } */
+/* { dg-final { scan-assembler ".align\[ \t]*16\[^:]*\[\n\r]t:" { target { *-*-linux* } } } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
--- gcc/testsuite/gcc.target/i386/pr56564-3.c.jj	2013-06-07 15:22:26.470079983 +0200
+++ gcc/testsuite/gcc.target/i386/pr56564-3.c	2013-06-07 15:44:34.872968234 +0200
@@ -0,0 +1,28 @@
+/* PR target/56564 */
+/* { dg-do compile { target { fpic && lp64 } } } */
+/* { dg-options "-O3 -fpic -fdump-tree-optimized" } */
+
+__thread struct S { long a, b; } s = { 5, 6 };
+__thread char t[16] = { 7 };
+
+int
+foo (void)
+{
+  return ((__UINTPTR_TYPE__) &s) & 15;
+}
+
+/* For backwards compatibility we don't assume that t must
+   be aligned to 16 bytes, but align it anyway.  */
+
+int
+bar (void)
+{
+  return ((__UINTPTR_TYPE__) &t[0]) & 15;
+}
+
+/* { dg-final { scan-tree-dump-times "&s" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "&t" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "return 0" 0 "optimized" } } */
+/* { dg-final { scan-assembler-not ".align\[ \t]*16\[^:]*\[\n\r]s:" { target { *-*-linux* } } } } */
+/* { dg-final { scan-assembler ".align\[ \t]*16\[^:]*\[\n\r]t:" { target { *-*-linux* } } } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
--- gcc/testsuite/gcc.target/i386/pr56564-4.c.jj	2013-06-07 15:25:12.638326084 +0200
+++ gcc/testsuite/gcc.target/i386/pr56564-4.c	2013-06-07 15:45:59.489567940 +0200
@@ -0,0 +1,22 @@
+/* PR target/56564 */
+/* { dg-do compile { target { *-*-linux* && lp64 } } } */
+/* { dg-options "-O3 -fno-pic -fdump-tree-optimized" } */
+
+__thread struct S { long a, b; } s = { 5, 6 };
+__thread char t[16] = { 7 };
+
+int
+foo (void)
+{
+  return ((__UINTPTR_TYPE__) &s) & 15;
+}
+
+int
+bar (void)
+{
+  return ((__UINTPTR_TYPE__) &t[0]) & 15;
+}
+
+/* { dg-final { scan-assembler-not ".align\[ \t]*16\[^:]*\[\n\r]s:" { target { *-*-linux* } } } } */
+/* { dg-final { scan-assembler ".align\[ \t]*16\[^:]*\[\n\r]t:" { target { *-*-linux* } } } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
--- gcc/testsuite/gcc.target/i386/avx256-unaligned-load-4.c.jj	2012-10-16 13:15:44.000000000 +0200
+++ gcc/testsuite/gcc.target/i386/avx256-unaligned-load-4.c	2013-06-07 21:07:22.341380267 +0200
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -dp -mavx -mno-avx256-split-unaligned-load -mno-avx256-split-unaligned-store" } */
+/* { dg-options "-O3 -dp -mavx -mno-avx256-split-unaligned-load -mno-avx256-split-unaligned-store -fno-common" } */
 
 #define N 1024
 
--- gcc/testsuite/gcc.target/i386/avx256-unaligned-store-1.c.jj	2012-10-16 13:15:44.000000000 +0200
+++ gcc/testsuite/gcc.target/i386/avx256-unaligned-store-1.c	2013-06-07 21:06:06.911946765 +0200
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -dp -mavx -mavx256-split-unaligned-store" } */
+/* { dg-options "-O3 -dp -mavx -mavx256-split-unaligned-store -fno-common" } */
 
 #define N 1024
 
--- gcc/testsuite/gcc.target/i386/avx256-unaligned-store-3.c.jj	2012-10-16 13:15:44.000000000 +0200
+++ gcc/testsuite/gcc.target/i386/avx256-unaligned-store-3.c	2013-06-07 21:06:41.421248309 +0200
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -dp -mavx -mavx256-split-unaligned-store -mtune=generic" } */
+/* { dg-options "-O3 -dp -mavx -mavx256-split-unaligned-store -mtune=generic -fno-common" } */
 
 #define N 1024
 
--- gcc/testsuite/gcc.target/i386/avx256-unaligned-store-4.c.jj	2012-10-16 13:15:44.000000000 +0200
+++ gcc/testsuite/gcc.target/i386/avx256-unaligned-store-4.c	2013-06-07 21:06:53.516302188 +0200
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -dp -mavx -mno-avx256-split-unaligned-load -mno-avx256-split-unaligned-store" } */
+/* { dg-options "-O3 -dp -mavx -mno-avx256-split-unaligned-load -mno-avx256-split-unaligned-store -fno-common" } */
 
 #define N 1024
 
--- gcc/testsuite/gcc.target/i386/vect-sizes-1.c.jj	2010-11-01 09:06:30.000000000 +0100
+++ gcc/testsuite/gcc.target/i386/vect-sizes-1.c	2013-06-07 21:08:07.851581595 +0200
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -ffast-math -mavx -mtune=generic" } */
+/* { dg-options "-O3 -ffast-math -mavx -mtune=generic -fno-common" } */
 
 double a[1024];
 
--- gcc/testsuite/gcc.target/i386/memcpy-1.c.jj	2011-07-11 10:39:29.000000000 +0200
+++ gcc/testsuite/gcc.target/i386/memcpy-1.c	2013-06-07 21:08:46.263945653 +0200
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target ia32 } */
-/* { dg-options "-O2 -march=pentiumpro -minline-all-stringops" } */
+/* { dg-options "-O2 -march=pentiumpro -minline-all-stringops -fno-common" } */
 /* { dg-final { scan-assembler "rep" } } */
 /* { dg-final { scan-assembler "movs" } } */
 /* { dg-final { scan-assembler-not "test" } } */
--- gcc/testsuite/gcc.dg/vect/costmodel/i386/costmodel-vect-31.c.jj	2009-11-04 08:15:26.000000000 +0100
+++ gcc/testsuite/gcc.dg/vect/costmodel/i386/costmodel-vect-31.c	2013-06-07 20:58:02.091268267 +0200
@@ -18,7 +18,7 @@ struct s{
   struct t e;   /* unaligned (offset 2N+4N+4 B) */
 };
  
-struct s tmp;
+struct s tmp = { 1 };
 
 int main1 ()
 {  
--- gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c.jj	2009-11-04 08:15:26.000000000 +0100
+++ gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c	2013-06-07 20:58:38.201668248 +0200
@@ -18,7 +18,7 @@ struct s{
   struct t e;   /* unaligned (offset 2N+4N+4 B) */
 };
  
-struct s tmp;
+struct s tmp = { 1 };
 
 int main1 ()
 {  

	Jakub

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-07 19:26 [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564) Jakub Jelinek
@ 2013-06-07 20:43 ` Richard Henderson
  2013-06-07 21:14   ` Jakub Jelinek
                     ` (4 more replies)
       [not found] ` <0EFAB2BDD0F67E4FB6CCC8B9F87D75692B5204DB@IRSMSX101.ger.corp.intel.com>
  1 sibling, 5 replies; 35+ messages in thread
From: Richard Henderson @ 2013-06-07 20:43 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Richard Biener, Jan Hubicka, gcc-patches, bernds, hp, hp,
	uweigand, Andreas.Krebbel, dje.gcc

On 06/07/2013 12:25 PM, Jakub Jelinek wrote:
> This PR is about DATA_ALIGNMENT macro increasing alignment of some decls
> for optimization purposes beyond ABI mandated levels.  It is fine to emit
> the vars aligned as much as we want for optimization purposes, but if we
> can't be sure that references to that decl bind to the definition we
> increased the alignment on (e.g. common variables, or -fpic code without
> hidden visibility, weak vars etc.), we can't assume that alignment.

When the linker merges common blocks, it chooses both maximum size and maximum
alignment.  Thus for any common block for which we can prove the block must
reside in the module (any executable, or hidden common in shared object), we
can go ahead and use the increased alignment.

It's only in shared objects with non-hidden common blocks that we have a
problem, since in that case we may resolve the common block via copy reloc to a
memory block in another module.

So while decl_binds_to_current_def_p is a good starting point, I think we can
do a little better with common blocks.  Which ought to take care of those
vectorization regressions you mention.

> @@ -966,8 +966,12 @@ align_variable (tree decl, bool dont_out
>        align = MAX_OFILE_ALIGNMENT;
>      }
>  
> -  /* On some machines, it is good to increase alignment sometimes.  */
> -  if (! DECL_USER_ALIGN (decl))
> +  /* On some machines, it is good to increase alignment sometimes.
> +     But as DECL_ALIGN is used both for actually emitting the variable
> +     and for code accessing the variable as guaranteed alignment, we
> +     can only increase the alignment if it is a performance optimization
> +     if the references to it must bind to the current definition.  */
> +  if (! DECL_USER_ALIGN (decl) && decl_binds_to_current_def_p (decl))
>      {
>  #ifdef DATA_ALIGNMENT
>        unsigned int data_align = DATA_ALIGNMENT (TREE_TYPE (decl), align);
> @@ -988,12 +992,69 @@ align_variable (tree decl, bool dont_out
>  	}
>  #endif
>      }
> +#ifdef DATA_ABI_ALIGNMENT
> +  else if (! DECL_USER_ALIGN (decl))
> +    {
> +      unsigned int data_align = DATA_ABI_ALIGNMENT (TREE_TYPE (decl), align);
> +      /* For backwards compatibility, don't assume the ABI alignment for
> +	 TLS variables.  */
> +      if (! DECL_THREAD_LOCAL_P (decl) || data_align <= BITS_PER_WORD)
> +	align = data_align;
> +    }
> +#endif

This structure would seem to do the wrong thing if DATA_ABI_ALIGNMENT is
defined, but DATA_ALIGNMENT isn't.  And while I realize you documented it, I
don't like the restriction that D_A /must/ return something larger than D_A_A.
 All that means is that in complex cases D_A will have to call D_A_A itself.

I would think that it would be better to rearrange as

  if (!D_U_A)
    {
  #ifdef D_A_A
      align = ...
  #endif
  #ifdef D_A
      if (d_b_t_c_d_p)
        align = ...
  #endif
    }

Why the special case for TLS?  If we really want that special case surely that
test should go into D_A_A itself, and not here in generic code.

> Bootstrapped/regtested on x86_64-linux and i686-linux.  No idea about other
> targets, I've kept them all using DATA_ALIGNMENT, which is considered
> optimization increase only now, if there is some ABI mandated alignment
> increase on other targets, that should be done in DATA_ABI_ALIGNMENT as
> well as DATA_ALIGNMENT.

I've had a brief look over the instances of D_A within the tree atm.  Most of
them carry the cut-n-paste comment "for the same reasons".  These I believe
never intended an ABI change, and were really only interested in optimization.

But these I think require a good hard look to see if they really intended an
ABI alignment:

c6x	comment explicitly mentions abi
cris	compiler options for alignment -- systemwide or local?
mmix	comment mentions GETA instruction
s390	comment mentions LARL instruction
rs6000	SPE and E500 portion of the alignment non-optional?

Relevant port maintainers CCed.


r~

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-07 20:43 ` Richard Henderson
@ 2013-06-07 21:14   ` Jakub Jelinek
  2013-06-08 15:13     ` Jakub Jelinek
  2013-06-10 14:52     ` Richard Henderson
  2013-06-07 22:56   ` Hans-Peter Nilsson
                     ` (3 subsequent siblings)
  4 siblings, 2 replies; 35+ messages in thread
From: Jakub Jelinek @ 2013-06-07 21:14 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Richard Biener, Jan Hubicka, gcc-patches, bernds, hp, hp,
	uweigand, Andreas.Krebbel, dje.gcc

On Fri, Jun 07, 2013 at 01:43:27PM -0700, Richard Henderson wrote:
> On 06/07/2013 12:25 PM, Jakub Jelinek wrote:
> > This PR is about DATA_ALIGNMENT macro increasing alignment of some decls
> > for optimization purposes beyond ABI mandated levels.  It is fine to emit
> > the vars aligned as much as we want for optimization purposes, but if we
> > can't be sure that references to that decl bind to the definition we
> > increased the alignment on (e.g. common variables, or -fpic code without
> > hidden visibility, weak vars etc.), we can't assume that alignment.
> 
> When the linker merges common blocks, it chooses both maximum size and maximum
> alignment.  Thus for any common block for which we can prove the block must
> reside in the module (any executable, or hidden common in shared object), we
> can go ahead and use the increased alignment.

But consider say:
one TU:
struct S { char buf[15]; } s __attribute__((aligned (32)));

another TU:
char c = 7;
struct S { char buf[15]; } s = { { 1, 2 } };
char d = 8;
int main () { return 0; }

(the aligned(32) is there just to simulate the DATA_ALIGNMENT optimization
increase).  Linker warns about this (thus the question is if we want to
increase the alignment for optimization on commons at all) and doesn't align
it.

> This structure would seem to do the wrong thing if DATA_ABI_ALIGNMENT is
> defined, but DATA_ALIGNMENT isn't.  And while I realize you documented it, I
> don't like the restriction that D_A /must/ return something larger than D_A_A.
>  All that means is that in complex cases D_A will have to call D_A_A itself.

Yeah, I guess I can rearrange it.  The reason I wrote it that way was to
avoid an extra function call, but that is probably not big enough overhead.

> I would think that it would be better to rearrange as
> 
>   if (!D_U_A)
>     {
>   #ifdef D_A_A
>       align = ...
>   #endif
>   #ifdef D_A
>       if (d_b_t_c_d_p)
>         align = ...
>   #endif
>     }
> 
> Why the special case for TLS?  If we really want that special case surely that
> test should go into D_A_A itself, and not here in generic code.

I special case TLS for backwards ABI compatibility with GCC <= 4.8.1,
because we ignored DATA_ALIGNMENT for TLS vars, even if what it returned
wasn't just an optimization, but ABI requirement.  So, if you have:
__thread char a[16] = { 1, 2 };
in one shared library and
__thread char a[16] = { 1, 2 };
use (a);
in another shared library, compile one with GCC 4.8.1 e.g. and the other one
with 4.9.0, if use (a) would assume the psABI mandated 16 byte alignment,
but the actual definition was from 4.8.1 compiled object and was only 1 byte
aligned, it could crash.  Also, neither DATA_ALIGNMENT nor DATA_ABI_ALIGNMENT sees
the decl itself, it just looks at type and previously assumed alignment.

> > Bootstrapped/regtested on x86_64-linux and i686-linux.  No idea about other
> > targets, I've kept them all using DATA_ALIGNMENT, which is considered
> > optimization increase only now, if there is some ABI mandated alignment
> > increase on other targets, that should be done in DATA_ABI_ALIGNMENT as
> > well as DATA_ALIGNMENT.
> 
> I've had a brief look over the instances of D_A within the tree atm.  Most of
> them carry the cut-n-paste comment "for the same reasons".  These I believe
> never intended an ABI change, and were really only interested in optimization.

Thanks for looking into this.

> But these I think require a good hard look to see if they really intended an
> ABI alignment:
> 
> c6x	comment explicitly mentions abi
> cris	compiler options for alignment -- systemwide or local?
> mmix	comment mentions GETA instruction
> s390	comment mentions LARL instruction
> rs6000	SPE and E500 portion of the alignment non-optional?
> 
> Relevant port maintainers CCed.

	Jakub

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-07 20:43 ` Richard Henderson
  2013-06-07 21:14   ` Jakub Jelinek
@ 2013-06-07 22:56   ` Hans-Peter Nilsson
  2013-06-08 15:05     ` Jakub Jelinek
  2013-06-10 10:51   ` Bernd Schmidt
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 35+ messages in thread
From: Hans-Peter Nilsson @ 2013-06-07 22:56 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Jakub Jelinek, Richard Biener, Jan Hubicka, gcc-patches

On Fri, 7 Jun 2013, Richard Henderson wrote:
> I've had a brief look over the instances of D_A within the tree atm.  Most of
> them carry the cut-n-paste comment "for the same reasons".  These I believe
> never intended an ABI change, and were really only interested in optimization.
>
> But these I think require a good hard look to see if they really intended an
> ABI alignment:

I'm not sure what is about to change how?

> cris	compiler options for alignment -- systemwide or local?

No, DATA_ALIGNMENT in cris.h is not intended as an ABI
indication, but as an optimization when emitting data.
(This was the way to do it at the time.  Has this changed?)

The ABI is as indicated by BIGGEST_ALIGNMENT: 8 (bits; one
byte).  Nothing is guaranteed (to the data referer) to have a
bigger alignment - unless otherwise indicated by attribute-align.

(Unfortunately I can't change BIGGEST_ALIGNMENT to indicate that
atomic variables require "natural alignment", or actually not to
straddle a cache-boundary, as increasing BIGGEST_ALIGNMENT makes
GCC change the ABI.  But that's a slightly different issue.)

> mmix	comment mentions GETA instruction

Yep, data must be at least 32-bit-aligned so addresses can be
formed with a GETA insn.  BIGGEST_ALIGNMENT is 64 though and
STRICT_ALIGNMENT; natural alignment is required for proper
interpretation as the low bits are ignored.

brgds, H-P

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-07 22:56   ` Hans-Peter Nilsson
@ 2013-06-08 15:05     ` Jakub Jelinek
  0 siblings, 0 replies; 35+ messages in thread
From: Jakub Jelinek @ 2013-06-08 15:05 UTC (permalink / raw)
  To: Hans-Peter Nilsson
  Cc: Richard Henderson, Richard Biener, Jan Hubicka, gcc-patches

On Fri, Jun 07, 2013 at 06:56:34PM -0400, Hans-Peter Nilsson wrote:
> > cris	compiler options for alignment -- systemwide or local?
> 
> No, DATA_ALIGNMENT in cris.h is not intended as an ABI
> indication, but as an optimization when emitting data.
> (This was the way to do it at the time.  Has this changed?)
> 
> The ABI is as indicated by BIGGEST_ALIGNMENT: 8 (bits; one
> byte).  Nothing is guaranteed (to the data referer) to have a
> bigger alignment - unless otherwise indicated by attribute-align.
> 
> (Unfortunately I can't change BIGGEST_ALIGNMENT to indicate that
> atomic variables require "natural alignment", or actually not to
> straddle a cache-boundary, as increasing BIGGEST_ALIGNMENT makes
> GCC change the ABI.  But that's a slightly different issue.)

Right now it is unfortunately part of ABI, which is something the patch
attempts to cure in a backwards compatible way (i.e., data will be still
alignment the way it used to be, but code will no longer assume it is that
much aligned, unless it is DATA_ABI_ALIGNMENT).
> 
> > mmix	comment mentions GETA instruction
> 
> Yep, data must be at least 32-bit-aligned so addresses can be
> formed with a GETA insn.  BIGGEST_ALIGNMENT is 64 though and
> STRICT_ALIGNMENT; natural alignment is required for proper
> interpretation as the low bits are ignored.

Then mmix would probably want to define DATA_ABI_ALIGNMENT instead of
DATA_ALIGNMENT after the patch (or both).

	Jakub

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-07 21:14   ` Jakub Jelinek
@ 2013-06-08 15:13     ` Jakub Jelinek
  2013-06-10 14:52     ` Richard Henderson
  1 sibling, 0 replies; 35+ messages in thread
From: Jakub Jelinek @ 2013-06-08 15:13 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Richard Biener, Jan Hubicka, gcc-patches, bernds, hp, hp,
	uweigand, Andreas.Krebbel, dje.gcc

On Fri, Jun 07, 2013 at 11:14:19PM +0200, Jakub Jelinek wrote:
> > This structure would seem to do the wrong thing if DATA_ABI_ALIGNMENT is
> > defined, but DATA_ALIGNMENT isn't.  And while I realize you documented it, I
> > don't like the restriction that D_A /must/ return something larger than D_A_A.
> >  All that means is that in complex cases D_A will have to call D_A_A itself.
> 
> Yeah, I guess I can rearrange it.  The reason I wrote it that way was to
> avoid an extra function call, but that is probably not big enough overhead.

Here is the code rearranged so that DATA_ABI_ALIGNMENT is independent of
DATA_ALIGNMENT.  The rest of stuff is kept as is.

As for the commons getting bigger alignment than the ABI has for them, I'm
afraid the linker usually has no option but to warn and don't do anything.
Because if the non-common definition that is supposed to win over the common
one isn't sufficiently aligned (and it could be aligned just to the ABI
mandated boundary), that definition could be already in the middle of say
.data or other section and so the linker doesn't have the luxury of aligning
it individually.

E.g. vect_can_force_dr_alignment_p has been changed some time ago to:
  /* We cannot change alignment of common or external symbols as another
     translation unit may contain a definition with lower alignment.
     The rules of common symbol linking mean that the definition
     will override the common symbol.  The same is true for constant
     pool entries which may be shared and are not properly merged
     by LTO.  */
  if (DECL_EXTERNAL (decl)
      || DECL_COMMON (decl)
      || DECL_IN_CONSTANT_POOL (decl))
    return false;
but at that point we haven't changed align_variable.  Thus perhaps we want
in align_variable handle DECL_COMMON the same way as we handle TLS with >
word alignment.

2013-06-08  Jakub Jelinek  <jakub@redhat.com>

	PR target/56564
	* varasm.c (align_variable): Don't use DATA_ALIGNMENT or
	CONSTANT_ALIGNMENT if !decl_binds_to_current_def_p (decl).
	Use DATA_ABI_ALIGNMENT for that case instead if defined.
	(get_variable_align): New function.
	(get_variable_section, emit_bss, emit_common,
	assemble_variable_contents, place_block_symbol): Use
	get_variable_align instead of DECL_ALIGN.
	(assemble_noswitch_variable): Add align argument, use it
	instead of DECL_ALIGN.
	(assemble_variable): Adjust caller.  Use get_variable_align
	instead of DECL_ALIGN.
	* config/i386/i386.h (DATA_ALIGNMENT): Adjust x86_data_alignment
	caller.
	(DATA_ABI_ALIGNMENT): Define.
	* config/i386/i386-protos.h (x86_data_alignment): Adjust prototype.
	* config/i386/i386.c (x86_data_alignment): Add opt argument.  If
	opt is false, only return the psABI mandated alignment increase.
	* doc/tm.texi.in (DATA_ABI_ALIGNMENT): Document.
	* doc/tm.texi: Regenerated.

	* gcc.target/i386/pr56564-1.c: New test.
	* gcc.target/i386/pr56564-2.c: New test.
	* gcc.target/i386/pr56564-3.c: New test.
	* gcc.target/i386/pr56564-4.c: New test.
	* gcc.target/i386/avx256-unaligned-load-4.c: Add -fno-common.
	* gcc.target/i386/avx256-unaligned-store-1.c: Likewise.
	* gcc.target/i386/avx256-unaligned-store-3.c: Likewise.
	* gcc.target/i386/avx256-unaligned-store-4.c: Likewise.
	* gcc.target/i386/vect-sizes-1.c: Likewise.
	* gcc.target/i386/memcpy-1.c: Likewise.
	* gcc.dg/vect/costmodel/i386/costmodel-vect-31.c (tmp): Initialize.
	* gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c (tmp): Likewise.

--- gcc/varasm.c.jj	2013-06-07 13:17:17.000000000 +0200
+++ gcc/varasm.c	2013-06-08 16:53:40.717372488 +0200
@@ -966,13 +966,80 @@ align_variable (tree decl, bool dont_out
       align = MAX_OFILE_ALIGNMENT;
     }
 
-  /* On some machines, it is good to increase alignment sometimes.  */
   if (! DECL_USER_ALIGN (decl))
     {
+#ifdef DATA_ABI_ALIGNMENT
+      unsigned int data_abi_align
+	= DATA_ABI_ALIGNMENT (TREE_TYPE (decl), align);
+      /* For backwards compatibility, don't assume the ABI alignment for
+	 TLS variables.  */
+      if (! DECL_THREAD_LOCAL_P (decl) || data_abi_align <= BITS_PER_WORD)
+	align = data_abi_align;
+#endif
+
+      /* On some machines, it is good to increase alignment sometimes.
+	 But as DECL_ALIGN is used both for actually emitting the variable
+	 and for code accessing the variable as guaranteed alignment, we
+	 can only increase the alignment if it is a performance optimization
+	 if the references to it must bind to the current definition.  */
+      if (decl_binds_to_current_def_p (decl))
+	{
+#ifdef DATA_ALIGNMENT
+	  unsigned int data_align = DATA_ALIGNMENT (TREE_TYPE (decl), align);
+	  /* Don't increase alignment too much for TLS variables - TLS space
+	     is too precious.  */
+	  if (! DECL_THREAD_LOCAL_P (decl) || data_align <= BITS_PER_WORD)
+	    align = data_align;
+#endif
+#ifdef CONSTANT_ALIGNMENT
+	  if (DECL_INITIAL (decl) != 0
+	      && DECL_INITIAL (decl) != error_mark_node)
+	    {
+	      unsigned int const_align
+		= CONSTANT_ALIGNMENT (DECL_INITIAL (decl), align);
+	      /* Don't increase alignment too much for TLS variables - TLS
+		 space is too precious.  */
+	      if (! DECL_THREAD_LOCAL_P (decl) || const_align <= BITS_PER_WORD)
+		align = const_align;
+	    }
+#endif
+	}
+    }
+
+  /* Reset the alignment in case we have made it tighter, so we can benefit
+     from it in get_pointer_alignment.  */
+  DECL_ALIGN (decl) = align;
+}
+
+/* Return DECL_ALIGN (decl), possibly increased for optimization purposes
+   beyond what align_variable returned.  */
+
+static unsigned int
+get_variable_align (tree decl)
+{
+  unsigned int align = DECL_ALIGN (decl);
+
+  /* For user aligned vars or static vars align_variable already did
+     everything.  */
+  if (DECL_USER_ALIGN (decl) || !TREE_PUBLIC (decl))
+    return align;
+
+#ifdef DATA_ABI_ALIGNMENT
+  if (DECL_THREAD_LOCAL_P (decl))
+    align = DATA_ABI_ALIGNMENT (TREE_TYPE (decl), align);
+#endif
+
+  /* For decls that bind to the current definition, align_variable
+     did also everything, except for not assuming ABI required alignment
+     of TLS variables.  For other vars, increase the alignment here
+     as an optimization.  */
+  if (!decl_binds_to_current_def_p (decl))
+    {
+      /* On some machines, it is good to increase alignment sometimes.  */
 #ifdef DATA_ALIGNMENT
       unsigned int data_align = DATA_ALIGNMENT (TREE_TYPE (decl), align);
       /* Don't increase alignment too much for TLS variables - TLS space
-	 is too precious.  */
+         is too precious.  */
       if (! DECL_THREAD_LOCAL_P (decl) || data_align <= BITS_PER_WORD)
 	align = data_align;
 #endif
@@ -986,12 +1053,10 @@ align_variable (tree decl, bool dont_out
 	  if (! DECL_THREAD_LOCAL_P (decl) || const_align <= BITS_PER_WORD)
 	    align = const_align;
 	}
-#endif
     }
+#endif
 
-  /* Reset the alignment in case we have made it tighter, so we can benefit
-     from it in get_pointer_alignment.  */
-  DECL_ALIGN (decl) = align;
+  return align;
 }
 
 /* Return the section into which the given VAR_DECL or CONST_DECL
@@ -1043,7 +1108,8 @@ get_variable_section (tree decl, bool pr
 	return bss_noswitch_section;
     }
 
-  return targetm.asm_out.select_section (decl, reloc, DECL_ALIGN (decl));
+  return targetm.asm_out.select_section (decl, reloc,
+					 get_variable_align (decl));
 }
 
 /* Return the block into which object_block DECL should be placed.  */
@@ -1780,7 +1846,8 @@ emit_bss (tree decl ATTRIBUTE_UNUSED,
 	  unsigned HOST_WIDE_INT rounded ATTRIBUTE_UNUSED)
 {
 #if defined ASM_OUTPUT_ALIGNED_BSS
-  ASM_OUTPUT_ALIGNED_BSS (asm_out_file, decl, name, size, DECL_ALIGN (decl));
+  ASM_OUTPUT_ALIGNED_BSS (asm_out_file, decl, name, size,
+			  get_variable_align (decl));
   return true;
 #endif
 }
@@ -1796,10 +1863,11 @@ emit_common (tree decl ATTRIBUTE_UNUSED,
 {
 #if defined ASM_OUTPUT_ALIGNED_DECL_COMMON
   ASM_OUTPUT_ALIGNED_DECL_COMMON (asm_out_file, decl, name,
-				  size, DECL_ALIGN (decl));
+				  size, get_variable_align (decl));
   return true;
 #elif defined ASM_OUTPUT_ALIGNED_COMMON
-  ASM_OUTPUT_ALIGNED_COMMON (asm_out_file, name, size, DECL_ALIGN (decl));
+  ASM_OUTPUT_ALIGNED_COMMON (asm_out_file, name, size,
+			     get_variable_align (decl));
   return true;
 #else
   ASM_OUTPUT_COMMON (asm_out_file, name, size, rounded);
@@ -1828,7 +1896,8 @@ emit_tls_common (tree decl ATTRIBUTE_UNU
    NAME is the name of DECL's SYMBOL_REF.  */
 
 static void
-assemble_noswitch_variable (tree decl, const char *name, section *sect)
+assemble_noswitch_variable (tree decl, const char *name, section *sect,
+			    unsigned int align)
 {
   unsigned HOST_WIDE_INT size, rounded;
 
@@ -1850,7 +1919,7 @@ assemble_noswitch_variable (tree decl, c
 	     * (BIGGEST_ALIGNMENT / BITS_PER_UNIT));
 
   if (!sect->noswitch.callback (decl, name, size, rounded)
-      && (unsigned HOST_WIDE_INT) DECL_ALIGN_UNIT (decl) > rounded)
+      && (unsigned HOST_WIDE_INT) (align / BITS_PER_UNIT) > rounded)
     warning (0, "requested alignment for %q+D is greater than "
 	     "implemented alignment of %wu", decl, rounded);
 }
@@ -1880,7 +1949,7 @@ assemble_variable_contents (tree decl, c
 	/* Output the actual data.  */
 	output_constant (DECL_INITIAL (decl),
 			 tree_low_cst (DECL_SIZE_UNIT (decl), 1),
-			 DECL_ALIGN (decl));
+			 get_variable_align (decl));
       else
 	/* Leave space for it.  */
 	assemble_zeros (tree_low_cst (DECL_SIZE_UNIT (decl), 1));
@@ -1904,6 +1973,7 @@ assemble_variable (tree decl, int top_le
   const char *name;
   rtx decl_rtl, symbol;
   section *sect;
+  unsigned int align;
   bool asan_protected = false;
 
   /* This function is supposed to handle VARIABLES.  Ensure we have one.  */
@@ -2003,6 +2073,8 @@ assemble_variable (tree decl, int top_le
 
   set_mem_align (decl_rtl, DECL_ALIGN (decl));
 
+  align = get_variable_align (decl);
+
   if (TREE_PUBLIC (decl))
     maybe_assemble_visibility (decl);
 
@@ -2032,12 +2104,12 @@ assemble_variable (tree decl, int top_le
       place_block_symbol (symbol);
     }
   else if (SECTION_STYLE (sect) == SECTION_NOSWITCH)
-    assemble_noswitch_variable (decl, name, sect);
+    assemble_noswitch_variable (decl, name, sect, align);
   else
     {
       switch_to_section (sect);
-      if (DECL_ALIGN (decl) > BITS_PER_UNIT)
-	ASM_OUTPUT_ALIGN (asm_out_file, floor_log2 (DECL_ALIGN_UNIT (decl)));
+      if (align > BITS_PER_UNIT)
+	ASM_OUTPUT_ALIGN (asm_out_file, floor_log2 (align / BITS_PER_UNIT));
       assemble_variable_contents (decl, name, dont_output_data);
       if (asan_protected)
 	{
@@ -6967,7 +7039,7 @@ place_block_symbol (rtx symbol)
   else
     {
       decl = SYMBOL_REF_DECL (symbol);
-      alignment = DECL_ALIGN (decl);
+      alignment = get_variable_align (decl);
       size = tree_low_cst (DECL_SIZE_UNIT (decl), 1);
       if (flag_asan && asan_protect_global (decl))
 	{
--- gcc/config/i386/i386.h.jj	2013-06-03 19:15:34.000000000 +0200
+++ gcc/config/i386/i386.h	2013-06-07 14:48:36.430589051 +0200
@@ -859,7 +859,18 @@ enum target_cpu_default
    cause character arrays to be word-aligned so that `strcpy' calls
    that copy constants to character arrays can be done inline.  */
 
-#define DATA_ALIGNMENT(TYPE, ALIGN) ix86_data_alignment ((TYPE), (ALIGN))
+#define DATA_ALIGNMENT(TYPE, ALIGN) \
+  ix86_data_alignment ((TYPE), (ALIGN), true)
+
+/* Similar to DATA_ALIGNMENT, but for the cases where the ABI mandates
+   some alignment increase, instead of optimization only purposes.  E.g.
+   AMD x86-64 psABI says that variables with array type larger than 15 bytes
+   must be aligned to 16 byte boundaries.
+
+   If this macro is not defined, then ALIGN is used.  */
+
+#define DATA_ABI_ALIGNMENT(TYPE, ALIGN) \
+  ix86_data_alignment ((TYPE), (ALIGN), false)
 
 /* If defined, a C expression to compute the alignment for a local
    variable.  TYPE is the data type, and ALIGN is the alignment that
--- gcc/config/i386/i386-protos.h.jj	2013-05-14 21:30:19.000000000 +0200
+++ gcc/config/i386/i386-protos.h	2013-06-07 13:31:21.937823575 +0200
@@ -207,7 +207,7 @@ extern void init_cumulative_args (CUMULA
 #endif	/* RTX_CODE  */
 
 #ifdef TREE_CODE
-extern int ix86_data_alignment (tree, int);
+extern int ix86_data_alignment (tree, int, bool);
 extern unsigned int ix86_local_alignment (tree, enum machine_mode,
 					  unsigned int);
 extern unsigned int ix86_minimum_alignment (tree, enum machine_mode,
--- gcc/config/i386/i386.c.jj	2013-06-07 13:17:17.000000000 +0200
+++ gcc/config/i386/i386.c	2013-06-07 13:37:24.845416361 +0200
@@ -25375,11 +25375,12 @@ ix86_constant_alignment (tree exp, int a
    instead of that alignment to align the object.  */
 
 int
-ix86_data_alignment (tree type, int align)
+ix86_data_alignment (tree type, int align, bool opt)
 {
   int max_align = optimize_size ? BITS_PER_WORD : MIN (256, MAX_OFILE_ALIGNMENT);
 
-  if (AGGREGATE_TYPE_P (type)
+  if (opt
+      && AGGREGATE_TYPE_P (type)
       && TYPE_SIZE (type)
       && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST
       && (TREE_INT_CST_LOW (TYPE_SIZE (type)) >= (unsigned) max_align
@@ -25391,14 +25392,17 @@ ix86_data_alignment (tree type, int alig
      to 16byte boundary.  */
   if (TARGET_64BIT)
     {
-      if (AGGREGATE_TYPE_P (type)
-	   && TYPE_SIZE (type)
-	   && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST
-	   && (TREE_INT_CST_LOW (TYPE_SIZE (type)) >= 128
-	       || TREE_INT_CST_HIGH (TYPE_SIZE (type))) && align < 128)
+      if ((opt ? AGGREGATE_TYPE_P (type) : TREE_CODE (type) == ARRAY_TYPE)
+	  && TYPE_SIZE (type)
+	  && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST
+	  && (TREE_INT_CST_LOW (TYPE_SIZE (type)) >= 128
+	      || TREE_INT_CST_HIGH (TYPE_SIZE (type))) && align < 128)
 	return 128;
     }
 
+  if (!opt)
+    return align;
+
   if (TREE_CODE (type) == ARRAY_TYPE)
     {
       if (TYPE_MODE (TREE_TYPE (type)) == DFmode && align < 64)
--- gcc/doc/tm.texi.in.jj	2013-06-01 14:47:17.000000000 +0200
+++ gcc/doc/tm.texi.in	2013-06-07 14:47:10.192003375 +0200
@@ -1062,6 +1062,15 @@ arrays to be word-aligned so that @code{
 constants to character arrays can be done inline.
 @end defmac
 
+@defmac DATA_ABI_ALIGNMENT (@var{type}, @var{basic-align})
+Similar to @code{DATA_ALIGNMENT}, but for the cases where the ABI mandates
+some alignment increase, instead of optimization only purposes.  E.g.@
+AMD x86-64 psABI says that variables with array type larger than 15 bytes
+must be aligned to 16 byte boundaries.
+
+If this macro is not defined, then @var{basic-align} is used.
+@end defmac
+
 @defmac CONSTANT_ALIGNMENT (@var{constant}, @var{basic-align})
 If defined, a C expression to compute the alignment given to a constant
 that is being placed in memory.  @var{constant} is the constant and
--- gcc/doc/tm.texi.jj	2013-06-01 14:47:17.241680273 +0200
+++ gcc/doc/tm.texi	2013-06-07 14:47:29.400694547 +0200
@@ -1078,6 +1078,15 @@ arrays to be word-aligned so that @code{
 constants to character arrays can be done inline.
 @end defmac
 
+@defmac DATA_ABI_ALIGNMENT (@var{type}, @var{basic-align})
+Similar to @code{DATA_ALIGNMENT}, but for the cases where the ABI mandates
+some alignment increase, instead of optimization only purposes.  E.g.@
+AMD x86-64 psABI says that variables with array type larger than 15 bytes
+must be aligned to 16 byte boundaries.
+
+If this macro is not defined, then @var{basic-align} is used.
+@end defmac
+
 @defmac CONSTANT_ALIGNMENT (@var{constant}, @var{basic-align})
 If defined, a C expression to compute the alignment given to a constant
 that is being placed in memory.  @var{constant} is the constant and
--- gcc/testsuite/gcc.target/i386/pr56564-1.c.jj	2013-06-07 15:17:15.879403383 +0200
+++ gcc/testsuite/gcc.target/i386/pr56564-1.c	2013-06-07 15:44:18.386232149 +0200
@@ -0,0 +1,25 @@
+/* PR target/56564 */
+/* { dg-do compile { target { fpic && lp64 } } } */
+/* { dg-options "-O3 -fpic -fdump-tree-optimized" } */
+
+struct S { long a, b; } s = { 5, 6 };
+char t[16] = { 7 };
+
+int
+foo (void)
+{
+  return ((__UINTPTR_TYPE__) &s) & 15;
+}
+
+int
+bar (void)
+{
+  return ((__UINTPTR_TYPE__) &t[0]) & 15;
+}
+
+/* { dg-final { scan-tree-dump-times "&s" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "&t" 0 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "return 0" 1 "optimized" } } */
+/* { dg-final { scan-assembler ".align\[ \t]*16\[^:]*\[\n\r]s:" { target { *-*-linux* } } } } */
+/* { dg-final { scan-assembler ".align\[ \t]*16\[^:]*\[\n\r]t:" { target { *-*-linux* } } } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
--- gcc/testsuite/gcc.target/i386/pr56564-2.c.jj	2013-06-07 15:19:28.900986237 +0200
+++ gcc/testsuite/gcc.target/i386/pr56564-2.c	2013-06-07 15:44:25.120129885 +0200
@@ -0,0 +1,25 @@
+/* PR target/56564 */
+/* { dg-do compile { target { *-*-linux* && lp64 } } } */
+/* { dg-options "-O3 -fno-pic -fdump-tree-optimized" } */
+
+struct S { long a, b; } s = { 5, 6 };
+char t[16] = { 7 };
+
+int
+foo (void)
+{
+  return ((__UINTPTR_TYPE__) &s) & 15;
+}
+
+int
+bar (void)
+{
+  return ((__UINTPTR_TYPE__) &t[0]) & 15;
+}
+
+/* { dg-final { scan-tree-dump-times "&s" 0 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "&t" 0 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "return 0" 2 "optimized" } } */
+/* { dg-final { scan-assembler ".align\[ \t]*16\[^:]*\[\n\r]s:" { target { *-*-linux* } } } } */
+/* { dg-final { scan-assembler ".align\[ \t]*16\[^:]*\[\n\r]t:" { target { *-*-linux* } } } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
--- gcc/testsuite/gcc.target/i386/pr56564-3.c.jj	2013-06-07 15:22:26.470079983 +0200
+++ gcc/testsuite/gcc.target/i386/pr56564-3.c	2013-06-07 15:44:34.872968234 +0200
@@ -0,0 +1,28 @@
+/* PR target/56564 */
+/* { dg-do compile { target { fpic && lp64 } } } */
+/* { dg-options "-O3 -fpic -fdump-tree-optimized" } */
+
+__thread struct S { long a, b; } s = { 5, 6 };
+__thread char t[16] = { 7 };
+
+int
+foo (void)
+{
+  return ((__UINTPTR_TYPE__) &s) & 15;
+}
+
+/* For backwards compatibility we don't assume that t must
+   be aligned to 16 bytes, but align it anyway.  */
+
+int
+bar (void)
+{
+  return ((__UINTPTR_TYPE__) &t[0]) & 15;
+}
+
+/* { dg-final { scan-tree-dump-times "&s" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "&t" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "return 0" 0 "optimized" } } */
+/* { dg-final { scan-assembler-not ".align\[ \t]*16\[^:]*\[\n\r]s:" { target { *-*-linux* } } } } */
+/* { dg-final { scan-assembler ".align\[ \t]*16\[^:]*\[\n\r]t:" { target { *-*-linux* } } } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
--- gcc/testsuite/gcc.target/i386/pr56564-4.c.jj	2013-06-07 15:25:12.638326084 +0200
+++ gcc/testsuite/gcc.target/i386/pr56564-4.c	2013-06-07 15:45:59.489567940 +0200
@@ -0,0 +1,22 @@
+/* PR target/56564 */
+/* { dg-do compile { target { *-*-linux* && lp64 } } } */
+/* { dg-options "-O3 -fno-pic -fdump-tree-optimized" } */
+
+__thread struct S { long a, b; } s = { 5, 6 };
+__thread char t[16] = { 7 };
+
+int
+foo (void)
+{
+  return ((__UINTPTR_TYPE__) &s) & 15;
+}
+
+int
+bar (void)
+{
+  return ((__UINTPTR_TYPE__) &t[0]) & 15;
+}
+
+/* { dg-final { scan-assembler-not ".align\[ \t]*16\[^:]*\[\n\r]s:" { target { *-*-linux* } } } } */
+/* { dg-final { scan-assembler ".align\[ \t]*16\[^:]*\[\n\r]t:" { target { *-*-linux* } } } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
--- gcc/testsuite/gcc.target/i386/avx256-unaligned-load-4.c.jj	2012-10-16 13:15:44.000000000 +0200
+++ gcc/testsuite/gcc.target/i386/avx256-unaligned-load-4.c	2013-06-07 21:07:22.341380267 +0200
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -dp -mavx -mno-avx256-split-unaligned-load -mno-avx256-split-unaligned-store" } */
+/* { dg-options "-O3 -dp -mavx -mno-avx256-split-unaligned-load -mno-avx256-split-unaligned-store -fno-common" } */
 
 #define N 1024
 
--- gcc/testsuite/gcc.target/i386/avx256-unaligned-store-1.c.jj	2012-10-16 13:15:44.000000000 +0200
+++ gcc/testsuite/gcc.target/i386/avx256-unaligned-store-1.c	2013-06-07 21:06:06.911946765 +0200
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -dp -mavx -mavx256-split-unaligned-store" } */
+/* { dg-options "-O3 -dp -mavx -mavx256-split-unaligned-store -fno-common" } */
 
 #define N 1024
 
--- gcc/testsuite/gcc.target/i386/avx256-unaligned-store-3.c.jj	2012-10-16 13:15:44.000000000 +0200
+++ gcc/testsuite/gcc.target/i386/avx256-unaligned-store-3.c	2013-06-07 21:06:41.421248309 +0200
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -dp -mavx -mavx256-split-unaligned-store -mtune=generic" } */
+/* { dg-options "-O3 -dp -mavx -mavx256-split-unaligned-store -mtune=generic -fno-common" } */
 
 #define N 1024
 
--- gcc/testsuite/gcc.target/i386/avx256-unaligned-store-4.c.jj	2012-10-16 13:15:44.000000000 +0200
+++ gcc/testsuite/gcc.target/i386/avx256-unaligned-store-4.c	2013-06-07 21:06:53.516302188 +0200
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -dp -mavx -mno-avx256-split-unaligned-load -mno-avx256-split-unaligned-store" } */
+/* { dg-options "-O3 -dp -mavx -mno-avx256-split-unaligned-load -mno-avx256-split-unaligned-store -fno-common" } */
 
 #define N 1024
 
--- gcc/testsuite/gcc.target/i386/vect-sizes-1.c.jj	2010-11-01 09:06:30.000000000 +0100
+++ gcc/testsuite/gcc.target/i386/vect-sizes-1.c	2013-06-07 21:08:07.851581595 +0200
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -ffast-math -mavx -mtune=generic" } */
+/* { dg-options "-O3 -ffast-math -mavx -mtune=generic -fno-common" } */
 
 double a[1024];
 
--- gcc/testsuite/gcc.target/i386/memcpy-1.c.jj	2011-07-11 10:39:29.000000000 +0200
+++ gcc/testsuite/gcc.target/i386/memcpy-1.c	2013-06-07 21:08:46.263945653 +0200
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target ia32 } */
-/* { dg-options "-O2 -march=pentiumpro -minline-all-stringops" } */
+/* { dg-options "-O2 -march=pentiumpro -minline-all-stringops -fno-common" } */
 /* { dg-final { scan-assembler "rep" } } */
 /* { dg-final { scan-assembler "movs" } } */
 /* { dg-final { scan-assembler-not "test" } } */
--- gcc/testsuite/gcc.dg/vect/costmodel/i386/costmodel-vect-31.c.jj	2009-11-04 08:15:26.000000000 +0100
+++ gcc/testsuite/gcc.dg/vect/costmodel/i386/costmodel-vect-31.c	2013-06-07 20:58:02.091268267 +0200
@@ -18,7 +18,7 @@ struct s{
   struct t e;   /* unaligned (offset 2N+4N+4 B) */
 };
  
-struct s tmp;
+struct s tmp = { 1 };
 
 int main1 ()
 {  
--- gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c.jj	2009-11-04 08:15:26.000000000 +0100
+++ gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c	2013-06-07 20:58:38.201668248 +0200
@@ -18,7 +18,7 @@ struct s{
   struct t e;   /* unaligned (offset 2N+4N+4 B) */
 };
  
-struct s tmp;
+struct s tmp = { 1 };
 
 int main1 ()
 {  


	Jakub

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-07 20:43 ` Richard Henderson
  2013-06-07 21:14   ` Jakub Jelinek
  2013-06-07 22:56   ` Hans-Peter Nilsson
@ 2013-06-10 10:51   ` Bernd Schmidt
  2013-06-10 10:56     ` Jakub Jelinek
  2013-06-10 11:52   ` Ulrich Weigand
  2013-06-12 17:52   ` Edmar Wienskoski
  4 siblings, 1 reply; 35+ messages in thread
From: Bernd Schmidt @ 2013-06-10 10:51 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Jakub Jelinek, Richard Biener, Jan Hubicka, gcc-patches, hp, hp,
	uweigand, Andreas.Krebbel, dje.gcc

On 06/07/2013 10:43 PM, Richard Henderson wrote:
> But these I think require a good hard look to see if they really intended an
> ABI alignment:
> 
> c6x	comment explicitly mentions abi

The ABI specifies a minimum alignment for arrays.


Bernd

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-10 10:51   ` Bernd Schmidt
@ 2013-06-10 10:56     ` Jakub Jelinek
  2013-06-10 11:03       ` Bernd Schmidt
  0 siblings, 1 reply; 35+ messages in thread
From: Jakub Jelinek @ 2013-06-10 10:56 UTC (permalink / raw)
  To: Bernd Schmidt
  Cc: Richard Henderson, Richard Biener, Jan Hubicka, gcc-patches, hp,
	hp, uweigand, Andreas.Krebbel, dje.gcc

On Mon, Jun 10, 2013 at 12:51:05PM +0200, Bernd Schmidt wrote:
> On 06/07/2013 10:43 PM, Richard Henderson wrote:
> > But these I think require a good hard look to see if they really intended an
> > ABI alignment:
> > 
> > c6x	comment explicitly mentions abi
> 
> The ABI specifies a minimum alignment for arrays.

Thus after the patch c6x.h (DATA_ALIGNMENT) should be renamed to
DATA_ABI_ALIGNMENT, right?

	Jakub

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-10 10:56     ` Jakub Jelinek
@ 2013-06-10 11:03       ` Bernd Schmidt
  0 siblings, 0 replies; 35+ messages in thread
From: Bernd Schmidt @ 2013-06-10 11:03 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Richard Henderson, Richard Biener, Jan Hubicka, gcc-patches, hp,
	hp, uweigand, Andreas.Krebbel, dje.gcc

On 06/10/2013 12:55 PM, Jakub Jelinek wrote:
> On Mon, Jun 10, 2013 at 12:51:05PM +0200, Bernd Schmidt wrote:
>> On 06/07/2013 10:43 PM, Richard Henderson wrote:
>>> But these I think require a good hard look to see if they really intended an
>>> ABI alignment:
>>>
>>> c6x	comment explicitly mentions abi
>>
>> The ABI specifies a minimum alignment for arrays.
> 
> Thus after the patch c6x.h (DATA_ALIGNMENT) should be renamed to
> DATA_ABI_ALIGNMENT, right?

I think so.


Bernd


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-07 20:43 ` Richard Henderson
                     ` (2 preceding siblings ...)
  2013-06-10 10:51   ` Bernd Schmidt
@ 2013-06-10 11:52   ` Ulrich Weigand
  2013-06-12 17:52   ` Edmar Wienskoski
  4 siblings, 0 replies; 35+ messages in thread
From: Ulrich Weigand @ 2013-06-10 11:52 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Jakub Jelinek, Richard Biener, Jan Hubicka, gcc-patches, bernds,
	hp, hp, Andreas.Krebbel, dje.gcc

Richard Henderson wrote:

> s390	comment mentions LARL instruction

On s390(x) it is indeed an ABI requirement that all global symbols
are at least 2-aligned.  (Note that we skip that alignment requirement
if a symbol is marked as attribute((aligned(1)), but that attribute
must then be present for every use, too.)

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-07 21:14   ` Jakub Jelinek
  2013-06-08 15:13     ` Jakub Jelinek
@ 2013-06-10 14:52     ` Richard Henderson
  2013-06-10 15:45       ` Jakub Jelinek
  1 sibling, 1 reply; 35+ messages in thread
From: Richard Henderson @ 2013-06-10 14:52 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Richard Biener, Jan Hubicka, gcc-patches, bernds, hp, hp,
	uweigand, Andreas.Krebbel, dje.gcc

On 06/07/2013 02:14 PM, Jakub Jelinek wrote:
>> > When the linker merges common blocks, it chooses both maximum size and maximum
>> > alignment.  Thus for any common block for which we can prove the block must
>> > reside in the module (any executable, or hidden common in shared object), we
>> > can go ahead and use the increased alignment.
> But consider say:
> one TU:
> struct S { char buf[15]; } s __attribute__((aligned (32)));
> 
> another TU:
> char c = 7;
> struct S { char buf[15]; } s = { { 1, 2 } };
> char d = 8;
> int main () { return 0; }
> 
> (the aligned(32) is there just to simulate the DATA_ALIGNMENT optimization
> increase).  Linker warns about this (thus the question is if we want to
> increase the alignment for optimization on commons at all) and doesn't align
> it.
> 

Oh, right.  I hadn't considered commons unifying with non-common variables,
and the failure of commoning in that case.  I'd mostly been thinking of
uninitialized Fortran-like common blocks, where it is more common for the
blocks to have nothing in common but the name.



r~

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-10 14:52     ` Richard Henderson
@ 2013-06-10 15:45       ` Jakub Jelinek
  2013-06-10 19:44         ` David Edelsohn
  2013-06-11  0:44         ` DJ Delorie
  0 siblings, 2 replies; 35+ messages in thread
From: Jakub Jelinek @ 2013-06-10 15:45 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Richard Biener, Jan Hubicka, gcc-patches, bernds, hp, hp,
	uweigand, Andreas.Krebbel, dje.gcc

On Mon, Jun 10, 2013 at 07:51:54AM -0700, Richard Henderson wrote:
> On 06/07/2013 02:14 PM, Jakub Jelinek wrote:
> >> > When the linker merges common blocks, it chooses both maximum size and maximum
> >> > alignment.  Thus for any common block for which we can prove the block must
> >> > reside in the module (any executable, or hidden common in shared object), we
> >> > can go ahead and use the increased alignment.
> > But consider say:
> > one TU:
> > struct S { char buf[15]; } s __attribute__((aligned (32)));
> > 
> > another TU:
> > char c = 7;
> > struct S { char buf[15]; } s = { { 1, 2 } };
> > char d = 8;
> > int main () { return 0; }
> > 
> > (the aligned(32) is there just to simulate the DATA_ALIGNMENT optimization
> > increase).  Linker warns about this (thus the question is if we want to
> > increase the alignment for optimization on commons at all) and doesn't align
> > it.
> > 
> 
> Oh, right.  I hadn't considered commons unifying with non-common variables,
> and the failure of commoning in that case.  I'd mostly been thinking of
> uninitialized Fortran-like common blocks, where it is more common for the
> blocks to have nothing in common but the name.

Ok, here is what I've committed to trunk (will wait for some time before
backporting).  As discussed with Honza on IRC, decl_binds_to_current_def_p
will need further fixing, it does the wrong thing for extern int var
__attribute__((visibility ("hidden"))) or hidden DECL_COMMON symbols.

And, we don't have any feedback about SPE/E500 rs6000 - DATA_ALIGNMENT vs.
DATA_ABI_ALIGNMENT yet.

2013-06-10  Jakub Jelinek  <jakub@redhat.com>

	PR target/56564
	* varasm.c (align_variable): Don't use DATA_ALIGNMENT or
	CONSTANT_ALIGNMENT if !decl_binds_to_current_def_p (decl).
	Use DATA_ABI_ALIGNMENT for that case instead if defined.
	(get_variable_align): New function.
	(get_variable_section, emit_bss, emit_common,
	assemble_variable_contents, place_block_symbol): Use
	get_variable_align instead of DECL_ALIGN.
	(assemble_noswitch_variable): Add align argument, use it
	instead of DECL_ALIGN.
	(assemble_variable): Adjust caller.  Use get_variable_align
	instead of DECL_ALIGN.
	* config/i386/i386.h (DATA_ALIGNMENT): Adjust x86_data_alignment
	caller.
	(DATA_ABI_ALIGNMENT): Define.
	* config/i386/i386-protos.h (x86_data_alignment): Adjust prototype.
	* config/i386/i386.c (x86_data_alignment): Add opt argument.  If
	opt is false, only return the psABI mandated alignment increase.
	* config/c6x/c6x.h (DATA_ALIGNMENT): Renamed to...
	(DATA_ABI_ALIGNMENT): ... this.
	* config/mmix/mmix.h (DATA_ALIGNMENT): Renamed to...
	(DATA_ABI_ALIGNMENT): ... this.
	* config/mmix/mmix.c (mmix_data_alignment): Adjust function comment.
	* config/s390/s390.h (DATA_ALIGNMENT): Renamed to...
	(DATA_ABI_ALIGNMENT): ... this.
	* doc/tm.texi.in (DATA_ABI_ALIGNMENT): Document.
	* doc/tm.texi: Regenerated.

	* gcc.target/i386/pr56564-1.c: New test.
	* gcc.target/i386/pr56564-2.c: New test.
	* gcc.target/i386/pr56564-3.c: New test.
	* gcc.target/i386/pr56564-4.c: New test.
	* gcc.target/i386/avx256-unaligned-load-4.c: Add -fno-common.
	* gcc.target/i386/avx256-unaligned-store-1.c: Likewise.
	* gcc.target/i386/avx256-unaligned-store-3.c: Likewise.
	* gcc.target/i386/avx256-unaligned-store-4.c: Likewise.
	* gcc.target/i386/vect-sizes-1.c: Likewise.
	* gcc.target/i386/memcpy-1.c: Likewise.
	* gcc.dg/vect/costmodel/i386/costmodel-vect-31.c (tmp): Initialize.
	* gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c (tmp): Likewise.

--- gcc/config/c6x/c6x.h.jj	2013-02-26 16:39:34.000000000 +0100
+++ gcc/config/c6x/c6x.h	2013-06-10 17:36:44.850082918 +0200
@@ -134,7 +134,7 @@ extern c6x_cpu_t c6x_arch;
    Really only externally visible arrays must be aligned this way, as
    only those are directly visible from another compilation unit.  But
    we don't have that information available here.  */
-#define DATA_ALIGNMENT(TYPE, ALIGN)					\
+#define DATA_ABI_ALIGNMENT(TYPE, ALIGN)					\
   (((ALIGN) < BITS_PER_UNIT * 8 && TREE_CODE (TYPE) == ARRAY_TYPE)	\
    ? BITS_PER_UNIT * 8 : (ALIGN))
 
--- gcc/config/mmix/mmix.h.jj	2013-01-11 09:03:16.000000000 +0100
+++ gcc/config/mmix/mmix.h	2013-06-10 17:36:05.585730695 +0200
@@ -164,7 +164,7 @@ struct GTY(()) machine_function
 /* Copied from elfos.h.  */
 #define MAX_OFILE_ALIGNMENT (32768 * 8)
 
-#define DATA_ALIGNMENT(TYPE, BASIC_ALIGN) \
+#define DATA_ABI_ALIGNMENT(TYPE, BASIC_ALIGN) \
  mmix_data_alignment (TYPE, BASIC_ALIGN)
 
 #define CONSTANT_ALIGNMENT(CONSTANT, BASIC_ALIGN) \
--- gcc/config/mmix/mmix.c.jj	2013-03-26 10:03:58.000000000 +0100
+++ gcc/config/mmix/mmix.c	2013-06-10 17:36:28.012360493 +0200
@@ -313,7 +313,7 @@ mmix_init_machine_status (void)
   return ggc_alloc_cleared_machine_function ();
 }
 
-/* DATA_ALIGNMENT.
+/* DATA_ABI_ALIGNMENT.
    We have trouble getting the address of stuff that is located at other
    than 32-bit alignments (GETA requirements), so try to give everything
    at least 32-bit alignment.  */
--- gcc/config/s390/s390.h.jj	2013-03-27 12:52:21.000000000 +0100
+++ gcc/config/s390/s390.h	2013-06-10 17:37:32.506293505 +0200
@@ -221,7 +221,7 @@ enum processor_flags
 
 /* Alignment on even addresses for LARL instruction.  */
 #define CONSTANT_ALIGNMENT(EXP, ALIGN) (ALIGN) < 16 ? 16 : (ALIGN)
-#define DATA_ALIGNMENT(TYPE, ALIGN) (ALIGN) < 16 ? 16 : (ALIGN)
+#define DATA_ABI_ALIGNMENT(TYPE, ALIGN) (ALIGN) < 16 ? 16 : (ALIGN)
 
 /* Alignment is not required by the hardware.  */
 #define STRICT_ALIGNMENT 0
--- gcc/config/i386/i386.h.jj	2013-06-09 20:27:03.727336138 +0200
+++ gcc/config/i386/i386.h	2013-06-10 17:35:32.922269839 +0200
@@ -859,7 +859,18 @@ enum target_cpu_default
    cause character arrays to be word-aligned so that `strcpy' calls
    that copy constants to character arrays can be done inline.  */
 
-#define DATA_ALIGNMENT(TYPE, ALIGN) ix86_data_alignment ((TYPE), (ALIGN))
+#define DATA_ALIGNMENT(TYPE, ALIGN) \
+  ix86_data_alignment ((TYPE), (ALIGN), true)
+
+/* Similar to DATA_ALIGNMENT, but for the cases where the ABI mandates
+   some alignment increase, instead of optimization only purposes.  E.g.
+   AMD x86-64 psABI says that variables with array type larger than 15 bytes
+   must be aligned to 16 byte boundaries.
+
+   If this macro is not defined, then ALIGN is used.  */
+
+#define DATA_ABI_ALIGNMENT(TYPE, ALIGN) \
+  ix86_data_alignment ((TYPE), (ALIGN), false)
 
 /* If defined, a C expression to compute the alignment for a local
    variable.  TYPE is the data type, and ALIGN is the alignment that
--- gcc/config/i386/i386-protos.h.jj	2013-06-09 20:27:03.899333325 +0200
+++ gcc/config/i386/i386-protos.h	2013-06-10 17:35:32.926269775 +0200
@@ -207,7 +207,7 @@ extern void init_cumulative_args (CUMULA
 #endif	/* RTX_CODE  */
 
 #ifdef TREE_CODE
-extern int ix86_data_alignment (tree, int);
+extern int ix86_data_alignment (tree, int, bool);
 extern unsigned int ix86_local_alignment (tree, enum machine_mode,
 					  unsigned int);
 extern unsigned int ix86_minimum_alignment (tree, enum machine_mode,
--- gcc/config/i386/i386.c.jj	2013-06-09 20:27:03.674336994 +0200
+++ gcc/config/i386/i386.c	2013-06-10 17:35:32.934269642 +0200
@@ -25375,11 +25375,12 @@ ix86_constant_alignment (tree exp, int a
    instead of that alignment to align the object.  */
 
 int
-ix86_data_alignment (tree type, int align)
+ix86_data_alignment (tree type, int align, bool opt)
 {
   int max_align = optimize_size ? BITS_PER_WORD : MIN (256, MAX_OFILE_ALIGNMENT);
 
-  if (AGGREGATE_TYPE_P (type)
+  if (opt
+      && AGGREGATE_TYPE_P (type)
       && TYPE_SIZE (type)
       && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST
       && (TREE_INT_CST_LOW (TYPE_SIZE (type)) >= (unsigned) max_align
@@ -25391,14 +25392,17 @@ ix86_data_alignment (tree type, int alig
      to 16byte boundary.  */
   if (TARGET_64BIT)
     {
-      if (AGGREGATE_TYPE_P (type)
-	   && TYPE_SIZE (type)
-	   && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST
-	   && (TREE_INT_CST_LOW (TYPE_SIZE (type)) >= 128
-	       || TREE_INT_CST_HIGH (TYPE_SIZE (type))) && align < 128)
+      if ((opt ? AGGREGATE_TYPE_P (type) : TREE_CODE (type) == ARRAY_TYPE)
+	  && TYPE_SIZE (type)
+	  && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST
+	  && (TREE_INT_CST_LOW (TYPE_SIZE (type)) >= 128
+	      || TREE_INT_CST_HIGH (TYPE_SIZE (type))) && align < 128)
 	return 128;
     }
 
+  if (!opt)
+    return align;
+
   if (TREE_CODE (type) == ARRAY_TYPE)
     {
       if (TYPE_MODE (TREE_TYPE (type)) == DFmode && align < 64)
--- gcc/varasm.c.jj	2013-06-09 20:27:57.214463944 +0200
+++ gcc/varasm.c	2013-06-10 17:35:32.921269856 +0200
@@ -966,13 +966,80 @@ align_variable (tree decl, bool dont_out
       align = MAX_OFILE_ALIGNMENT;
     }
 
-  /* On some machines, it is good to increase alignment sometimes.  */
   if (! DECL_USER_ALIGN (decl))
     {
+#ifdef DATA_ABI_ALIGNMENT
+      unsigned int data_abi_align
+	= DATA_ABI_ALIGNMENT (TREE_TYPE (decl), align);
+      /* For backwards compatibility, don't assume the ABI alignment for
+	 TLS variables.  */
+      if (! DECL_THREAD_LOCAL_P (decl) || data_abi_align <= BITS_PER_WORD)
+	align = data_abi_align;
+#endif
+
+      /* On some machines, it is good to increase alignment sometimes.
+	 But as DECL_ALIGN is used both for actually emitting the variable
+	 and for code accessing the variable as guaranteed alignment, we
+	 can only increase the alignment if it is a performance optimization
+	 if the references to it must bind to the current definition.  */
+      if (decl_binds_to_current_def_p (decl))
+	{
+#ifdef DATA_ALIGNMENT
+	  unsigned int data_align = DATA_ALIGNMENT (TREE_TYPE (decl), align);
+	  /* Don't increase alignment too much for TLS variables - TLS space
+	     is too precious.  */
+	  if (! DECL_THREAD_LOCAL_P (decl) || data_align <= BITS_PER_WORD)
+	    align = data_align;
+#endif
+#ifdef CONSTANT_ALIGNMENT
+	  if (DECL_INITIAL (decl) != 0
+	      && DECL_INITIAL (decl) != error_mark_node)
+	    {
+	      unsigned int const_align
+		= CONSTANT_ALIGNMENT (DECL_INITIAL (decl), align);
+	      /* Don't increase alignment too much for TLS variables - TLS
+		 space is too precious.  */
+	      if (! DECL_THREAD_LOCAL_P (decl) || const_align <= BITS_PER_WORD)
+		align = const_align;
+	    }
+#endif
+	}
+    }
+
+  /* Reset the alignment in case we have made it tighter, so we can benefit
+     from it in get_pointer_alignment.  */
+  DECL_ALIGN (decl) = align;
+}
+
+/* Return DECL_ALIGN (decl), possibly increased for optimization purposes
+   beyond what align_variable returned.  */
+
+static unsigned int
+get_variable_align (tree decl)
+{
+  unsigned int align = DECL_ALIGN (decl);
+
+  /* For user aligned vars or static vars align_variable already did
+     everything.  */
+  if (DECL_USER_ALIGN (decl) || !TREE_PUBLIC (decl))
+    return align;
+
+#ifdef DATA_ABI_ALIGNMENT
+  if (DECL_THREAD_LOCAL_P (decl))
+    align = DATA_ABI_ALIGNMENT (TREE_TYPE (decl), align);
+#endif
+
+  /* For decls that bind to the current definition, align_variable
+     did also everything, except for not assuming ABI required alignment
+     of TLS variables.  For other vars, increase the alignment here
+     as an optimization.  */
+  if (!decl_binds_to_current_def_p (decl))
+    {
+      /* On some machines, it is good to increase alignment sometimes.  */
 #ifdef DATA_ALIGNMENT
       unsigned int data_align = DATA_ALIGNMENT (TREE_TYPE (decl), align);
       /* Don't increase alignment too much for TLS variables - TLS space
-	 is too precious.  */
+         is too precious.  */
       if (! DECL_THREAD_LOCAL_P (decl) || data_align <= BITS_PER_WORD)
 	align = data_align;
 #endif
@@ -986,12 +1053,10 @@ align_variable (tree decl, bool dont_out
 	  if (! DECL_THREAD_LOCAL_P (decl) || const_align <= BITS_PER_WORD)
 	    align = const_align;
 	}
-#endif
     }
+#endif
 
-  /* Reset the alignment in case we have made it tighter, so we can benefit
-     from it in get_pointer_alignment.  */
-  DECL_ALIGN (decl) = align;
+  return align;
 }
 
 /* Return the section into which the given VAR_DECL or CONST_DECL
@@ -1043,7 +1108,8 @@ get_variable_section (tree decl, bool pr
 	return bss_noswitch_section;
     }
 
-  return targetm.asm_out.select_section (decl, reloc, DECL_ALIGN (decl));
+  return targetm.asm_out.select_section (decl, reloc,
+					 get_variable_align (decl));
 }
 
 /* Return the block into which object_block DECL should be placed.  */
@@ -1780,7 +1846,8 @@ emit_bss (tree decl ATTRIBUTE_UNUSED,
 	  unsigned HOST_WIDE_INT rounded ATTRIBUTE_UNUSED)
 {
 #if defined ASM_OUTPUT_ALIGNED_BSS
-  ASM_OUTPUT_ALIGNED_BSS (asm_out_file, decl, name, size, DECL_ALIGN (decl));
+  ASM_OUTPUT_ALIGNED_BSS (asm_out_file, decl, name, size,
+			  get_variable_align (decl));
   return true;
 #endif
 }
@@ -1796,10 +1863,11 @@ emit_common (tree decl ATTRIBUTE_UNUSED,
 {
 #if defined ASM_OUTPUT_ALIGNED_DECL_COMMON
   ASM_OUTPUT_ALIGNED_DECL_COMMON (asm_out_file, decl, name,
-				  size, DECL_ALIGN (decl));
+				  size, get_variable_align (decl));
   return true;
 #elif defined ASM_OUTPUT_ALIGNED_COMMON
-  ASM_OUTPUT_ALIGNED_COMMON (asm_out_file, name, size, DECL_ALIGN (decl));
+  ASM_OUTPUT_ALIGNED_COMMON (asm_out_file, name, size,
+			     get_variable_align (decl));
   return true;
 #else
   ASM_OUTPUT_COMMON (asm_out_file, name, size, rounded);
@@ -1828,7 +1896,8 @@ emit_tls_common (tree decl ATTRIBUTE_UNU
    NAME is the name of DECL's SYMBOL_REF.  */
 
 static void
-assemble_noswitch_variable (tree decl, const char *name, section *sect)
+assemble_noswitch_variable (tree decl, const char *name, section *sect,
+			    unsigned int align)
 {
   unsigned HOST_WIDE_INT size, rounded;
 
@@ -1850,7 +1919,7 @@ assemble_noswitch_variable (tree decl, c
 	     * (BIGGEST_ALIGNMENT / BITS_PER_UNIT));
 
   if (!sect->noswitch.callback (decl, name, size, rounded)
-      && (unsigned HOST_WIDE_INT) DECL_ALIGN_UNIT (decl) > rounded)
+      && (unsigned HOST_WIDE_INT) (align / BITS_PER_UNIT) > rounded)
     warning (0, "requested alignment for %q+D is greater than "
 	     "implemented alignment of %wu", decl, rounded);
 }
@@ -1880,7 +1949,7 @@ assemble_variable_contents (tree decl, c
 	/* Output the actual data.  */
 	output_constant (DECL_INITIAL (decl),
 			 tree_low_cst (DECL_SIZE_UNIT (decl), 1),
-			 DECL_ALIGN (decl));
+			 get_variable_align (decl));
       else
 	/* Leave space for it.  */
 	assemble_zeros (tree_low_cst (DECL_SIZE_UNIT (decl), 1));
@@ -1904,6 +1973,7 @@ assemble_variable (tree decl, int top_le
   const char *name;
   rtx decl_rtl, symbol;
   section *sect;
+  unsigned int align;
   bool asan_protected = false;
 
   /* This function is supposed to handle VARIABLES.  Ensure we have one.  */
@@ -2003,6 +2073,8 @@ assemble_variable (tree decl, int top_le
 
   set_mem_align (decl_rtl, DECL_ALIGN (decl));
 
+  align = get_variable_align (decl);
+
   if (TREE_PUBLIC (decl))
     maybe_assemble_visibility (decl);
 
@@ -2032,12 +2104,12 @@ assemble_variable (tree decl, int top_le
       place_block_symbol (symbol);
     }
   else if (SECTION_STYLE (sect) == SECTION_NOSWITCH)
-    assemble_noswitch_variable (decl, name, sect);
+    assemble_noswitch_variable (decl, name, sect, align);
   else
     {
       switch_to_section (sect);
-      if (DECL_ALIGN (decl) > BITS_PER_UNIT)
-	ASM_OUTPUT_ALIGN (asm_out_file, floor_log2 (DECL_ALIGN_UNIT (decl)));
+      if (align > BITS_PER_UNIT)
+	ASM_OUTPUT_ALIGN (asm_out_file, floor_log2 (align / BITS_PER_UNIT));
       assemble_variable_contents (decl, name, dont_output_data);
       if (asan_protected)
 	{
@@ -6959,7 +7031,7 @@ place_block_symbol (rtx symbol)
   else
     {
       decl = SYMBOL_REF_DECL (symbol);
-      alignment = DECL_ALIGN (decl);
+      alignment = get_variable_align (decl);
       size = tree_low_cst (DECL_SIZE_UNIT (decl), 1);
       if (flag_asan && asan_protect_global (decl))
 	{
--- gcc/doc/tm.texi.in.jj	2013-06-09 20:27:03.641337535 +0200
+++ gcc/doc/tm.texi.in	2013-06-10 17:35:32.983268835 +0200
@@ -1062,6 +1062,15 @@ arrays to be word-aligned so that @code{
 constants to character arrays can be done inline.
 @end defmac
 
+@defmac DATA_ABI_ALIGNMENT (@var{type}, @var{basic-align})
+Similar to @code{DATA_ALIGNMENT}, but for the cases where the ABI mandates
+some alignment increase, instead of optimization only purposes.  E.g.@
+AMD x86-64 psABI says that variables with array type larger than 15 bytes
+must be aligned to 16 byte boundaries.
+
+If this macro is not defined, then @var{basic-align} is used.
+@end defmac
+
 @defmac CONSTANT_ALIGNMENT (@var{constant}, @var{basic-align})
 If defined, a C expression to compute the alignment given to a constant
 that is being placed in memory.  @var{constant} is the constant and
--- gcc/doc/tm.texi.jj	2013-06-09 20:27:03.511339649 +0200
+++ gcc/doc/tm.texi	2013-06-10 17:35:33.053267681 +0200
@@ -1078,6 +1078,15 @@ arrays to be word-aligned so that @code{
 constants to character arrays can be done inline.
 @end defmac
 
+@defmac DATA_ABI_ALIGNMENT (@var{type}, @var{basic-align})
+Similar to @code{DATA_ALIGNMENT}, but for the cases where the ABI mandates
+some alignment increase, instead of optimization only purposes.  E.g.@
+AMD x86-64 psABI says that variables with array type larger than 15 bytes
+must be aligned to 16 byte boundaries.
+
+If this macro is not defined, then @var{basic-align} is used.
+@end defmac
+
 @defmac CONSTANT_ALIGNMENT (@var{constant}, @var{basic-align})
 If defined, a C expression to compute the alignment given to a constant
 that is being placed in memory.  @var{constant} is the constant and
--- gcc/testsuite/gcc.target/i386/avx256-unaligned-store-4.c.jj	2013-06-09 20:27:03.333342550 +0200
+++ gcc/testsuite/gcc.target/i386/avx256-unaligned-store-4.c	2013-06-10 17:35:33.184265514 +0200
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -dp -mavx -mno-avx256-split-unaligned-load -mno-avx256-split-unaligned-store" } */
+/* { dg-options "-O3 -dp -mavx -mno-avx256-split-unaligned-load -mno-avx256-split-unaligned-store -fno-common" } */
 
 #define N 1024
 
--- gcc/testsuite/gcc.target/i386/avx256-unaligned-load-4.c.jj	2013-06-09 20:27:03.299343104 +0200
+++ gcc/testsuite/gcc.target/i386/avx256-unaligned-load-4.c	2013-06-10 17:35:33.106266802 +0200
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -dp -mavx -mno-avx256-split-unaligned-load -mno-avx256-split-unaligned-store" } */
+/* { dg-options "-O3 -dp -mavx -mno-avx256-split-unaligned-load -mno-avx256-split-unaligned-store -fno-common" } */
 
 #define N 1024
 
--- gcc/testsuite/gcc.target/i386/pr56564-4.c.jj	2013-06-10 17:35:33.086267133 +0200
+++ gcc/testsuite/gcc.target/i386/pr56564-4.c	2013-06-10 17:35:33.087267116 +0200
@@ -0,0 +1,22 @@
+/* PR target/56564 */
+/* { dg-do compile { target { *-*-linux* && lp64 } } } */
+/* { dg-options "-O3 -fno-pic -fdump-tree-optimized" } */
+
+__thread struct S { long a, b; } s = { 5, 6 };
+__thread char t[16] = { 7 };
+
+int
+foo (void)
+{
+  return ((__UINTPTR_TYPE__) &s) & 15;
+}
+
+int
+bar (void)
+{
+  return ((__UINTPTR_TYPE__) &t[0]) & 15;
+}
+
+/* { dg-final { scan-assembler-not ".align\[ \t]*16\[^:]*\[\n\r]s:" { target { *-*-linux* } } } } */
+/* { dg-final { scan-assembler ".align\[ \t]*16\[^:]*\[\n\r]t:" { target { *-*-linux* } } } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
--- gcc/testsuite/gcc.target/i386/memcpy-1.c.jj	2013-06-09 20:27:03.248343936 +0200
+++ gcc/testsuite/gcc.target/i386/memcpy-1.c	2013-06-10 17:35:33.229264767 +0200
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target ia32 } */
-/* { dg-options "-O2 -march=pentiumpro -minline-all-stringops" } */
+/* { dg-options "-O2 -march=pentiumpro -minline-all-stringops -fno-common" } */
 /* { dg-final { scan-assembler "rep" } } */
 /* { dg-final { scan-assembler "movs" } } */
 /* { dg-final { scan-assembler-not "test" } } */
--- gcc/testsuite/gcc.target/i386/avx256-unaligned-store-3.c.jj	2013-06-09 20:27:03.276343479 +0200
+++ gcc/testsuite/gcc.target/i386/avx256-unaligned-store-3.c	2013-06-10 17:35:33.128266441 +0200
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -dp -mavx -mavx256-split-unaligned-store -mtune=generic" } */
+/* { dg-options "-O3 -dp -mavx -mavx256-split-unaligned-store -mtune=generic -fno-common" } */
 
 #define N 1024
 
--- gcc/testsuite/gcc.target/i386/vect-sizes-1.c.jj	2013-06-09 20:27:03.328342631 +0200
+++ gcc/testsuite/gcc.target/i386/vect-sizes-1.c	2013-06-10 17:35:33.194265350 +0200
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -ffast-math -mavx -mtune=generic" } */
+/* { dg-options "-O3 -ffast-math -mavx -mtune=generic -fno-common" } */
 
 double a[1024];
 
--- gcc/testsuite/gcc.target/i386/pr56564-1.c.jj	2013-06-10 17:35:33.062267530 +0200
+++ gcc/testsuite/gcc.target/i386/pr56564-1.c	2013-06-10 17:35:33.075267314 +0200
@@ -0,0 +1,25 @@
+/* PR target/56564 */
+/* { dg-do compile { target { fpic && lp64 } } } */
+/* { dg-options "-O3 -fpic -fdump-tree-optimized" } */
+
+struct S { long a, b; } s = { 5, 6 };
+char t[16] = { 7 };
+
+int
+foo (void)
+{
+  return ((__UINTPTR_TYPE__) &s) & 15;
+}
+
+int
+bar (void)
+{
+  return ((__UINTPTR_TYPE__) &t[0]) & 15;
+}
+
+/* { dg-final { scan-tree-dump-times "&s" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "&t" 0 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "return 0" 1 "optimized" } } */
+/* { dg-final { scan-assembler ".align\[ \t]*16\[^:]*\[\n\r]s:" { target { *-*-linux* } } } } */
+/* { dg-final { scan-assembler ".align\[ \t]*16\[^:]*\[\n\r]t:" { target { *-*-linux* } } } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
--- gcc/testsuite/gcc.target/i386/pr56564-3.c.jj	2013-06-10 17:35:33.086267133 +0200
+++ gcc/testsuite/gcc.target/i386/pr56564-3.c	2013-06-10 17:35:33.086267133 +0200
@@ -0,0 +1,28 @@
+/* PR target/56564 */
+/* { dg-do compile { target { fpic && lp64 } } } */
+/* { dg-options "-O3 -fpic -fdump-tree-optimized" } */
+
+__thread struct S { long a, b; } s = { 5, 6 };
+__thread char t[16] = { 7 };
+
+int
+foo (void)
+{
+  return ((__UINTPTR_TYPE__) &s) & 15;
+}
+
+/* For backwards compatibility we don't assume that t must
+   be aligned to 16 bytes, but align it anyway.  */
+
+int
+bar (void)
+{
+  return ((__UINTPTR_TYPE__) &t[0]) & 15;
+}
+
+/* { dg-final { scan-tree-dump-times "&s" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "&t" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "return 0" 0 "optimized" } } */
+/* { dg-final { scan-assembler-not ".align\[ \t]*16\[^:]*\[\n\r]s:" { target { *-*-linux* } } } } */
+/* { dg-final { scan-assembler ".align\[ \t]*16\[^:]*\[\n\r]t:" { target { *-*-linux* } } } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
--- gcc/testsuite/gcc.target/i386/pr56564-2.c.jj	2013-06-10 17:35:33.075267314 +0200
+++ gcc/testsuite/gcc.target/i386/pr56564-2.c	2013-06-10 17:35:33.075267314 +0200
@@ -0,0 +1,25 @@
+/* PR target/56564 */
+/* { dg-do compile { target { *-*-linux* && lp64 } } } */
+/* { dg-options "-O3 -fno-pic -fdump-tree-optimized" } */
+
+struct S { long a, b; } s = { 5, 6 };
+char t[16] = { 7 };
+
+int
+foo (void)
+{
+  return ((__UINTPTR_TYPE__) &s) & 15;
+}
+
+int
+bar (void)
+{
+  return ((__UINTPTR_TYPE__) &t[0]) & 15;
+}
+
+/* { dg-final { scan-tree-dump-times "&s" 0 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "&t" 0 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "return 0" 2 "optimized" } } */
+/* { dg-final { scan-assembler ".align\[ \t]*16\[^:]*\[\n\r]s:" { target { *-*-linux* } } } } */
+/* { dg-final { scan-assembler ".align\[ \t]*16\[^:]*\[\n\r]t:" { target { *-*-linux* } } } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
--- gcc/testsuite/gcc.target/i386/avx256-unaligned-store-1.c.jj	2013-06-09 20:27:03.354342207 +0200
+++ gcc/testsuite/gcc.target/i386/avx256-unaligned-store-1.c	2013-06-10 17:35:33.115266654 +0200
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -dp -mavx -mavx256-split-unaligned-store" } */
+/* { dg-options "-O3 -dp -mavx -mavx256-split-unaligned-store -fno-common" } */
 
 #define N 1024
 
--- gcc/testsuite/gcc.dg/vect/costmodel/i386/costmodel-vect-31.c.jj	2013-06-09 20:27:03.475340236 +0200
+++ gcc/testsuite/gcc.dg/vect/costmodel/i386/costmodel-vect-31.c	2013-06-10 17:35:33.237264640 +0200
@@ -18,7 +18,7 @@ struct s{
   struct t e;   /* unaligned (offset 2N+4N+4 B) */
 };
  
-struct s tmp;
+struct s tmp = { 1 };
 
 int main1 ()
 {  
--- gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c.jj	2013-06-09 20:27:03.475340236 +0200
+++ gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c	2013-06-10 17:35:33.248264459 +0200
@@ -18,7 +18,7 @@ struct s{
   struct t e;   /* unaligned (offset 2N+4N+4 B) */
 };
  
-struct s tmp;
+struct s tmp = { 1 };
 
 int main1 ()
 {  


	Jakub

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-10 15:45       ` Jakub Jelinek
@ 2013-06-10 19:44         ` David Edelsohn
  2013-06-11  0:44         ` DJ Delorie
  1 sibling, 0 replies; 35+ messages in thread
From: David Edelsohn @ 2013-06-10 19:44 UTC (permalink / raw)
  To: Jakub Jelinek, Edmar Wienskoski
  Cc: Richard Henderson, Richard Biener, Jan Hubicka, GCC Patches,
	Bernd Schmidt, hp, Hans-Peter Nilsson, Ulrich Weigand,
	Andreas Krebbel

On Mon, Jun 10, 2013 at 11:44 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Mon, Jun 10, 2013 at 07:51:54AM -0700, Richard Henderson wrote:
>> On 06/07/2013 02:14 PM, Jakub Jelinek wrote:
>> >> > When the linker merges common blocks, it chooses both maximum size and maximum
>> >> > alignment.  Thus for any common block for which we can prove the block must
>> >> > reside in the module (any executable, or hidden common in shared object), we
>> >> > can go ahead and use the increased alignment.
>> > But consider say:
>> > one TU:
>> > struct S { char buf[15]; } s __attribute__((aligned (32)));
>> >
>> > another TU:
>> > char c = 7;
>> > struct S { char buf[15]; } s = { { 1, 2 } };
>> > char d = 8;
>> > int main () { return 0; }
>> >
>> > (the aligned(32) is there just to simulate the DATA_ALIGNMENT optimization
>> > increase).  Linker warns about this (thus the question is if we want to
>> > increase the alignment for optimization on commons at all) and doesn't align
>> > it.
>> >
>>
>> Oh, right.  I hadn't considered commons unifying with non-common variables,
>> and the failure of commoning in that case.  I'd mostly been thinking of
>> uninitialized Fortran-like common blocks, where it is more common for the
>> blocks to have nothing in common but the name.
>
> Ok, here is what I've committed to trunk (will wait for some time before
> backporting).  As discussed with Honza on IRC, decl_binds_to_current_def_p
> will need further fixing, it does the wrong thing for extern int var
> __attribute__((visibility ("hidden"))) or hidden DECL_COMMON symbols.
>
> And, we don't have any feedback about SPE/E500 rs6000 - DATA_ALIGNMENT vs.
> DATA_ABI_ALIGNMENT yet.

SPE/e500 support was written by Aldy.  He or someone from Freescale
needs to comment.

- David

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-10 15:45       ` Jakub Jelinek
  2013-06-10 19:44         ` David Edelsohn
@ 2013-06-11  0:44         ` DJ Delorie
  2013-06-11  6:06           ` Jakub Jelinek
  1 sibling, 1 reply; 35+ messages in thread
From: DJ Delorie @ 2013-06-11  0:44 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches


> @@ -986,12 +1053,10 @@ align_variable (tree decl, bool dont_out
>  	  if (! DECL_THREAD_LOCAL_P (decl) || const_align <= BITS_PER_WORD)
>  	    align = const_align;
>  	}
> -#endif
>      }
> +#endif

I think this change in get_variable_align() is wrong; it results in
unbalanced braces inside an #ifdef, if the #ifdef body is not included
(i.e. CONSTANT_ALIGNMENT not defined), the compile fails...

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-11  0:44         ` DJ Delorie
@ 2013-06-11  6:06           ` Jakub Jelinek
  2013-06-11 15:20             ` DJ Delorie
  0 siblings, 1 reply; 35+ messages in thread
From: Jakub Jelinek @ 2013-06-11  6:06 UTC (permalink / raw)
  To: DJ Delorie; +Cc: gcc-patches

On Mon, Jun 10, 2013 at 08:44:05PM -0400, DJ Delorie wrote:
> 
> > @@ -986,12 +1053,10 @@ align_variable (tree decl, bool dont_out
> >  	  if (! DECL_THREAD_LOCAL_P (decl) || const_align <= BITS_PER_WORD)
> >  	    align = const_align;
> >  	}
> > -#endif
> >      }
> > +#endif
> 
> I think this change in get_variable_align() is wrong; it results in
> unbalanced braces inside an #ifdef, if the #ifdef body is not included
> (i.e. CONSTANT_ALIGNMENT not defined), the compile fails...

You are right, fixed thusly, committed as obvious:

2013-06-11  Jakub Jelinek  <jakub@redhat.com>

	PR target/56564
	* varasm.c (get_variable_align): Move #endif to the right place.

--- gcc/varasm.c	(revision 199933)
+++ gcc/varasm.c	(revision 199934)
@@ -1053,8 +1053,8 @@ get_variable_align (tree decl)
 	  if (! DECL_THREAD_LOCAL_P (decl) || const_align <= BITS_PER_WORD)
 	    align = const_align;
 	}
-    }
 #endif
+    }
 
   return align;
 }


	Jakub

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-11  6:06           ` Jakub Jelinek
@ 2013-06-11 15:20             ` DJ Delorie
  0 siblings, 0 replies; 35+ messages in thread
From: DJ Delorie @ 2013-06-11 15:20 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches


Thanks!

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-07 20:43 ` Richard Henderson
                     ` (3 preceding siblings ...)
  2013-06-10 11:52   ` Ulrich Weigand
@ 2013-06-12 17:52   ` Edmar Wienskoski
  2013-06-13  7:41     ` Alan Modra
  4 siblings, 1 reply; 35+ messages in thread
From: Edmar Wienskoski @ 2013-06-12 17:52 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Jakub Jelinek, Richard Biener, Jan Hubicka, gcc-patches, bernds,
	hp, hp, uweigand, Andreas.Krebbel, David Edelsohn

The e500v2 (SPE) hardware is such that if the address of vector (double world
load / stores) are not double world aligned the instruction will trap.

So this alignment is not optional.

Edmar


On Fri, Jun 7, 2013 at 3:43 PM, Richard Henderson <rth@redhat.com> wrote:
> On 06/07/2013 12:25 PM, Jakub Jelinek wrote:
>> This PR is about DATA_ALIGNMENT macro increasing alignment of some decls
>> for optimization purposes beyond ABI mandated levels.  It is fine to emit
>> the vars aligned as much as we want for optimization purposes, but if we
>> can't be sure that references to that decl bind to the definition we
>> increased the alignment on (e.g. common variables, or -fpic code without
>> hidden visibility, weak vars etc.), we can't assume that alignment.
>
> When the linker merges common blocks, it chooses both maximum size and maximum
> alignment.  Thus for any common block for which we can prove the block must
> reside in the module (any executable, or hidden common in shared object), we
> can go ahead and use the increased alignment.
>
> It's only in shared objects with non-hidden common blocks that we have a
> problem, since in that case we may resolve the common block via copy reloc to a
> memory block in another module.
>
> So while decl_binds_to_current_def_p is a good starting point, I think we can
> do a little better with common blocks.  Which ought to take care of those
> vectorization regressions you mention.
>
>> @@ -966,8 +966,12 @@ align_variable (tree decl, bool dont_out
>>        align = MAX_OFILE_ALIGNMENT;
>>      }
>>
>> -  /* On some machines, it is good to increase alignment sometimes.  */
>> -  if (! DECL_USER_ALIGN (decl))
>> +  /* On some machines, it is good to increase alignment sometimes.
>> +     But as DECL_ALIGN is used both for actually emitting the variable
>> +     and for code accessing the variable as guaranteed alignment, we
>> +     can only increase the alignment if it is a performance optimization
>> +     if the references to it must bind to the current definition.  */
>> +  if (! DECL_USER_ALIGN (decl) && decl_binds_to_current_def_p (decl))
>>      {
>>  #ifdef DATA_ALIGNMENT
>>        unsigned int data_align = DATA_ALIGNMENT (TREE_TYPE (decl), align);
>> @@ -988,12 +992,69 @@ align_variable (tree decl, bool dont_out
>>       }
>>  #endif
>>      }
>> +#ifdef DATA_ABI_ALIGNMENT
>> +  else if (! DECL_USER_ALIGN (decl))
>> +    {
>> +      unsigned int data_align = DATA_ABI_ALIGNMENT (TREE_TYPE (decl), align);
>> +      /* For backwards compatibility, don't assume the ABI alignment for
>> +      TLS variables.  */
>> +      if (! DECL_THREAD_LOCAL_P (decl) || data_align <= BITS_PER_WORD)
>> +     align = data_align;
>> +    }
>> +#endif
>
> This structure would seem to do the wrong thing if DATA_ABI_ALIGNMENT is
> defined, but DATA_ALIGNMENT isn't.  And while I realize you documented it, I
> don't like the restriction that D_A /must/ return something larger than D_A_A.
>  All that means is that in complex cases D_A will have to call D_A_A itself.
>
> I would think that it would be better to rearrange as
>
>   if (!D_U_A)
>     {
>   #ifdef D_A_A
>       align = ...
>   #endif
>   #ifdef D_A
>       if (d_b_t_c_d_p)
>         align = ...
>   #endif
>     }
>
> Why the special case for TLS?  If we really want that special case surely that
> test should go into D_A_A itself, and not here in generic code.
>
>> Bootstrapped/regtested on x86_64-linux and i686-linux.  No idea about other
>> targets, I've kept them all using DATA_ALIGNMENT, which is considered
>> optimization increase only now, if there is some ABI mandated alignment
>> increase on other targets, that should be done in DATA_ABI_ALIGNMENT as
>> well as DATA_ALIGNMENT.
>
> I've had a brief look over the instances of D_A within the tree atm.  Most of
> them carry the cut-n-paste comment "for the same reasons".  These I believe
> never intended an ABI change, and were really only interested in optimization.
>
> But these I think require a good hard look to see if they really intended an
> ABI alignment:
>
> c6x     comment explicitly mentions abi
> cris    compiler options for alignment -- systemwide or local?
> mmix    comment mentions GETA instruction
> s390    comment mentions LARL instruction
> rs6000  SPE and E500 portion of the alignment non-optional?
>
> Relevant port maintainers CCed.
>
>
> r~

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-12 17:52   ` Edmar Wienskoski
@ 2013-06-13  7:41     ` Alan Modra
  2013-06-13 15:37       ` Alan Modra
  0 siblings, 1 reply; 35+ messages in thread
From: Alan Modra @ 2013-06-13  7:41 UTC (permalink / raw)
  To: Edmar Wienskoski; +Cc: gcc-patches, David Edelsohn

On Wed, Jun 12, 2013 at 12:52:03PM -0500, Edmar Wienskoski wrote:
> The e500v2 (SPE) hardware is such that if the address of vector (double world
> load / stores) are not double world aligned the instruction will trap.
> 
> So this alignment is not optional.

Vector type alignment is also specified by the ppc64 abi.  I think we
want the following.  Note that DATA_ALIGNMENT has been broken for
vectors right from the initial vector support (and the error was
copied for e500 double).  For example

typedef int vec_align __attribute__ ((vector_size(16), aligned(32)));
vec_align x = { 0, 0, 0, 0 };

currently loses the extra alignment.  Fixed by never decreasing
alignment in DATA_ABI_ALIGNMENT.  Testing in progress.  OK to
apply assuming bootstrap is good?  (I think I need a change in
offsettable_ok_by_alignment too.  I'll do that in a separate patch.)

	* config/rs6000/rs6000.h (DATA_ABI_ALIGNMENT): Define.
	(DATA_ALIGNMENT): Remove alignment already covered by above.
	(LOCAL_ALIGNMENT): Use both DATA_ABI_ALIGNMENT and DATA_ALIGNMENT.

Index: gcc/config/rs6000/rs6000.h
===================================================================
--- gcc/config/rs6000/rs6000.h	(revision 200055)
+++ gcc/config/rs6000/rs6000.h	(working copy)
@@ -817,7 +817,8 @@ extern unsigned rs6000_pointer_size;
    local store.  TYPE is the data type, and ALIGN is the alignment
    that the object would ordinarily have.  */
 #define LOCAL_ALIGNMENT(TYPE, ALIGN)				\
-  DATA_ALIGNMENT (TYPE, ALIGN)
+  ({unsigned int _align = DATA_ABI_ALIGNMENT (TYPE, ALIGN);	\
+    DATA_ALIGNMENT (TYPE, _align); })
 
 /* Alignment of field after `int : 0' in a structure.  */
 #define EMPTY_FIELD_BOUNDARY 32
@@ -837,21 +838,26 @@ extern unsigned rs6000_pointer_size;
    ? BITS_PER_WORD                                               \
    : (ALIGN))
 
-/* Make arrays of chars word-aligned for the same reasons.
-   Align vectors to 128 bits.  Align SPE vectors and E500 v2 doubles to
+/* Make arrays of chars word-aligned for the same reasons.  */
+#define DATA_ALIGNMENT(TYPE, ALIGN)					\
+  ((TREE_CODE (TYPE) == ARRAY_TYPE					\
+    && TYPE_MODE (TREE_TYPE (TYPE)) == QImode				\
+    && (ALIGN) < BITS_PER_WORD) ? BITS_PER_WORD : (ALIGN))
+
+/* Align vectors to 128 bits.  Align SPE vectors and E500 v2 doubles to
    64 bits.  */
-#define DATA_ALIGNMENT(TYPE, ALIGN)					\
+#define DATA_ABI_ALIGNMENT(TYPE, ALIGN)					\
   (TREE_CODE (TYPE) == VECTOR_TYPE					\
    ? (((TARGET_SPE && SPE_VECTOR_MODE (TYPE_MODE (TYPE)))		\
        || (TARGET_PAIRED_FLOAT && PAIRED_VECTOR_MODE (TYPE_MODE (TYPE)))) \
-      ? 64 : 128)							\
-   : ((TARGET_E500_DOUBLE						\
-       && TREE_CODE (TYPE) == REAL_TYPE					\
-       && TYPE_MODE (TYPE) == DFmode)					\
-      ? 64								\
-      : (TREE_CODE (TYPE) == ARRAY_TYPE					\
-	 && TYPE_MODE (TREE_TYPE (TYPE)) == QImode			\
-	 && (ALIGN) < BITS_PER_WORD) ? BITS_PER_WORD : (ALIGN)))
+      ? ((ALIGN) < 64 ? 64 : (ALIGN))					\
+      : ((ALIGN) < 128 ? 128 : (ALIGN)))				\
+   : (TARGET_E500_DOUBLE						\
+      && TREE_CODE (TYPE) == REAL_TYPE					\
+      && TYPE_MODE (TYPE) == DFmode					\
+      && (ALIGN) < 64)							\
+   ? 64									\
+   : (ALIGN))
 
 /* Nonzero if move instructions will actually fail to work
    when given unaligned data.  */


-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-13  7:41     ` Alan Modra
@ 2013-06-13 15:37       ` Alan Modra
  2013-06-13 15:42         ` Jakub Jelinek
  2013-06-17 23:37         ` David Edelsohn
  0 siblings, 2 replies; 35+ messages in thread
From: Alan Modra @ 2013-06-13 15:37 UTC (permalink / raw)
  To: Edmar Wienskoski, gcc-patches, David Edelsohn

On Thu, Jun 13, 2013 at 05:10:51PM +0930, Alan Modra wrote:
> On Wed, Jun 12, 2013 at 12:52:03PM -0500, Edmar Wienskoski wrote:
> > The e500v2 (SPE) hardware is such that if the address of vector (double world
> > load / stores) are not double world aligned the instruction will trap.
> > 
> > So this alignment is not optional.
> 
> Vector type alignment is also specified by the ppc64 abi.  I think we
> want the following.  Note that DATA_ALIGNMENT has been broken for
> vectors right from the initial vector support (and the error was
> copied for e500 double).  For example
> 
> typedef int vec_align __attribute__ ((vector_size(16), aligned(32)));
> vec_align x = { 0, 0, 0, 0 };
> 
> currently loses the extra alignment.  Fixed by never decreasing
> alignment in DATA_ABI_ALIGNMENT.  Testing in progress.  OK to
> apply assuming bootstrap is good?  (I think I need a change in
> offsettable_ok_by_alignment too.  I'll do that in a separate patch.)

Revised patch with offsettable_ok_by_alignment change, avoiding dumb
idea of using statement expressions.  This one actually bootstraps and
passes regression testing.

	* config/rs6000/rs6000.h (enum data_align): New.
	(LOCAL_ALIGNMENT, DATA_ALIGNMENT): Use rs6000_data_alignment.
	(DATA_ABI_ALIGNMENT): Define.
	(CONSTANT_ALIGNMENT): Correct comment.
	* config/rs6000/rs6000-protos.h (rs6000_data_alignment): Declare.
	* config/rs6000/rs6000.c (rs6000_data_alignment): New function.
	(offsettable_ok_by_alignment): Align by DATA_ABI_ALIGNMENT.
	Pass "type" not "decl" to DATA_ALIGNMENT.

Index: gcc/config/rs6000/rs6000.h
===================================================================
--- gcc/config/rs6000/rs6000.h	(revision 200055)
+++ gcc/config/rs6000/rs6000.h	(working copy)
@@ -813,12 +813,6 @@ extern unsigned rs6000_pointer_size;
 /* No data type wants to be aligned rounder than this.  */
 #define BIGGEST_ALIGNMENT 128
 
-/* A C expression to compute the alignment for a variables in the
-   local store.  TYPE is the data type, and ALIGN is the alignment
-   that the object would ordinarily have.  */
-#define LOCAL_ALIGNMENT(TYPE, ALIGN)				\
-  DATA_ALIGNMENT (TYPE, ALIGN)
-
 /* Alignment of field after `int : 0' in a structure.  */
 #define EMPTY_FIELD_BOUNDARY 32
 
@@ -828,8 +822,15 @@ extern unsigned rs6000_pointer_size;
 /* A bit-field declared as `int' forces `int' alignment for the struct.  */
 #define PCC_BITFIELD_TYPE_MATTERS 1
 
-/* Make strings word-aligned so strcpy from constants will be faster.
-   Make vector constants quadword aligned.  */
+enum data_align { align_abi, align_opt, align_both };
+
+/* A C expression to compute the alignment for a variables in the
+   local store.  TYPE is the data type, and ALIGN is the alignment
+   that the object would ordinarily have.  */
+#define LOCAL_ALIGNMENT(TYPE, ALIGN)				\
+  rs6000_data_alignment (TYPE, ALIGN, align_both)
+
+/* Make strings word-aligned so strcpy from constants will be faster.  */
 #define CONSTANT_ALIGNMENT(EXP, ALIGN)                           \
   (TREE_CODE (EXP) == STRING_CST	                         \
    && (STRICT_ALIGNMENT || !optimize_size)                       \
@@ -837,21 +838,14 @@ extern unsigned rs6000_pointer_size;
    ? BITS_PER_WORD                                               \
    : (ALIGN))
 
-/* Make arrays of chars word-aligned for the same reasons.
-   Align vectors to 128 bits.  Align SPE vectors and E500 v2 doubles to
+/* Make arrays of chars word-aligned for the same reasons.  */
+#define DATA_ALIGNMENT(TYPE, ALIGN) \
+  rs6000_data_alignment (TYPE, ALIGN, align_opt)
+
+/* Align vectors to 128 bits.  Align SPE vectors and E500 v2 doubles to
    64 bits.  */
-#define DATA_ALIGNMENT(TYPE, ALIGN)					\
-  (TREE_CODE (TYPE) == VECTOR_TYPE					\
-   ? (((TARGET_SPE && SPE_VECTOR_MODE (TYPE_MODE (TYPE)))		\
-       || (TARGET_PAIRED_FLOAT && PAIRED_VECTOR_MODE (TYPE_MODE (TYPE)))) \
-      ? 64 : 128)							\
-   : ((TARGET_E500_DOUBLE						\
-       && TREE_CODE (TYPE) == REAL_TYPE					\
-       && TYPE_MODE (TYPE) == DFmode)					\
-      ? 64								\
-      : (TREE_CODE (TYPE) == ARRAY_TYPE					\
-	 && TYPE_MODE (TREE_TYPE (TYPE)) == QImode			\
-	 && (ALIGN) < BITS_PER_WORD) ? BITS_PER_WORD : (ALIGN)))
+#define DATA_ABI_ALIGNMENT(TYPE, ALIGN) \
+  rs6000_data_alignment (TYPE, ALIGN, align_abi)
 
 /* Nonzero if move instructions will actually fail to work
    when given unaligned data.  */
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h	(revision 200055)
+++ gcc/config/rs6000/rs6000-protos.h	(working copy)
@@ -141,6 +141,7 @@ extern int rs6000_loop_align (rtx);
 #endif /* RTX_CODE */
 
 #ifdef TREE_CODE
+extern unsigned int rs6000_data_alignment (tree, unsigned int, enum data_align);
 extern unsigned int rs6000_special_round_type_align (tree, unsigned int,
 						     unsigned int);
 extern unsigned int darwin_rs6000_special_round_type_align (tree, unsigned int,
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 200055)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -5384,6 +5390,44 @@ invalid_e500_subreg (rtx op, enum machine_mode mod
   return false;
 }
 
+/* Return alignment of TYPE.  Existing alignment is ALIGN.  HOW
+   selects whether the alignment is abi mandated, optional, or
+   both abi and optional alignment.  */
+   
+unsigned int
+rs6000_data_alignment (tree type, unsigned int align, enum data_align how)
+{
+  if (how != align_opt)
+    {
+      if (TREE_CODE (type) == VECTOR_TYPE)
+	{
+	  if ((TARGET_SPE && SPE_VECTOR_MODE (TYPE_MODE (type)))
+	      || (TARGET_PAIRED_FLOAT && PAIRED_VECTOR_MODE (TYPE_MODE (type))))
+	    {
+	      if (align < 64)
+		align = 64;
+	    }
+	  else if (align < 128)
+	    align = 128;
+	}
+      else if (TARGET_E500_DOUBLE
+	       && TREE_CODE (type) == REAL_TYPE
+	       && TYPE_MODE (type) == DFmode
+	       && align < 64)
+	align = 64;
+    }
+
+  if (how != align_abi)
+    {
+      if (TREE_CODE (type) == ARRAY_TYPE
+	  && TYPE_MODE (TREE_TYPE (type)) == QImode
+	  && align < BITS_PER_WORD)
+	align = BITS_PER_WORD;
+    }
+
+  return align;
+}
+
 /* AIX increases natural record alignment to doubleword if the first
    field is an FP double while the FP fields remain word aligned.  */
 
@@ -5774,10 +5818,11 @@ offsettable_ok_by_alignment (rtx op, HOST_WIDE_INT
       type = TREE_TYPE (decl);
 
       dalign = TYPE_ALIGN (type);
+      dalign = DATA_ABI_ALIGNMENT (type, dalign);
       if (CONSTANT_CLASS_P (decl))
 	dalign = CONSTANT_ALIGNMENT (decl, dalign);
       else
-	dalign = DATA_ALIGNMENT (decl, dalign);
+	dalign = DATA_ALIGNMENT (type, dalign);
 
       if (dsize == 0)
 	{

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-13 15:37       ` Alan Modra
@ 2013-06-13 15:42         ` Jakub Jelinek
  2013-06-13 22:48           ` Alan Modra
  2013-06-17 23:37         ` David Edelsohn
  1 sibling, 1 reply; 35+ messages in thread
From: Jakub Jelinek @ 2013-06-13 15:42 UTC (permalink / raw)
  To: Edmar Wienskoski, gcc-patches, David Edelsohn

On Fri, Jun 14, 2013 at 01:07:01AM +0930, Alan Modra wrote:
> @@ -5774,10 +5818,11 @@ offsettable_ok_by_alignment (rtx op, HOST_WIDE_INT
>        type = TREE_TYPE (decl);
>  
>        dalign = TYPE_ALIGN (type);
> +      dalign = DATA_ABI_ALIGNMENT (type, dalign);
>        if (CONSTANT_CLASS_P (decl))
>  	dalign = CONSTANT_ALIGNMENT (decl, dalign);
>        else
> -	dalign = DATA_ALIGNMENT (decl, dalign);
> +	dalign = DATA_ALIGNMENT (type, dalign);
>  
>        if (dsize == 0)
>  	{

What is this code trying to do?  Shouldn't it just use DECL_ALIGN
which should be set to the right value from get_variable_alignment?
I mean, if !decl_binds_to_current_def_p (decl), then using DATA_ALIGNMENT
or CONSTANT_ALIGNMENT (for anything but actually emitting the var into
object, or just as an optimization hint that very likely the decl will be
aligned enough, but not guaranteed), which are optimization, is wrong
(an ABI problem).

	Jakub

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-13 15:42         ` Jakub Jelinek
@ 2013-06-13 22:48           ` Alan Modra
  2013-06-14  9:00             ` Jakub Jelinek
  0 siblings, 1 reply; 35+ messages in thread
From: Alan Modra @ 2013-06-13 22:48 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Edmar Wienskoski, gcc-patches, David Edelsohn

On Thu, Jun 13, 2013 at 05:42:17PM +0200, Jakub Jelinek wrote:
> On Fri, Jun 14, 2013 at 01:07:01AM +0930, Alan Modra wrote:
> > @@ -5774,10 +5818,11 @@ offsettable_ok_by_alignment (rtx op, HOST_WIDE_INT
> >        type = TREE_TYPE (decl);
> >  
> >        dalign = TYPE_ALIGN (type);
> > +      dalign = DATA_ABI_ALIGNMENT (type, dalign);
> >        if (CONSTANT_CLASS_P (decl))
> >  	dalign = CONSTANT_ALIGNMENT (decl, dalign);
> >        else
> > -	dalign = DATA_ALIGNMENT (decl, dalign);
> > +	dalign = DATA_ALIGNMENT (type, dalign);
> >  
> >        if (dsize == 0)
> >  	{
> 
> What is this code trying to do?  Shouldn't it just use DECL_ALIGN
> which should be set to the right value from get_variable_alignment?
> I mean, if !decl_binds_to_current_def_p (decl), then using DATA_ALIGNMENT
> or CONSTANT_ALIGNMENT (for anything but actually emitting the var into
> object, or just as an optimization hint that very likely the decl will be
> aligned enough, but not guaranteed), which are optimization, is wrong
> (an ABI problem).

It is handling !DECL_P trees, which must be local.  I know I saw
STRING_CST here when I wrote offsettable_ok_by_alignment, hence the
use of CONSTANT_ALIGNMENT.  I'm not so sure about the need for
DATA_ALIGNMENT now, but if it was correct before then we ought to
be using both DATA_ABI_ALIGNMENT and DATA_ALIGNMENT after your
changes.

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-13 22:48           ` Alan Modra
@ 2013-06-14  9:00             ` Jakub Jelinek
  2013-06-14 10:42               ` Alan Modra
  0 siblings, 1 reply; 35+ messages in thread
From: Jakub Jelinek @ 2013-06-14  9:00 UTC (permalink / raw)
  To: Edmar Wienskoski, gcc-patches, David Edelsohn

On Fri, Jun 14, 2013 at 08:18:19AM +0930, Alan Modra wrote:
> On Thu, Jun 13, 2013 at 05:42:17PM +0200, Jakub Jelinek wrote:
> > On Fri, Jun 14, 2013 at 01:07:01AM +0930, Alan Modra wrote:
> > > @@ -5774,10 +5818,11 @@ offsettable_ok_by_alignment (rtx op, HOST_WIDE_INT
> > >        type = TREE_TYPE (decl);
> > >  
> > >        dalign = TYPE_ALIGN (type);
> > > +      dalign = DATA_ABI_ALIGNMENT (type, dalign);
> > >        if (CONSTANT_CLASS_P (decl))
> > >  	dalign = CONSTANT_ALIGNMENT (decl, dalign);
> > >        else
> > > -	dalign = DATA_ALIGNMENT (decl, dalign);
> > > +	dalign = DATA_ALIGNMENT (type, dalign);
> > >  
> > >        if (dsize == 0)
> > >  	{
> > 
> > What is this code trying to do?  Shouldn't it just use DECL_ALIGN
> > which should be set to the right value from get_variable_alignment?
> > I mean, if !decl_binds_to_current_def_p (decl), then using DATA_ALIGNMENT
> > or CONSTANT_ALIGNMENT (for anything but actually emitting the var into
> > object, or just as an optimization hint that very likely the decl will be
> > aligned enough, but not guaranteed), which are optimization, is wrong
> > (an ABI problem).
> 
> It is handling !DECL_P trees, which must be local.  I know I saw
> STRING_CST here when I wrote offsettable_ok_by_alignment, hence the
> use of CONSTANT_ALIGNMENT.  I'm not so sure about the need for
> DATA_ALIGNMENT now, but if it was correct before then we ought to
> be using both DATA_ABI_ALIGNMENT and DATA_ALIGNMENT after your
> changes.

Yeah, then it makes sense.  Sorry for not looking up earlier that this is
the !DECL_P case.

As for the 
typedef int vec_align __attribute__ ((vector_size(16), aligned(32)));                                                                            
vec_align x = { 0, 0, 0, 0 };                                                                                                                    
changes, that is ABI changing bugfix, so the question is, are you fine with
breaking the ABI (between 4.8 and 4.9, or if you wanted to backport it to
4.8 too (I certainly plan to backport the non-ppc DATA_ABI_ALIGNMENT changes
to 4.8.2, already am using it in our compilers))?  The other option is
to fix the ABI, but keep things backwards ABI compatible.  That would be
done by decreasing the alignment as it used to do before in DATA_ABI_ALIGNMENT,
and increasing it to the desirable level only in DATA_ALIGNMENT.  That has
the effect that when emitting the decls into assembly e.g. the above will
now be correctly 32 byte aligned, but accesses to such decl in compiler
generated code will only assume that alignment if
decl_binds_to_current_def_p, otherwise they will keep assuming the old
(broken) lowered alignment.  At least for 4.8 backport IMHO that would be a
better idea (but of course would need big comment explaning it).

	Jakub

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-14  9:00             ` Jakub Jelinek
@ 2013-06-14 10:42               ` Alan Modra
  2013-06-14 10:54                 ` Jakub Jelinek
  0 siblings, 1 reply; 35+ messages in thread
From: Alan Modra @ 2013-06-14 10:42 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Edmar Wienskoski, gcc-patches, David Edelsohn

On Fri, Jun 14, 2013 at 10:59:52AM +0200, Jakub Jelinek wrote:
> On Fri, Jun 14, 2013 at 08:18:19AM +0930, Alan Modra wrote:
> > It is handling !DECL_P trees, which must be local.  I know I saw
> > STRING_CST here when I wrote offsettable_ok_by_alignment, hence the
> > use of CONSTANT_ALIGNMENT.  I'm not so sure about the need for
> > DATA_ALIGNMENT now, but if it was correct before then we ought to
> > be using both DATA_ABI_ALIGNMENT and DATA_ALIGNMENT after your
> > changes.
> 
> Yeah, then it makes sense.  Sorry for not looking up earlier that this is
> the !DECL_P case.

Your comment prodded me into looking at whether the !DECL_P code is
needed in 4.9, and it looks like we never see !DECL_P trees..
Bootstrap and regression tests all langs on powerpc64 didn't hit a
gcc_unreachable() I put there, both with -fsection-anchors and
-fno-section-anchors.  David, please consider that piece of the patch
retracted.

> As for the 
> typedef int vec_align __attribute__ ((vector_size(16), aligned(32)));                                                                            
> vec_align x = { 0, 0, 0, 0 };                                                                                                                    
> changes, that is ABI changing bugfix, so the question is, are you fine with
> breaking the ABI (between 4.8 and 4.9, or if you wanted to backport it to
> 4.8 too (I certainly plan to backport the non-ppc DATA_ABI_ALIGNMENT changes
> to 4.8.2, already am using it in our compilers))?  The other option is
> to fix the ABI, but keep things backwards ABI compatible.  That would be
> done by decreasing the alignment as it used to do before in DATA_ABI_ALIGNMENT,
> and increasing it to the desirable level only in DATA_ALIGNMENT.  That has
> the effect that when emitting the decls into assembly e.g. the above will
> now be correctly 32 byte aligned, but accesses to such decl in compiler
> generated code will only assume that alignment if
> decl_binds_to_current_def_p, otherwise they will keep assuming the old
> (broken) lowered alignment.  At least for 4.8 backport IMHO that would be a
> better idea (but of course would need big comment explaning it).

I see your point, but for there to be a real problem we'd need
a) A library exporting such a type with (supposed) increased
   alignment, and,
b) gcc would need to make use of the increased alignment.

(a) must be rare or non-existent or you'd think we would have had a
bug report about lack of user alignment in vector typedefs.  The code
has been like this since 2001-11-07, so users have had a long time to
discover it.  (Of course, this is an argument for just ignoring the
bug too.)

(b) doesn't happen in the rs6000 backend as far as I'm aware.  Do you
know whether there is some optimisation based on alignment in generic
parts of gcc?  A quick test like

typedef int vec_align __attribute__ ((vector_size(16), aligned(32)));
vec_align x = { 0, 0, 0, 0 };

long f1 (void)
{
  return (long) &x & -32;
}

static int y __attribute__ ((aligned(32)));

long f2 (void)
{
  return (long) &y & -32;
}

shows the "& -32" in both functions isn't optimised away.

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-14 10:42               ` Alan Modra
@ 2013-06-14 10:54                 ` Jakub Jelinek
  2013-06-14 14:57                   ` Alan Modra
  0 siblings, 1 reply; 35+ messages in thread
From: Jakub Jelinek @ 2013-06-14 10:54 UTC (permalink / raw)
  To: Edmar Wienskoski, gcc-patches, David Edelsohn

On Fri, Jun 14, 2013 at 08:12:02PM +0930, Alan Modra wrote:
> > As for the 
> > typedef int vec_align __attribute__ ((vector_size(16), aligned(32)));                                                                            
> > vec_align x = { 0, 0, 0, 0 };                                                                                                                    
> > changes, that is ABI changing bugfix, so the question is, are you fine with
> > breaking the ABI (between 4.8 and 4.9, or if you wanted to backport it to
> > 4.8 too (I certainly plan to backport the non-ppc DATA_ABI_ALIGNMENT changes
> > to 4.8.2, already am using it in our compilers))?  The other option is
> > to fix the ABI, but keep things backwards ABI compatible.  That would be
> > done by decreasing the alignment as it used to do before in DATA_ABI_ALIGNMENT,
> > and increasing it to the desirable level only in DATA_ALIGNMENT.  That has
> > the effect that when emitting the decls into assembly e.g. the above will
> > now be correctly 32 byte aligned, but accesses to such decl in compiler
> > generated code will only assume that alignment if
> > decl_binds_to_current_def_p, otherwise they will keep assuming the old
> > (broken) lowered alignment.  At least for 4.8 backport IMHO that would be a
> > better idea (but of course would need big comment explaning it).
> 
> I see your point, but for there to be a real problem we'd need
> a) A library exporting such a type with (supposed) increased
>    alignment, and,
> b) gcc would need to make use of the increased alignment.
> 
> (a) must be rare or non-existent or you'd think we would have had a
> bug report about lack of user alignment in vector typedefs.  The code
> has been like this since 2001-11-07, so users have had a long time to
> discover it.  (Of course, this is an argument for just ignoring the
> bug too.)

It doesn't have to be an exported symbol from a library, it is enough to
compile some objects using one compiler and other objects using another
compiler, then link into the same library.

> (b) doesn't happen in the rs6000 backend as far as I'm aware.  Do you
> know whether there is some optimisation based on alignment in generic
> parts of gcc?  A quick test like

Tons of them, the DECL_ALIGN value is used say by get_pointer_alignment,
vectorizer assumptions, is added to MEM_ATTRS, so anything looking at
alignment in RTL can optimize too.

> typedef int vec_align __attribute__ ((vector_size(16), aligned(32)));
> vec_align x = { 0, 0, 0, 0 };
> 
> long f1 (void)
> {
>   return (long) &x & -32;
> }

Try (long) &x & 31; ?  That &x & -32 not being optimized into &x
is guess a missed optimization.

Consider if you put:
typedef int vec_align __attribute__ ((vector_size(16), aligned(32)));
vec_align x = { 0, 0, 0, 0 };
into one TU and compile with gcc 4.8.1, then
typedef int vec_align __attribute__ ((vector_size(16), aligned(32)));
extern vec_align x;

long f1 (void)
{
  return (long) &x & 31;
}
in another TU and compile with gcc trunk after your patch.  I bet
it will be optimized into return 0; by the trunk + your patch compiler,
while the alignment will be actually just 16 byte.

	Jakub

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-14 10:54                 ` Jakub Jelinek
@ 2013-06-14 14:57                   ` Alan Modra
  0 siblings, 0 replies; 35+ messages in thread
From: Alan Modra @ 2013-06-14 14:57 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Edmar Wienskoski, gcc-patches, David Edelsohn

On Fri, Jun 14, 2013 at 12:54:40PM +0200, Jakub Jelinek wrote:
> On Fri, Jun 14, 2013 at 08:12:02PM +0930, Alan Modra wrote:
> > I see your point, but for there to be a real problem we'd need
> > a) A library exporting such a type with (supposed) increased
> >    alignment, and,
> > b) gcc would need to make use of the increased alignment.
> > 
> > (a) must be rare or non-existent or you'd think we would have had a
> > bug report about lack of user alignment in vector typedefs.  The code
> > has been like this since 2001-11-07, so users have had a long time to
> > discover it.  (Of course, this is an argument for just ignoring the
> > bug too.)
> 
> It doesn't have to be an exported symbol from a library, it is enough to
> compile some objects using one compiler and other objects using another
> compiler, then link into the same library.

OK.

> Try (long) &x & 31; ?  That &x & -32 not being optimized into &x
> is guess a missed optimization.

Huh, trust me to hit another bug. :)

> Consider if you put:
> typedef int vec_align __attribute__ ((vector_size(16), aligned(32)));
> vec_align x = { 0, 0, 0, 0 };
> into one TU and compile with gcc 4.8.1, then
> typedef int vec_align __attribute__ ((vector_size(16), aligned(32)));
> extern vec_align x;
> 
> long f1 (void)
> {
>   return (long) &x & 31;
> }
> in another TU and compile with gcc trunk after your patch.  I bet
> it will be optimized into return 0; by the trunk + your patch compiler,
> while the alignment will be actually just 16 byte.

Right.  Counterpoint is that gcc made exactly the same sort of error
across TUs and even in the same TU prior to my change.  eg.

typedef int vec_align __attribute__ ((vector_size(16), aligned(32)));
vec_align x = { 0, 0, 0, 0 };
long f1 (void) { return (long)&x & 31; }
int y __attribute__ ((vector_size(16), aligned(32))) = { 0, 0, 0, 0 };
long f2 (void) { return (long)&y & 31; }

compiles to

.L.f1:
        li 3,0
        blr
..
.L.f2:
        li 3,0
        blr
..
        .globl y
        .lcomm  y,16,32
        .type   y, @object
        .globl x
        .lcomm  x,16,16
        .type   x, @object
        .ident  "GCC: (GNU) 4.7.2 20120921 (Red Hat 4.7.2-2)"

My implementation of rs6000_data_alignment therefore doesn't introduce
a *new* ABI incompatibility.  I question whether it is worth
complicating rs6000_data_alignment, especially since your suggestion
of using the older buggy alignment in DATA_ABI_ALIGNMENT then
increasing in DATA_ALIGNMENT isn't as simple as it sounds.  We're
not talking about some fixed increase in DATA_ALIGNMENT but what we
want is the value of alignment before DATA_ABI_ALIGNMENT.  Perhaps
that could be retrieved from TYPE_ALIGN (type) and
MAX_OFILE_ALIGNMENT, but that would make our DATA_ALIGNMENT the only
target to need such tricks.

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-13 15:37       ` Alan Modra
  2013-06-13 15:42         ` Jakub Jelinek
@ 2013-06-17 23:37         ` David Edelsohn
  1 sibling, 0 replies; 35+ messages in thread
From: David Edelsohn @ 2013-06-17 23:37 UTC (permalink / raw)
  To: Edmar Wienskoski, GCC Patches, Alan Modra

On Thu, Jun 13, 2013 at 11:37 AM, Alan Modra <amodra@gmail.com> wrote:

> Revised patch with offsettable_ok_by_alignment change, avoiding dumb
> idea of using statement expressions.  This one actually bootstraps and
> passes regression testing.
>
>         * config/rs6000/rs6000.h (enum data_align): New.
>         (LOCAL_ALIGNMENT, DATA_ALIGNMENT): Use rs6000_data_alignment.
>         (DATA_ABI_ALIGNMENT): Define.
>         (CONSTANT_ALIGNMENT): Correct comment.
>         * config/rs6000/rs6000-protos.h (rs6000_data_alignment): Declare.
>         * config/rs6000/rs6000.c (rs6000_data_alignment): New function.

The revised patch, without the DECL_P part is okay.

The original code produced the necessary alignment and neither of us
can find any code in public packages that increases the alignment for
PPC vector types.  While there is the possibility that a user could
encounter an object file produced by an older GCC with less strict
alignment and a version of GCC with this fix would make an incorrect
assumption, this does not seem very likely in practice.

Thanks, David

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: FW: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
       [not found] ` <0EFAB2BDD0F67E4FB6CCC8B9F87D75692B5204DB@IRSMSX101.ger.corp.intel.com>
@ 2013-06-19  7:02   ` Igor Zamyatin
  2013-06-19  7:05     ` Jakub Jelinek
  0 siblings, 1 reply; 35+ messages in thread
From: Igor Zamyatin @ 2013-06-19  7:02 UTC (permalink / raw)
  To: gcc-patches, Jakub Jelinek

The change also affects vectorizer in avx case which could be seen for
gcc.dg/tree-ssa/loop-19.c test.

After the change report says

loop-19_bad.c:16: note: === vect_analyze_data_refs_alignment ===
loop-19_bad.c:16: note: vect_compute_data_ref_alignment:
loop-19_bad.c:16: note: can't force alignment of ref: a[j_9]
loop-19_bad.c:16: note: vect_compute_data_ref_alignment:
loop-19_bad.c:16: note: can't force alignment of ref: c[j_9]

AFAICS first condition in ix86_data_alignment was true before the
change so 256 was a return value.

Do we need to tweak this test also?

Thanks,
Igor

> Hi!
>
> This PR is about DATA_ALIGNMENT macro increasing alignment of some decls for optimization purposes beyond ABI mandated levels.  It is fine to emit the vars aligned as much as we want for optimization purposes, but if we can't be sure that references to that decl bind to the definition we increased the alignment on (e.g. common variables, or -fpic code without hidden visibility, weak vars etc.), we can't assume that alignment.
> As DECL_ALIGN is used for both the alignment emitted for the definitions and alignment assumed on code referring to it, this patch increases DECL_ALIGN only on decls where decl_binds_to_current_def_p is true, and otherwise the optimization part on top of that emits only when aligning definition.
> On x86_64, DATA_ALIGNMENT macro was partly an optimization, partly ABI mandated alignment increase, so I've introduced a new macro, DATA_ABI_ALIGNMENT, which is the ABI mandated increase only (on x86-64 I think the only one is that arrays with size 16 bytes or more (and VLAs, but that is not handled by DATA*ALIGNMENT) are at least 16 byte aligned).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux.  No idea about other targets, I've kept them all using DATA_ALIGNMENT, which is considered optimization increase only now, if there is some ABI mandated alignment increase on other targets, that should be done in DATA_ABI_ALIGNMENT as well as DATA_ALIGNMENT.  The patch causes some vectorization regressions (tweaked in the testsuite), especially for common vars where we used to align say common arrays to 256 bits rather than the ABI mandated 128 bits, or for -fpic code, but I'm afraid we need to live with that, if you compile another file with say icc or some other compiler which doesn't increase alignment beyond ABI mandated level and that other file defines the var say as non-common, we have wrong-code.
>
> 2013-06-07  Jakub Jelinek  <jakub@redhat.com>
>
>         PR target/56564
>         * varasm.c (align_variable): Don't use DATA_ALIGNMENT or
>         CONSTANT_ALIGNMENT if !decl_binds_to_current_def_p (decl).
>         Use DATA_ABI_ALIGNMENT for that case instead if defined.
>         (get_variable_align): New function.
>         (get_variable_section, emit_bss, emit_common,
>         assemble_variable_contents, place_block_symbol): Use
>         get_variable_align instead of DECL_ALIGN.
>         (assemble_noswitch_variable): Add align argument, use it
>         instead of DECL_ALIGN.
>         (assemble_variable): Adjust caller.  Use get_variable_align
>         instead of DECL_ALIGN.
>         * config/i386/i386.h (DATA_ALIGNMENT): Adjust x86_data_alignment
>         caller.
>         (DATA_ABI_ALIGNMENT): Define.
>         * config/i386/i386-protos.h (x86_data_alignment): Adjust prototype.
>         * config/i386/i386.c (x86_data_alignment): Add opt argument.  If
>         opt is false, only return the psABI mandated alignment increase.
>         * doc/tm.texi.in (DATA_ABI_ALIGNMENT): Document.
>         * doc/tm.texi: Regenerated.
>
>         * gcc.target/i386/pr56564-1.c: New test.
>         * gcc.target/i386/pr56564-2.c: New test.
>         * gcc.target/i386/pr56564-3.c: New test.
>         * gcc.target/i386/pr56564-4.c: New test.
>         * gcc.target/i386/avx256-unaligned-load-4.c: Add -fno-common.
>         * gcc.target/i386/avx256-unaligned-store-1.c: Likewise.
>         * gcc.target/i386/avx256-unaligned-store-3.c: Likewise.
>         * gcc.target/i386/avx256-unaligned-store-4.c: Likewise.
>         * gcc.target/i386/vect-sizes-1.c: Likewise.
>         * gcc.target/i386/memcpy-1.c: Likewise.
>         * gcc.dg/vect/costmodel/i386/costmodel-vect-31.c (tmp): Initialize.
>         * gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c (tmp): Likewise.
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: FW: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-19  7:02   ` FW: " Igor Zamyatin
@ 2013-06-19  7:05     ` Jakub Jelinek
  0 siblings, 0 replies; 35+ messages in thread
From: Jakub Jelinek @ 2013-06-19  7:05 UTC (permalink / raw)
  To: Igor Zamyatin; +Cc: gcc-patches

On Wed, Jun 19, 2013 at 11:01:59AM +0400, Igor Zamyatin wrote:
> The change also affects vectorizer in avx case which could be seen for
> gcc.dg/tree-ssa/loop-19.c test.
> 
> After the change report says
> 
> loop-19_bad.c:16: note: === vect_analyze_data_refs_alignment ===
> loop-19_bad.c:16: note: vect_compute_data_ref_alignment:
> loop-19_bad.c:16: note: can't force alignment of ref: a[j_9]
> loop-19_bad.c:16: note: vect_compute_data_ref_alignment:
> loop-19_bad.c:16: note: can't force alignment of ref: c[j_9]
> 
> AFAICS first condition in ix86_data_alignment was true before the
> change so 256 was a return value.
> 
> Do we need to tweak this test also?

I'd add -fno-common to the test.

	Jakub

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-19  7:22 ` Jakub Jelinek
  2013-06-19  8:38   ` Richard Biener
@ 2013-06-19 19:27   ` Kirill Yukhin
  1 sibling, 0 replies; 35+ messages in thread
From: Kirill Yukhin @ 2013-06-19 19:27 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Igor Zamyatin, gcc-patches

Patch preapproved. Jakub
Hi,
Checked into trunk: http://gcc.gnu.org/ml/gcc-cvs/2013-06/msg00646.html

Thanks, K

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-19  8:44     ` Jakub Jelinek
@ 2013-06-19 16:32       ` Mike Stump
  0 siblings, 0 replies; 35+ messages in thread
From: Mike Stump @ 2013-06-19 16:32 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, Igor Zamyatin, gcc-patches

On Jun 19, 2013, at 1:44 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Wed, Jun 19, 2013 at 10:38:47AM +0200, Richard Biener wrote:
>> On Wed, Jun 19, 2013 at 9:22 AM, Jakub Jelinek <jakub@redhat.com> wrote:
>>> On Wed, Jun 19, 2013 at 11:12:21AM +0400, Igor Zamyatin wrote:
>>>> Right, as you did for other cases. It works here as well.
>>> 
>>> Patch preapproved.
>> 
>> I wonder how much code breaks these days when we enable -fno-common by
>> default? ...
> 
> Somebody would need to try it ;).

Been there done that.  That experiment has been running for at least 10 years now…  :-)

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-19  8:38   ` Richard Biener
  2013-06-19  8:44     ` Jakub Jelinek
@ 2013-06-19 16:25     ` Mike Stump
  1 sibling, 0 replies; 35+ messages in thread
From: Mike Stump @ 2013-06-19 16:25 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jakub Jelinek, Igor Zamyatin, gcc-patches

On Jun 19, 2013, at 1:38 AM, Richard Biener <richard.guenther@gmail.com> wrote:
> On Wed, Jun 19, 2013 at 9:22 AM, Jakub Jelinek <jakub@redhat.com> wrote:
>> On Wed, Jun 19, 2013 at 11:12:21AM +0400, Igor Zamyatin wrote:
>>> Right, as you did for other cases. It works here as well.
>> 
>> Patch preapproved.
> 
> I wonder how much code breaks these days when we enable -fno-common by
> default?

Not much.  gcc as Apple shipped it, has always been no-common, and indeed the shared library scheme doesn't like common.  There are a few test cases that would need -fcommon, but I don't think that is a big deal.  Most oss I think is -fno-common friendly.  I think gcc should default to c99, and I think c99 mode (and later) could use -fno-common by default.  For pre c99 modes, I'd probably just leave it to the dust bin of history.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-19  8:38   ` Richard Biener
@ 2013-06-19  8:44     ` Jakub Jelinek
  2013-06-19 16:32       ` Mike Stump
  2013-06-19 16:25     ` Mike Stump
  1 sibling, 1 reply; 35+ messages in thread
From: Jakub Jelinek @ 2013-06-19  8:44 UTC (permalink / raw)
  To: Richard Biener; +Cc: Igor Zamyatin, gcc-patches

On Wed, Jun 19, 2013 at 10:38:47AM +0200, Richard Biener wrote:
> On Wed, Jun 19, 2013 at 9:22 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> > On Wed, Jun 19, 2013 at 11:12:21AM +0400, Igor Zamyatin wrote:
> >>  Right, as you did for other cases. It works here as well.
> >
> > Patch preapproved.
> 
> I wonder how much code breaks these days when we enable -fno-common by
> default? ...

Somebody would need to try it ;).  From vectorization POV, it surely would
be better if -fno-common was the default.

	Jakub

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-19  7:22 ` Jakub Jelinek
@ 2013-06-19  8:38   ` Richard Biener
  2013-06-19  8:44     ` Jakub Jelinek
  2013-06-19 16:25     ` Mike Stump
  2013-06-19 19:27   ` Kirill Yukhin
  1 sibling, 2 replies; 35+ messages in thread
From: Richard Biener @ 2013-06-19  8:38 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Igor Zamyatin, gcc-patches

On Wed, Jun 19, 2013 at 9:22 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Wed, Jun 19, 2013 at 11:12:21AM +0400, Igor Zamyatin wrote:
>>  Right, as you did for other cases. It works here as well.
>
> Patch preapproved.

I wonder how much code breaks these days when we enable -fno-common by
default? ...

Richard.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
  2013-06-19  7:12 Igor Zamyatin
@ 2013-06-19  7:22 ` Jakub Jelinek
  2013-06-19  8:38   ` Richard Biener
  2013-06-19 19:27   ` Kirill Yukhin
  0 siblings, 2 replies; 35+ messages in thread
From: Jakub Jelinek @ 2013-06-19  7:22 UTC (permalink / raw)
  To: Igor Zamyatin; +Cc: gcc-patches

On Wed, Jun 19, 2013 at 11:12:21AM +0400, Igor Zamyatin wrote:
>  Right, as you did for other cases. It works here as well.

Patch preapproved.

	Jakub

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)
@ 2013-06-19  7:12 Igor Zamyatin
  2013-06-19  7:22 ` Jakub Jelinek
  0 siblings, 1 reply; 35+ messages in thread
From: Igor Zamyatin @ 2013-06-19  7:12 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches

 Right, as you did for other cases. It works here as well.

Thanks,
Igor

On Wed, Jun 19, 2013 at 11:05 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Wed, Jun 19, 2013 at 11:01:59AM +0400, Igor Zamyatin wrote:
>> The change also affects vectorizer in avx case which could be seen for
>> gcc.dg/tree-ssa/loop-19.c test.
>>
>> After the change report says
>>
>> loop-19_bad.c:16: note: === vect_analyze_data_refs_alignment ===
>> loop-19_bad.c:16: note: vect_compute_data_ref_alignment:
>> loop-19_bad.c:16: note: can't force alignment of ref: a[j_9]
>> loop-19_bad.c:16: note: vect_compute_data_ref_alignment:
>> loop-19_bad.c:16: note: can't force alignment of ref: c[j_9]
>>
>> AFAICS first condition in ix86_data_alignment was true before the
>> change so 256 was a return value.
>>
>> Do we need to tweak this test also?
>
> I'd add -fno-common to the test.
>
>         Jakub

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2013-06-19 19:27 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-06-07 19:26 [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564) Jakub Jelinek
2013-06-07 20:43 ` Richard Henderson
2013-06-07 21:14   ` Jakub Jelinek
2013-06-08 15:13     ` Jakub Jelinek
2013-06-10 14:52     ` Richard Henderson
2013-06-10 15:45       ` Jakub Jelinek
2013-06-10 19:44         ` David Edelsohn
2013-06-11  0:44         ` DJ Delorie
2013-06-11  6:06           ` Jakub Jelinek
2013-06-11 15:20             ` DJ Delorie
2013-06-07 22:56   ` Hans-Peter Nilsson
2013-06-08 15:05     ` Jakub Jelinek
2013-06-10 10:51   ` Bernd Schmidt
2013-06-10 10:56     ` Jakub Jelinek
2013-06-10 11:03       ` Bernd Schmidt
2013-06-10 11:52   ` Ulrich Weigand
2013-06-12 17:52   ` Edmar Wienskoski
2013-06-13  7:41     ` Alan Modra
2013-06-13 15:37       ` Alan Modra
2013-06-13 15:42         ` Jakub Jelinek
2013-06-13 22:48           ` Alan Modra
2013-06-14  9:00             ` Jakub Jelinek
2013-06-14 10:42               ` Alan Modra
2013-06-14 10:54                 ` Jakub Jelinek
2013-06-14 14:57                   ` Alan Modra
2013-06-17 23:37         ` David Edelsohn
     [not found] ` <0EFAB2BDD0F67E4FB6CCC8B9F87D75692B5204DB@IRSMSX101.ger.corp.intel.com>
2013-06-19  7:02   ` FW: " Igor Zamyatin
2013-06-19  7:05     ` Jakub Jelinek
2013-06-19  7:12 Igor Zamyatin
2013-06-19  7:22 ` Jakub Jelinek
2013-06-19  8:38   ` Richard Biener
2013-06-19  8:44     ` Jakub Jelinek
2013-06-19 16:32       ` Mike Stump
2013-06-19 16:25     ` Mike Stump
2013-06-19 19:27   ` Kirill Yukhin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).