public inbox for gcc-patches@gcc.gnu.org
* [PATCH 1/3] Remove support for obsolete x86 -malign-foo options
  2017-04-18 18:30 [PATCH 0/3] Extend -falign-FOO=N to N[,M[,N2[,M2]]] version 8 Denys Vlasenko
@ 2017-04-18 18:30 ` Denys Vlasenko
  2017-05-06  7:22   ` Uros Bizjak
  2017-04-18 18:30 ` [PATCH 2/3] Temporary remove "at least 8 byte alignment" code from x86 Denys Vlasenko
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Denys Vlasenko @ 2017-04-18 18:30 UTC (permalink / raw)
  To: gcc-patches
  Cc: Denys Vlasenko, Andrew Pinski, Uros Bizjak, Bernd Schmidt,
	Sandra Loosemore

2017-04-18  Denys Vlasenko  <dvlasenk@redhat.com>

    * config/i386/i386-common.c (ix86_handle_option): Remove support
    for obsolete -malign-loops, -malign-jumps and -malign-functions
    options.
    * config/i386/i386.opt: Likewise.

Index: gcc/common/config/i386/i386-common.c
===================================================================
--- gcc/common/config/i386/i386-common.c	(revision 240663)
+++ gcc/common/config/i386/i386-common.c	(working copy)
@@ -998,38 +998,6 @@ ix86_handle_option (struct gcc_options *opts,
 	}
       return true;
 
-
-  /* Comes from final.c -- no real reason to change it.  */
-#define MAX_CODE_ALIGN 16
-
-    case OPT_malign_loops_:
-      warning_at (loc, 0, "-malign-loops is obsolete, use -falign-loops");
-      if (value > MAX_CODE_ALIGN)
-	error_at (loc, "-malign-loops=%d is not between 0 and %d",
-		  value, MAX_CODE_ALIGN);
-      else
-	opts->x_align_loops = 1 << value;
-      return true;
-
-    case OPT_malign_jumps_:
-      warning_at (loc, 0, "-malign-jumps is obsolete, use -falign-jumps");
-      if (value > MAX_CODE_ALIGN)
-	error_at (loc, "-malign-jumps=%d is not between 0 and %d",
-		  value, MAX_CODE_ALIGN);
-      else
-	opts->x_align_jumps = 1 << value;
-      return true;
-
-    case OPT_malign_functions_:
-      warning_at (loc, 0,
-		  "-malign-functions is obsolete, use -falign-functions");
-      if (value > MAX_CODE_ALIGN)
-	error_at (loc, "-malign-functions=%d is not between 0 and %d",
-		  value, MAX_CODE_ALIGN);
-      else
-	opts->x_align_functions = 1 << value;
-      return true;
-
     case OPT_mbranch_cost_:
       if (value > 5)
 	{
Index: gcc/config/i386/i386.opt
===================================================================
--- gcc/config/i386/i386.opt	(revision 240663)
+++ gcc/config/i386/i386.opt	(working copy)
@@ -205,18 +205,6 @@ malign-double
 Target Report Mask(ALIGN_DOUBLE) Save
 Align some doubles on dword boundary.
 
-malign-functions=
-Target RejectNegative Joined UInteger
-Function starts are aligned to this power of 2.
-
-malign-jumps=
-Target RejectNegative Joined UInteger
-Jump targets are aligned to this power of 2.
-
-malign-loops=
-Target RejectNegative Joined UInteger
-Loop code aligned to this power of 2.
-
 malign-stringops
 Target RejectNegative Report InverseMask(NO_ALIGN_STRINGOPS, ALIGN_STRINGOPS) Save
 Align destination of the string operations.


* [PATCH 0/3] Extend -falign-FOO=N to N[,M[,N2[,M2]]] version 8
@ 2017-04-18 18:30 Denys Vlasenko
  2017-04-18 18:30 ` [PATCH 1/3] Remove support for obsolete x86 -malign-foo options Denys Vlasenko
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Denys Vlasenko @ 2017-04-18 18:30 UTC (permalink / raw)
  To: gcc-patches
  Cc: Denys Vlasenko, Andrew Pinski, Uros Bizjak, Bernd Schmidt,
	Sandra Loosemore

These patches are for this bug:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66240
"RFE: extend -falign-xyz syntax"

An extended explanation is in the commit message of patch 3.

The test program:

int g();
int f(int i) {
        i *= 3;
        while (--i > 100) {
 L1:            if (g()) goto L1;
                if (g()) goto L2;
        }
        return i;
 L2:    return 123;
}

"-O2" assembly before the patch:	After the patch:
        .text                           	.text
        .p2align 4,,15                  	.p2align 4
        .globl  f                       	.globl	f
        .type   f, @function            	.type	f, @function
f:                                      f:
.LFB0:                                  .LFB0:
        pushq   %rbx                    	pushq	%rbx
        leal    (%rdi,%rdi,2), %ebx     	leal	(%rdi,%rdi,2), %ebx
        .p2align 4,,10                  	.p2align 4,,10
        .p2align 3                      	.p2align 3
.L2:                                    .L2:
        subl    $1, %ebx                	subl	$1, %ebx
        cmpl    $100, %ebx              	cmpl	$100, %ebx
        jle     .L1                     	jle	.L1
        .p2align 4,,10                  	.p2align 4,,10
        .p2align 3                      	.p2align 3
.L3:                                    .L3:
        xorl    %eax, %eax              	xorl	%eax, %eax
        call    g                       	call	g
        testl   %eax, %eax              	testl	%eax, %eax
        jne     .L3                     	jne	.L3
        call    g                       	call	g
        testl   %eax, %eax              	testl	%eax, %eax
        je      .L2                     	je	.L2
        movl    $123, %ebx              	movl	$123, %ebx
.L4:                                    .L4:
.L1:                                    .L1:
        movl    %ebx, %eax              	movl	%ebx, %eax
        popq    %rbx                    	popq	%rbx
        ret                             	ret

This is version 8 of the patch set.

Changes since version 7:

* Documentation fixes

Changes since version 6:

* Rediffed to accommodate changes made by the recently introduced
  -flimit-function-alignment

Changes since version 5:

* Changes in rs6000, mips, alpha, visium, sh, rx, spu to accommodate
  new alignment options.
* Explicitly list secondary alignment of 8 ("n,m,8") in x86 tables
  for all types of jump targets.

Changes since version 4:

* Deleted rather than NOPed -malign-foo=N support.
* Improved behavior match with x86 8-byte subalignment for labels.

Changes since version 3:

* Improved documentation in invoke.texi
* Fixed x86-specific calculation of default N2 value:
  previous version was doing it incorrectly for cross-compile


* [PATCH 2/3] Temporary remove "at least 8 byte alignment" code from x86
  2017-04-18 18:30 [PATCH 0/3] Extend -falign-FOO=N to N[,M[,N2[,M2]]] version 8 Denys Vlasenko
  2017-04-18 18:30 ` [PATCH 1/3] Remove support for obsolete x86 -malign-foo options Denys Vlasenko
@ 2017-04-18 18:30 ` Denys Vlasenko
  2017-04-18 18:46 ` [PATCH 3/3] Extend -falign-FOO=N to N[,M[,N2[,M2]]] Denys Vlasenko
  2017-05-05 14:40 ` [PATCH 0/3] Extend -falign-FOO=N to N[,M[,N2[,M2]]] version 8 Denys Vlasenko
  3 siblings, 0 replies; 9+ messages in thread
From: Denys Vlasenko @ 2017-04-18 18:30 UTC (permalink / raw)
  To: gcc-patches
  Cc: Denys Vlasenko, Andrew Pinski, Uros Bizjak, Bernd Schmidt,
	Sandra Loosemore

This change drops forced alignment to 8 if requested alignment is higher
than 8: before the patch, -falign-functions=9 was generating

        .p2align 4,,8
        .p2align 3

which means: "align to 16 if the skip is 8 bytes or less; else align to 8".
After this change, ".p2align 3" is not emitted.
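
The combined effect of the old directive pair can be modelled with a small
sketch. This is not GCC or gas code: p2align_pad and align_16_then_8 are
hypothetical helpers, and a MAX_SKIP of 0 is assumed to mean "no limit",
as in the ASM_OUTPUT_MAX_SKIP_ALIGN macros.

```c
#include <assert.h>
#include <stdint.h>

/* Padding that ".p2align LOG,,MAX_SKIP" inserts at address ADDR.
   MAX_SKIP == 0 stands for "no limit" (assumption, mirrors the macros).  */
static uint64_t
p2align_pad (uint64_t addr, int log, uint64_t max_skip)
{
  assert (log >= 0 && log < 63);
  uint64_t mask = ((uint64_t) 1 << log) - 1;
  uint64_t pad = (0 - addr) & mask;   /* bytes up to the next boundary */
  if (max_skip != 0 && pad > max_skip)
    return 0;                         /* too far away: no padding at all */
  return pad;
}

/* Address reached after ".p2align 4,,8" followed by ".p2align 3".  */
static uint64_t
align_16_then_8 (uint64_t addr)
{
  addr += p2align_pad (addr, 4, 8);   /* to 16 if at most 8 bytes away */
  addr += p2align_pad (addr, 3, 0);   /* otherwise at least to 8 */
  return addr;
}
```

For example, an address 7 bytes short of a 16-byte boundary is padded up to
that boundary, while one that is 15 bytes short only moves to the next
8-byte boundary.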

This behavior will be implemented differently by the next patch.

The new SUBALIGN_LOG define will be used by the next patch.

While we are here, avoid generating ".p2align N,,2^N-1" -
it is functionally equivalent to ".p2align N". In this case, use the latter.
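
As an illustration only, the directive selection in the patched macros can
be written as a plain function. format_p2align is a hypothetical name that
does not exist in GCC; the condition is copied from the macros in the diff
below.

```c
#include <assert.h>
#include <stdio.h>

/* Write the align directive for LOG/MAX_SKIP into BUF and return its
   length.  Mirrors the condition used by the patched macros: the short
   form is used when MAX_SKIP is 0 or at least (1 << LOG) - 1.  */
static int
format_p2align (char *buf, size_t size, int log, int max_skip)
{
  assert (size > 0 && log >= 0 && log < 31);
  buf[0] = '\0';
  if (log == 0)
    return 0;                         /* no directive is emitted */
  if (max_skip == 0 || max_skip >= (1 << log) - 1)
    return snprintf (buf, size, "\t.p2align %d\n", log);
  return snprintf (buf, size, "\t.p2align %d,,%d\n", log, max_skip);
}
```

With LOG = 4, a MAX_SKIP of 15 yields ".p2align 4", while a MAX_SKIP of 10
still yields ".p2align 4,,10".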

2017-04-18  Denys Vlasenko  <dvlasenk@redhat.com>

    * config/i386/dragonfly.h: (ASM_OUTPUT_MAX_SKIP_ALIGN):
    Use a simpler align directive also if MAXSKIP = ALIGN-1.
    * config/i386/gas.h (ASM_OUTPUT_MAX_SKIP_ALIGN): Likewise.
    * config/i386/lynx.h (ASM_OUTPUT_MAX_SKIP_ALIGN): Likewise.
    * config/i386/netbsd-elf.h (ASM_OUTPUT_MAX_SKIP_ALIGN): Likewise.
    * config/i386/i386.h (ASM_OUTPUT_MAX_SKIP_PAD): Likewise.
    * config/i386/freebsd.h (ASM_OUTPUT_MAX_SKIP_ALIGN): Remove "If N
    is large, do at least 8 byte alignment" code. Add SUBALIGN_LOG
    define. Use a simpler align directive also if MAXSKIP = ALIGN-1.
    * config/i386/gnu-user.h (ASM_OUTPUT_MAX_SKIP_ALIGN): Likewise.
    * config/i386/iamcu.h (ASM_OUTPUT_MAX_SKIP_ALIGN): Likewise.
    * config/i386/openbsdelf.h (ASM_OUTPUT_MAX_SKIP_ALIGN): Likewise.
    * config/i386/x86-64.h (ASM_OUTPUT_MAX_SKIP_ALIGN): Likewise.

Index: gcc/config/i386/dragonfly.h
===================================================================
--- gcc/config/i386/dragonfly.h	(revision 239860)
+++ gcc/config/i386/dragonfly.h	(working copy)
@@ -69,10 +69,12 @@ see the files COPYING3 and COPYING.RUNTIME respect
 
 #ifdef HAVE_GAS_MAX_SKIP_P2ALIGN
 #undef  ASM_OUTPUT_MAX_SKIP_ALIGN
-#define ASM_OUTPUT_MAX_SKIP_ALIGN(FILE, LOG, MAX_SKIP)					\
-  if ((LOG) != 0) {														\
-    if ((MAX_SKIP) == 0) fprintf ((FILE), "\t.p2align %d\n", (LOG));	\
-    else fprintf ((FILE), "\t.p2align %d,,%d\n", (LOG), (MAX_SKIP));	\
+#define ASM_OUTPUT_MAX_SKIP_ALIGN(FILE, LOG, MAX_SKIP)			\
+  if ((LOG) != 0) {							\
+    if ((MAX_SKIP) == 0 || (MAX_SKIP) >= (1<<(LOG))-1)			\
+      fprintf ((FILE), "\t.p2align %d\n", (LOG));			\
+    else								\
+      fprintf ((FILE), "\t.p2align %d,,%d\n", (LOG), (MAX_SKIP));	\
   }
 #endif
 
Index: gcc/config/i386/freebsd.h
===================================================================
--- gcc/config/i386/freebsd.h	(revision 239860)
+++ gcc/config/i386/freebsd.h	(working copy)
@@ -92,9 +92,9 @@ along with GCC; see the file COPYING3.  If not see
 
 /* A C statement to output to the stdio stream FILE an assembler
    command to advance the location counter to a multiple of 1<<LOG
-   bytes if it is within MAX_SKIP bytes.
+   bytes if it is within MAX_SKIP bytes.  */
 
-   This is used to align code labels according to Intel recommendations.  */
+#define SUBALIGN_LOG 3
 
 #ifdef HAVE_GAS_MAX_SKIP_P2ALIGN
 #undef  ASM_OUTPUT_MAX_SKIP_ALIGN
@@ -101,16 +101,10 @@ along with GCC; see the file COPYING3.  If not see
 #define ASM_OUTPUT_MAX_SKIP_ALIGN(FILE,LOG,MAX_SKIP)			\
   do {									\
     if ((LOG) != 0) {							\
-      if ((MAX_SKIP) == 0) fprintf ((FILE), "\t.p2align %d\n", (LOG));	\
-      else {								\
+      if ((MAX_SKIP) == 0 || (MAX_SKIP) >= (1<<(LOG))-1)		\
+	fprintf ((FILE), "\t.p2align %d\n", (LOG));			\
+      else								\
 	fprintf ((FILE), "\t.p2align %d,,%d\n", (LOG), (MAX_SKIP));	\
-	/* Make sure that we have at least 8 byte alignment if > 8 byte \
-	   alignment is preferred.  */					\
-	if ((LOG) > 3							\
-	    && (1 << (LOG)) > ((MAX_SKIP) + 1)				\
-	    && (MAX_SKIP) >= 7)						\
-	  fputs ("\t.p2align 3\n", (FILE));				\
-      }									\
     }									\
   } while (0)
 #endif
Index: gcc/config/i386/gas.h
===================================================================
--- gcc/config/i386/gas.h	(revision 239860)
+++ gcc/config/i386/gas.h	(working copy)
@@ -72,10 +72,12 @@ along with GCC; see the file COPYING3.  If not see
 
 #ifdef HAVE_GAS_MAX_SKIP_P2ALIGN
 #  define ASM_OUTPUT_MAX_SKIP_ALIGN(FILE,LOG,MAX_SKIP) \
-     if ((LOG) != 0) {\
-       if ((MAX_SKIP) == 0) fprintf ((FILE), "\t.p2align %d\n", (LOG)); \
-       else fprintf ((FILE), "\t.p2align %d,,%d\n", (LOG), (MAX_SKIP)); \
-     }
+    if ((LOG) != 0) { \
+      if ((MAX_SKIP) == 0 || (MAX_SKIP) >= (1<<(LOG))-1)		\
+	fprintf ((FILE), "\t.p2align %d\n", (LOG));			\
+      else								\
+	fprintf ((FILE), "\t.p2align %d,,%d\n", (LOG), (MAX_SKIP));	\
+    }
 #endif
 \f
 /* A C statement or statements which output an assembler instruction
Index: gcc/config/i386/gnu-user.h
===================================================================
--- gcc/config/i386/gnu-user.h	(revision 239860)
+++ gcc/config/i386/gnu-user.h	(working copy)
@@ -94,24 +94,18 @@ along with GCC; see the file COPYING3.  If not see
 
 /* A C statement to output to the stdio stream FILE an assembler
    command to advance the location counter to a multiple of 1<<LOG
-   bytes if it is within MAX_SKIP bytes.
+   bytes if it is within MAX_SKIP bytes.  */
 
-   This is used to align code labels according to Intel recommendations.  */
+#define SUBALIGN_LOG 3
 
 #ifdef HAVE_GAS_MAX_SKIP_P2ALIGN
 #define ASM_OUTPUT_MAX_SKIP_ALIGN(FILE,LOG,MAX_SKIP)			\
   do {									\
     if ((LOG) != 0) {							\
-      if ((MAX_SKIP) == 0) fprintf ((FILE), "\t.p2align %d\n", (LOG));	\
-      else {								\
+      if ((MAX_SKIP) == 0 || (MAX_SKIP) >= (1<<(LOG))-1)		\
+	fprintf ((FILE), "\t.p2align %d\n", (LOG));			\
+      else								\
 	fprintf ((FILE), "\t.p2align %d,,%d\n", (LOG), (MAX_SKIP));	\
-	/* Make sure that we have at least 8 byte alignment if > 8 byte \
-	   alignment is preferred.  */					\
-	if ((LOG) > 3							\
-	    && (1 << (LOG)) > ((MAX_SKIP) + 1)				\
-	    && (MAX_SKIP) >= 7)						\
-	  fputs ("\t.p2align 3\n", (FILE));				\
-      }									\
     }									\
   } while (0)
 #endif
Index: gcc/config/i386/i386.h
===================================================================
--- gcc/config/i386/i386.h	(revision 239860)
+++ gcc/config/i386/i386.h	(working copy)
@@ -2271,7 +2271,7 @@ do {									\
 #define ASM_OUTPUT_MAX_SKIP_PAD(FILE, LOG, MAX_SKIP)			\
   if ((LOG) != 0)							\
     {									\
-      if ((MAX_SKIP) == 0)						\
+      if ((MAX_SKIP) == 0 || (MAX_SKIP) >= (1<<(LOG))-1)		\
         fprintf ((FILE), "\t.p2align %d\n", (LOG));			\
       else								\
         fprintf ((FILE), "\t.p2align %d,,%d\n", (LOG), (MAX_SKIP));	\
Index: gcc/config/i386/iamcu.h
===================================================================
--- gcc/config/i386/iamcu.h	(revision 239860)
+++ gcc/config/i386/iamcu.h	(working copy)
@@ -62,23 +62,17 @@ see the files COPYING3 and COPYING.RUNTIME respect
 
 /* A C statement to output to the stdio stream FILE an assembler
    command to advance the location counter to a multiple of 1<<LOG
-   bytes if it is within MAX_SKIP bytes.
+   bytes if it is within MAX_SKIP bytes.  */
 
-   This is used to align code labels according to Intel recommendations.  */
+#define SUBALIGN_LOG 3
 
 #define ASM_OUTPUT_MAX_SKIP_ALIGN(FILE,LOG,MAX_SKIP)			\
   do {									\
     if ((LOG) != 0) {							\
-      if ((MAX_SKIP) == 0) fprintf ((FILE), "\t.p2align %d\n", (LOG));	\
-      else {								\
+      if ((MAX_SKIP) == 0 || (MAX_SKIP) >= (1<<(LOG))-1)		\
+	fprintf ((FILE), "\t.p2align %d\n", (LOG));			\
+      else								\
 	fprintf ((FILE), "\t.p2align %d,,%d\n", (LOG), (MAX_SKIP));	\
-	/* Make sure that we have at least 8 byte alignment if > 8 byte \
-	   alignment is preferred.  */					\
-	if ((LOG) > 3							\
-	    && (1 << (LOG)) > ((MAX_SKIP) + 1)				\
-	    && (MAX_SKIP) >= 7)						\
-	  fputs ("\t.p2align 3\n", (FILE));				\
-      }									\
     }									\
   } while (0)
 
Index: gcc/config/i386/lynx.h
===================================================================
--- gcc/config/i386/lynx.h	(revision 239860)
+++ gcc/config/i386/lynx.h	(working copy)
@@ -61,8 +61,10 @@ along with GCC; see the file COPYING3.  If not see
 #define ASM_OUTPUT_MAX_SKIP_ALIGN(FILE,LOG,MAX_SKIP)			\
   do {									\
     if ((LOG) != 0) {							\
-      if ((MAX_SKIP) == 0) fprintf ((FILE), "\t.p2align %d\n", (LOG));	\
-      else fprintf ((FILE), "\t.p2align %d,,%d\n", (LOG), (MAX_SKIP));	\
+      if ((MAX_SKIP) == 0 || (MAX_SKIP) >= (1<<(LOG))-1)		\
+	fprintf ((FILE), "\t.p2align %d\n", (LOG));			\
+      else								\
+	fprintf ((FILE), "\t.p2align %d,,%d\n", (LOG), (MAX_SKIP));	\
     }									\
   } while (0)
 #endif
Index: gcc/config/i386/netbsd-elf.h
===================================================================
--- gcc/config/i386/netbsd-elf.h	(revision 239860)
+++ gcc/config/i386/netbsd-elf.h	(working copy)
@@ -104,8 +104,10 @@ along with GCC; see the file COPYING3.  If not see
 #ifdef HAVE_GAS_MAX_SKIP_P2ALIGN
 #define ASM_OUTPUT_MAX_SKIP_ALIGN(FILE, LOG, MAX_SKIP)			\
   if ((LOG) != 0) {							\
-    if ((MAX_SKIP) == 0) fprintf ((FILE), "\t.p2align %d\n", (LOG));	\
-    else fprintf ((FILE), "\t.p2align %d,,%d\n", (LOG), (MAX_SKIP));	\
+    if ((MAX_SKIP) == 0 || (MAX_SKIP) >= (1<<(LOG))-1)			\
+      fprintf ((FILE), "\t.p2align %d\n", (LOG));			\
+    else								\
+      fprintf ((FILE), "\t.p2align %d,,%d\n", (LOG), (MAX_SKIP));	\
   }
 #endif
 
Index: gcc/config/i386/openbsdelf.h
===================================================================
--- gcc/config/i386/openbsdelf.h	(revision 239860)
+++ gcc/config/i386/openbsdelf.h	(working copy)
@@ -63,24 +63,18 @@ along with GCC; see the file COPYING3.  If not see
 
 /* A C statement to output to the stdio stream FILE an assembler
    command to advance the location counter to a multiple of 1<<LOG
-   bytes if it is within MAX_SKIP bytes.
+   bytes if it is within MAX_SKIP bytes.  */
 
-   This is used to align code labels according to Intel recommendations.  */
+#define SUBALIGN_LOG 3
 
 #ifdef HAVE_GAS_MAX_SKIP_P2ALIGN
 #define ASM_OUTPUT_MAX_SKIP_ALIGN(FILE,LOG,MAX_SKIP)			\
   do {									\
     if ((LOG) != 0) {							\
-      if ((MAX_SKIP) == 0) fprintf ((FILE), "\t.p2align %d\n", (LOG));	\
-      else {								\
+      if ((MAX_SKIP) == 0 || (MAX_SKIP) >= (1<<(LOG))-1)		\
+	fprintf ((FILE), "\t.p2align %d\n", (LOG));			\
+      else								\
 	fprintf ((FILE), "\t.p2align %d,,%d\n", (LOG), (MAX_SKIP));	\
-	/* Make sure that we have at least 8 byte alignment if > 8 byte \
-	   alignment is preferred.  */					\
-	if ((LOG) > 3							\
-	    && (1 << (LOG)) > ((MAX_SKIP) + 1)				\
-	    && (MAX_SKIP) >= 7)						\
-	  fputs ("\t.p2align 3\n", (FILE));				\
-      }									\
     }									\
   } while (0)
 #endif
Index: gcc/config/i386/x86-64.h
===================================================================
--- gcc/config/i386/x86-64.h	(revision 239860)
+++ gcc/config/i386/x86-64.h	(working copy)
@@ -61,20 +61,16 @@ see the files COPYING3 and COPYING.RUNTIME respect
 
 /* This is used to align code labels according to Intel recommendations.  */
 
+#define SUBALIGN_LOG 3
+
 #ifdef HAVE_GAS_MAX_SKIP_P2ALIGN
 #define ASM_OUTPUT_MAX_SKIP_ALIGN(FILE,LOG,MAX_SKIP)			\
   do {									\
     if ((LOG) != 0) {							\
-      if ((MAX_SKIP) == 0) fprintf ((FILE), "\t.p2align %d\n", (LOG));	\
-      else {								\
+      if ((MAX_SKIP) == 0 || (MAX_SKIP) >= (1<<(LOG))-1)		\
+	fprintf ((FILE), "\t.p2align %d\n", (LOG));			\
+      else								\
 	fprintf ((FILE), "\t.p2align %d,,%d\n", (LOG), (MAX_SKIP));	\
-	/* Make sure that we have at least 8 byte alignment if > 8 byte \
-	   alignment is preferred.  */					\
-	if ((LOG) > 3							\
-	    && (1 << (LOG)) > ((MAX_SKIP) + 1)				\
-	    && (MAX_SKIP) >= 7)						\
-	  fputs ("\t.p2align 3\n", (FILE));				\
-      }									\
     }									\
   } while (0)
 #undef  ASM_OUTPUT_MAX_SKIP_PAD
@@ -81,7 +77,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 #define ASM_OUTPUT_MAX_SKIP_PAD(FILE, LOG, MAX_SKIP)			\
   if ((LOG) != 0)							\
     {									\
-      if ((MAX_SKIP) == 0)						\
+      if ((MAX_SKIP) == 0 || (MAX_SKIP) >= (1<<(LOG))-1)		\
         fprintf ((FILE), "\t.p2align %d\n", (LOG));			\
       else								\
         fprintf ((FILE), "\t.p2align %d,,%d\n", (LOG), (MAX_SKIP));	\


* [PATCH 3/3] Extend -falign-FOO=N to N[,M[,N2[,M2]]]
  2017-04-18 18:30 [PATCH 0/3] Extend -falign-FOO=N to N[,M[,N2[,M2]]] version 8 Denys Vlasenko
  2017-04-18 18:30 ` [PATCH 1/3] Remove support for obsolete x86 -malign-foo options Denys Vlasenko
  2017-04-18 18:30 ` [PATCH 2/3] Temporary remove "at least 8 byte alignment" code from x86 Denys Vlasenko
@ 2017-04-18 18:46 ` Denys Vlasenko
  2017-04-18 19:12   ` Sandra Loosemore
  2017-05-05 14:40 ` [PATCH 0/3] Extend -falign-FOO=N to N[,M[,N2[,M2]]] version 8 Denys Vlasenko
  3 siblings, 1 reply; 9+ messages in thread
From: Denys Vlasenko @ 2017-04-18 18:46 UTC (permalink / raw)
  To: gcc-patches
  Cc: Denys Vlasenko, Andrew Pinski, Uros Bizjak, Bernd Schmidt,
	Sandra Loosemore

falign-functions=N is too simplistic.

Ingo Molnar ran some tests and it seems that on the latest x86 CPUs, 64-byte
alignment of functions runs fastest (he tried many other possibilities):
this way, after a call the CPU can fetch a lot of insns in the first cacheline fill.

However, developers are less than thrilled by the idea of a slam-dunk 64-byte
aligning everything. Too much waste:
        On 05/20/2015 02:47 AM, Linus Torvalds wrote:
        > At the same time, I have to admit that I abhor a 64-byte function
        > alignment, when we have a fair number of functions that are (much)
        > smaller than that.
        >
        > Is there some way to get gcc to take the size of the function into
        > account? Because aligning a 16-byte or 32-byte function on a 64-byte
        > alignment is just criminally nasty and wasteful.

This change makes it possible to align functions to 64-byte boundaries *if*
this does not introduce a huge amount of padding.

Example syntax is -falign-functions=64,9: "align to 64 by skipping up to
9 bytes (not inclusive)". IOW: "after a call insn, CPU will always be able
to fetch at least 9 bytes of insns".

x86 had a tweak: -falign-functions=N with N > 8 was adding secondary alignment.
For example, falign-functions=10 was emitting this before every function:
	.p2align 4,,9
	.p2align 3
This tweak was removed by the previous patch. It is now reinstated
by the following logic: if falign-functions=N[,M] is specified and N > 8,
then the default value of N2 is 8, not 1. This can now be suppressed with
falign-functions=N,M,1 - which wasn't possible before.
In general, the optional N2,M2 pair can be used to generate any secondary
alignment the user wants.
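
A rough sketch of how such an option string maps to two log/maxskip pairs
follows. It is not the parser from this patch (read_uint, read_log_maxskip
and parse_N_M in toplev.c). The names parse_align_spec, ceil_log2_int and
struct align_pair are hypothetical, and the defaults M = N and M2 = N2 are
assumptions based on the examples above.

```c
#include <assert.h>
#include <stdio.h>

struct align_pair { int log; int maxskip; };

/* Smallest LOG such that (1 << LOG) >= N; 0 for N <= 1.  */
static int
ceil_log2_int (int n)
{
  int log = 0;
  while (log < 30 && (1 << log) < n)
    log++;
  return log;
}

/* Parse "N[,M[,N2[,M2]]]" into OUT[0] (main) and OUT[1] (secondary).
   Returns the number of fields read, or 0 on error.  Trailing garbage
   is ignored and M > N is not rejected, to keep the sketch short.  */
static int
parse_align_spec (const char *spec, struct align_pair out[2])
{
  assert (spec && out);
  int n = 0, m = 0, n2 = 0, m2 = 0;
  int fields = sscanf (spec, "%d,%d,%d,%d", &n, &m, &n2, &m2);
  if (fields < 1 || n < 0 || m < 0 || n2 < 0 || m2 < 0)
    return 0;
  if (fields < 2)
    m = n;                    /* assumed: "N" behaves like "N,N" */
  if (fields < 3)
    n2 = n > 8 ? 8 : 1;       /* x86 default described above */
  if (fields < 4)
    m2 = n2;                  /* assumed default for M2 */
  out[0].log = ceil_log2_int (n);
  out[0].maxskip = m > 0 ? m - 1 : 0;   /* "up to M bytes, not inclusive" */
  out[1].log = ceil_log2_int (n2);
  out[1].maxskip = m2 > 0 ? m2 - 1 : 0;
  return fields;
}
```

Under these assumptions "64,9" gives log 6 with maxskip 8 plus a secondary
log 3, and "10,10,1" gives a secondary log of 0, i.e. no second directive.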

Subalignment for loops/jumps/labels is trickier to fully implement.
The implementation in this patch uses falign-labels subalignment values
for any of these three types of labels - but only if "main" alignment
triggers. With -O2 defaults, this provides a matching behavior on x86:
loops and jumps are aligned (to 16-32 bytes depending on selected CPU)
and subaligned to 8 bytes. Labels are not aligned.

Testing:

Tested that with -falign-functions=N (tried 8, 15, 16, 17...) the alignment
directives are the same before and after the patch.
Tested that -falign-functions=N,N (two equal parameters) works exactly
like -falign-functions=N.

No change from past behavior:
Tested that "-falign-functions" uses an arch-dependent alignment.
Tested that "-O2" uses an arch-dependent alignment.
Tested that "-O2 -falign-functions=N" uses explicitly given alignment.

2017-04-18  Denys Vlasenko  <dvlasenk@redhat.com>

    * doc/invoke.texi: Update option documentation.
    * common.opt (-falign-functions=): Accept a string instead of integer.
    (-falign-jumps=): Likewise.
    (-falign-labels=): Likewise.
    (-falign-loops=): Likewise.
    * flags.h (struct target_flag_state): Revamp how alignment data is stored:
    for each of four alignment types, store two pairs of log/maxskip values.
    * toplev.c (read_uint): New function.
    (read_log_maxskip): New function.
    (parse_N_M): New function.
    (init_alignments): Rename to parse_alignment_opts, make globally visible.
    Set align_foo[0/1].log/maxskip from
    specified falign-FOO=N[,M[,N[,M]]] options.
    * toplev.h (parse_alignment_opts): Now globally visible.
    (min_align_loops_log): Variable which holds arch override for minimal
    alignment of loops.
    (min_align_jumps_log): Likewise for jumps.
    (min_align_labels_log): Likewise for labels.
    (min_align_functions_log): Likewise for functions.
    * varasm.c (assemble_start_function): Call two ASM_OUTPUT_MAX_SKIP_ALIGN
    macros, first for N,M and second time for N2,M2 from
    falign-functions=N,M,N2,M2. This generates 0, 1, or 2 align directives.
    * final.c (final_scan_insn): If a label, jump or loop target
    is being aligned, emit a secondary alignment directive.
    * config/i386/i386.c (struct ptt): Change foo_align members from
    integers to strings. Add align_label member. Set it to "0,0,8"
    on the processors which have maxskips > 7 for loops and jumps -
    this preserves existing behaviour of adding 8-byte subalign.
    * config/i386/i386.c (processor_target_table): Likewise.
    * config/aarch64/aarch64-protos.h (struct tune_params):
    Change foo_align members from integers to strings.
    * config/aarch64/aarch64.c (<cpu>_tunings):
    Change foo_align field values from integers to strings.
    * config/arm/arm.c (arm_override_options_after_change_1):
    Fix if() condition to detect that -falign-functions is specified,
    change code which sets arch-default alignment.
    * config/i386/i386.c (ix86_default_align): Likewise.
    * config/rs6000/rs6000.c (rs6000_option_override_internal): Likewise.
    * config/mips/mips.c (mips_set_compression_mode): Likewise.
    * config/alpha/alpha.c (alpha_override_options_after_change): Likewise.
    * config/visium/visium.c (visium_option_override): Likewise.
    * config/sh/sh.c (sh_override_options_after_change): Likewise.
    * config/rx/rx.c (rx_option_override): Likewise.
    * config/rx/rx.h (JUMP_ALIGN): Use new variables to access alignment
    information.
    (LABEL_ALIGN): Likewise.
    (LOOP_ALIGN): Likewise.
    * config/spu/spu.c (spu_sched_init): Call parse_alignment_opts(), then
    use new variables to access alignment information.
    * config/sh/sh.c (sh_override_options_after_change): Likewise.
    * testsuite/gcc.target/i386/falign-functions.c: New file.

Index: gcc/common.opt
===================================================================
--- gcc/common.opt	(revision 246948)
+++ gcc/common.opt	(working copy)
@@ -921,35 +921,35 @@ Common Report Var(flag_aggressive_loop_optimizatio
 Aggressively optimize loops using language constraints.
 
 falign-functions
-Common Report Var(align_functions,0) Optimization UInteger
+Common Report Var(flag_align_functions) Optimization
 Align the start of functions.
 
 falign-functions=
-Common RejectNegative Joined UInteger Var(align_functions)
+Common RejectNegative Joined Var(str_align_functions)
 
 flimit-function-alignment
 Common Report Var(flag_limit_function_alignment) Optimization Init(0)
 
 falign-jumps
-Common Report Var(align_jumps,0) Optimization UInteger
+Common Report Var(flag_align_jumps) Optimization
 Align labels which are only reached by jumping.
 
 falign-jumps=
-Common RejectNegative Joined UInteger Var(align_jumps)
+Common RejectNegative Joined Var(str_align_jumps)
 
 falign-labels
-Common Report Var(align_labels,0) Optimization UInteger
+Common Report Var(flag_align_labels) Optimization
 Align all labels.
 
 falign-labels=
-Common RejectNegative Joined UInteger Var(align_labels)
+Common RejectNegative Joined Var(str_align_labels)
 
 falign-loops
-Common Report Var(align_loops,0) Optimization UInteger
+Common Report Var(flag_align_loops) Optimization
 Align the start of loops.
 
 falign-loops=
-Common RejectNegative Joined UInteger Var(align_loops)
+Common RejectNegative Joined Var(str_align_loops)
 
 fargument-alias
 Common Ignore
Index: gcc/config/aarch64/aarch64-protos.h
===================================================================
--- gcc/config/aarch64/aarch64-protos.h	(revision 246948)
+++ gcc/config/aarch64/aarch64-protos.h	(working copy)
@@ -214,9 +214,9 @@ struct tune_params
   int memmov_cost;
   int issue_rate;
   unsigned int fusible_ops;
-  int function_align;
-  int jump_align;
-  int loop_align;
+  const char *function_align;
+  const char *jump_align;
+  const char *loop_align;
   int int_reassoc_width;
   int fp_reassoc_width;
   int vec_reassoc_width;
Index: gcc/config/aarch64/aarch64.c
===================================================================
--- gcc/config/aarch64/aarch64.c	(revision 246948)
+++ gcc/config/aarch64/aarch64.c	(working copy)
@@ -537,9 +537,9 @@ static const struct tune_params generic_tunings =
   4, /* memmov_cost  */
   2, /* issue_rate  */
   (AARCH64_FUSE_AES_AESMC), /* fusible_ops  */
-  8,	/* function_align.  */
-  8,	/* jump_align.  */
-  4,	/* loop_align.  */
+  "8",	/* function_align.  */
+  "8",	/* jump_align.  */
+  "4",	/* loop_align.  */
   2,	/* int_reassoc_width.  */
   4,	/* fp_reassoc_width.  */
   1,	/* vec_reassoc_width.  */
@@ -563,9 +563,9 @@ static const struct tune_params cortexa35_tunings
   1, /* issue_rate  */
   (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_MOV_MOVK | AARCH64_FUSE_ADRP_ADD
    | AARCH64_FUSE_MOVK_MOVK | AARCH64_FUSE_ADRP_LDR), /* fusible_ops  */
-  16,	/* function_align.  */
-  8,	/* jump_align.  */
-  8,	/* loop_align.  */
+  "16",	/* function_align.  */
+  "8",	/* jump_align.  */
+  "8",	/* loop_align.  */
   2,	/* int_reassoc_width.  */
   4,	/* fp_reassoc_width.  */
   1,	/* vec_reassoc_width.  */
@@ -589,9 +589,9 @@ static const struct tune_params cortexa53_tunings
   2, /* issue_rate  */
   (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_MOV_MOVK | AARCH64_FUSE_ADRP_ADD
    | AARCH64_FUSE_MOVK_MOVK | AARCH64_FUSE_ADRP_LDR), /* fusible_ops  */
-  16,	/* function_align.  */
-  8,	/* jump_align.  */
-  8,	/* loop_align.  */
+  "16",	/* function_align.  */
+  "8",	/* jump_align.  */
+  "8",	/* loop_align.  */
   2,	/* int_reassoc_width.  */
   4,	/* fp_reassoc_width.  */
   1,	/* vec_reassoc_width.  */
@@ -615,9 +615,9 @@ static const struct tune_params cortexa57_tunings
   3, /* issue_rate  */
   (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_MOV_MOVK | AARCH64_FUSE_ADRP_ADD
    | AARCH64_FUSE_MOVK_MOVK), /* fusible_ops  */
-  16,	/* function_align.  */
-  8,	/* jump_align.  */
-  8,	/* loop_align.  */
+  "16",	/* function_align.  */
+  "8",	/* jump_align.  */
+  "8",	/* loop_align.  */
   2,	/* int_reassoc_width.  */
   4,	/* fp_reassoc_width.  */
   1,	/* vec_reassoc_width.  */
@@ -641,9 +641,9 @@ static const struct tune_params cortexa72_tunings
   3, /* issue_rate  */
   (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_MOV_MOVK | AARCH64_FUSE_ADRP_ADD
    | AARCH64_FUSE_MOVK_MOVK), /* fusible_ops  */
-  16,	/* function_align.  */
-  8,	/* jump_align.  */
-  8,	/* loop_align.  */
+  "16",	/* function_align.  */
+  "8",	/* jump_align.  */
+  "8",	/* loop_align.  */
   2,	/* int_reassoc_width.  */
   4,	/* fp_reassoc_width.  */
   1,	/* vec_reassoc_width.  */
@@ -667,9 +667,9 @@ static const struct tune_params cortexa73_tunings
   2, /* issue_rate.  */
   (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_MOV_MOVK | AARCH64_FUSE_ADRP_ADD
    | AARCH64_FUSE_MOVK_MOVK | AARCH64_FUSE_ADRP_LDR), /* fusible_ops  */
-  16,	/* function_align.  */
-  8,	/* jump_align.  */
-  8,	/* loop_align.  */
+  "16",	/* function_align.  */
+  "8",	/* jump_align.  */
+  "8",	/* loop_align.  */
   2,	/* int_reassoc_width.  */
   4,	/* fp_reassoc_width.  */
   1,	/* vec_reassoc_width.  */
@@ -692,9 +692,9 @@ static const struct tune_params exynosm1_tunings =
   4,	/* memmov_cost  */
   3,	/* issue_rate  */
   (AARCH64_FUSE_AES_AESMC), /* fusible_ops  */
-  4,	/* function_align.  */
-  4,	/* jump_align.  */
-  4,	/* loop_align.  */
+  "4",	/* function_align.  */
+  "4",	/* jump_align.  */
+  "4",	/* loop_align.  */
   2,	/* int_reassoc_width.  */
   4,	/* fp_reassoc_width.  */
   1,	/* vec_reassoc_width.  */
@@ -717,9 +717,9 @@ static const struct tune_params thunderx_tunings =
   6, /* memmov_cost  */
   2, /* issue_rate  */
   AARCH64_FUSE_CMP_BRANCH, /* fusible_ops  */
-  8,	/* function_align.  */
-  8,	/* jump_align.  */
-  8,	/* loop_align.  */
+  "8",	/* function_align.  */
+  "8",	/* jump_align.  */
+  "8",	/* loop_align.  */
   2,	/* int_reassoc_width.  */
   4,	/* fp_reassoc_width.  */
   1,	/* vec_reassoc_width.  */
@@ -742,9 +742,9 @@ static const struct tune_params xgene1_tunings =
   6, /* memmov_cost  */
   4, /* issue_rate  */
   AARCH64_FUSE_NOTHING, /* fusible_ops  */
-  16,	/* function_align.  */
-  8,	/* jump_align.  */
-  16,	/* loop_align.  */
+  "16",	/* function_align.  */
+  "8",	/* jump_align.  */
+  "16",	/* loop_align.  */
   2,	/* int_reassoc_width.  */
   4,	/* fp_reassoc_width.  */
   1,	/* vec_reassoc_width.  */
@@ -768,9 +768,9 @@ static const struct tune_params qdf24xx_tunings =
   4, /* issue_rate  */
   (AARCH64_FUSE_MOV_MOVK | AARCH64_FUSE_ADRP_ADD
    | AARCH64_FUSE_MOVK_MOVK), /* fuseable_ops  */
-  16,	/* function_align.  */
-  8,	/* jump_align.  */
-  16,	/* loop_align.  */
+  "16",	/* function_align.  */
+  "8",	/* jump_align.  */
+  "16",	/* loop_align.  */
   2,	/* int_reassoc_width.  */
   4,	/* fp_reassoc_width.  */
   1,	/* vec_reassoc_width.  */
@@ -793,9 +793,9 @@ static const struct tune_params thunderx2t99_tunin
   4, /* memmov_cost.  */
   4, /* issue_rate.  */
   (AARCH64_FUSE_CMP_BRANCH | AARCH64_FUSE_AES_AESMC), /* fusible_ops  */
-  16,	/* function_align.  */
-  8,	/* jump_align.  */
-  16,	/* loop_align.  */
+  "16",	/* function_align.  */
+  "8",	/* jump_align.  */
+  "16",	/* loop_align.  */
   3,	/* int_reassoc_width.  */
   2,	/* fp_reassoc_width.  */
   2,	/* vec_reassoc_width.  */
Index: gcc/config/alpha/alpha.c
===================================================================
--- gcc/config/alpha/alpha.c	(revision 246948)
+++ gcc/config/alpha/alpha.c	(working copy)
@@ -609,13 +609,13 @@ alpha_override_options_after_change (void)
   /* ??? Kludge these by not doing anything if we don't optimize.  */
   if (optimize > 0)
     {
-      if (align_loops <= 0)
-	align_loops = 16;
-      if (align_jumps <= 0)
-	align_jumps = 16;
+      if (flag_align_loops && !str_align_loops)
+	str_align_loops = "16";
+      if (flag_align_jumps && !str_align_jumps)
+	str_align_jumps = "16";
     }
-  if (align_functions <= 0)
-    align_functions = 16;
+  if (flag_align_functions && !str_align_functions)
+    str_align_functions = "16";
 }
 \f
 /* Returns 1 if VALUE is a mask that contains full bytes of zero or ones.  */
Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c	(revision 246948)
+++ gcc/config/arm/arm.c	(working copy)
@@ -2902,9 +2902,10 @@ static GTY(()) tree init_optimize;
 static void
 arm_override_options_after_change_1 (struct gcc_options *opts)
 {
-  if (opts->x_align_functions <= 0)
-    opts->x_align_functions = TARGET_THUMB_P (opts->x_target_flags)
-      && opts->x_optimize_size ? 2 : 4;
+  /* -falign-functions without argument: supply one */
+  if (opts->x_flag_align_functions && !opts->x_str_align_functions)
+    opts->x_str_align_functions = TARGET_THUMB_P (opts->x_target_flags)
+      && opts->x_optimize_size ? "2" : "4";
 }
 
 /* Implement targetm.override_options_after_change.  */
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c	(revision 246948)
+++ gcc/config/i386/i386.c	(working copy)
@@ -2636,45 +2636,47 @@ struct ptt
 {
   const char *const name;			/* processor name  */
   const struct processor_costs *cost;		/* Processor costs */
-  const int align_loop;				/* Default alignments.  */
-  const int align_loop_max_skip;
-  const int align_jump;
-  const int align_jump_max_skip;
-  const int align_func;
+  const char *const align_loop;			/* Default alignments.  */
+  const char *const align_jump;
+  const char *const align_label;
+  const char *const align_func;
 };
 
 /* This table must be in sync with enum processor_type in i386.h.  */ 
 static const struct ptt processor_target_table[PROCESSOR_max] =
 {
-  {"generic", &generic_cost, 16, 10, 16, 10, 16},
-  {"i386", &i386_cost, 4, 3, 4, 3, 4},
-  {"i486", &i486_cost, 16, 15, 16, 15, 16},
-  {"pentium", &pentium_cost, 16, 7, 16, 7, 16},
-  {"lakemont", &lakemont_cost, 16, 7, 16, 7, 16},
-  {"pentiumpro", &pentiumpro_cost, 16, 15, 16, 10, 16},
-  {"pentium4", &pentium4_cost, 0, 0, 0, 0, 0},
-  {"nocona", &nocona_cost, 0, 0, 0, 0, 0},
-  {"core2", &core_cost, 16, 10, 16, 10, 16},
-  {"nehalem", &core_cost, 16, 10, 16, 10, 16},
-  {"sandybridge", &core_cost, 16, 10, 16, 10, 16},
-  {"haswell", &core_cost, 16, 10, 16, 10, 16},
-  {"bonnell", &atom_cost, 16, 15, 16, 7, 16},
-  {"silvermont", &slm_cost, 16, 15, 16, 7, 16},
-  {"knl", &slm_cost, 16, 15, 16, 7, 16},
-  {"skylake-avx512", &core_cost, 16, 10, 16, 10, 16},
-  {"intel", &intel_cost, 16, 15, 16, 7, 16},
-  {"geode", &geode_cost, 0, 0, 0, 0, 0},
-  {"k6", &k6_cost, 32, 7, 32, 7, 32},
-  {"athlon", &athlon_cost, 16, 7, 16, 7, 16},
-  {"k8", &k8_cost, 16, 7, 16, 7, 16},
-  {"amdfam10", &amdfam10_cost, 32, 24, 32, 7, 32},
-  {"bdver1", &bdver1_cost, 16, 10, 16, 7, 11},
-  {"bdver2", &bdver2_cost, 16, 10, 16, 7, 11},
-  {"bdver3", &bdver3_cost, 16, 10, 16, 7, 11},
-  {"bdver4", &bdver4_cost, 16, 10, 16, 7, 11},
-  {"btver1", &btver1_cost, 16, 10, 16, 7, 11},
-  {"btver2", &btver2_cost, 16, 10, 16, 7, 11},
-  {"znver1", &znver1_cost, 16, 15, 16, 15, 16}
+/* The "0,0,8" label alignment specified for some processors generates
+   secondary 8-byte alignment only for those label/jump/loop targets
+   which have primary alignment.  */
+  {"generic",    &generic_cost,   "16,11,8", "16,11,8", "0,0,8", "16"},
+  {"i386",       &i386_cost,      "4",       "4",       NULL,    "4" },
+  {"i486",       &i486_cost,      "16,16,8", "16,16,8", "0,0,8", "16"},
+  {"pentium",    &pentium_cost,   "16,8,8",  "16,8,8",  "0,0,8", "16"},
+  {"lakemont",   &lakemont_cost,  "16,8,8",  "16,8,8",  "0,0,8", "16"},
+  {"pentiumpro", &pentiumpro_cost,"16,16,8", "16,11,8", "0,0,8", "16"},
+  {"pentium4",   &pentium4_cost,  NULL,      NULL,      NULL,    NULL},
+  {"nocona",     &nocona_cost,    NULL,      NULL,      NULL,    NULL},
+  {"core2",      &core_cost,      "16,11,8", "16,11,8", "0,0,8", "16"},
+  {"nehalem",    &core_cost,      "16,11,8", "16,11,8", "0,0,8", "16"},
+  {"sandybridge",&core_cost,      "16,11,8", "16,11,8", "0,0,8", "16"},
+  {"haswell",    &core_cost,      "16,11,8", "16,11,8", "0,0,8", "16"},
+  {"bonnell",    &atom_cost,      "16,16,8", "16,8,8",  "0,0,8", "16"},
+  {"silvermont", &slm_cost,       "16,16,8", "16,8,8",  "0,0,8", "16"},
+  {"knl",        &slm_cost,       "16,16,8", "16,8,8",  "0,0,8", "16"},
+  {"skylake-avx512", &core_cost,  "16,11,8", "16,11,8", "0,0,8", "16"},
+  {"intel",      &intel_cost,     "16,16,8", "16,8,8",  "0,0,8", "16"},
+  {"geode",      &geode_cost,     NULL,      NULL,      NULL,    NULL},
+  {"k6",         &k6_cost,        "32,8,8",  "32,8,8",  "0,0,8", "32"},
+  {"athlon",     &athlon_cost,    "16,8,8",  "16,8,8",  "0,0,8", "16"},
+  {"k8",         &k8_cost,        "16,8,8",  "16,8,8",  "0,0,8", "16"},
+  {"amdfam10",   &amdfam10_cost,  "32,25,8", "32,8,8",  "0,0,8", "32"},
+  {"bdver1",     &bdver1_cost,    "16,11,8", "16,8,8",  "0,0,8", "11"},
+  {"bdver2",     &bdver2_cost,    "16,11,8", "16,8,8",  "0,0,8", "11"},
+  {"bdver3",     &bdver3_cost,    "16,11,8", "16,8,8",  "0,0,8", "11"},
+  {"bdver4",     &bdver4_cost,    "16,11,8", "16,8,8",  "0,0,8", "11"},
+  {"btver1",     &btver1_cost,    "16,11,8", "16,8,8",  "0,0,8", "11"},
+  {"btver2",     &btver2_cost,    "16,11,8", "16,8,8",  "0,0,8", "11"},
+  {"znver1",     &znver1_cost,    "16,16,8", "16,16,8", "0,0,8", "16"}
 };
 \f
 static unsigned int
@@ -4856,20 +4858,23 @@ set_ix86_tune_features (enum processor_type ix86_t
 static void
 ix86_default_align (struct gcc_options *opts)
 {
-  if (opts->x_align_loops == 0)
+  /* -falign-foo without argument: supply one */
+  if (opts->x_flag_align_loops && !opts->x_str_align_loops)
     {
-      opts->x_align_loops = processor_target_table[ix86_tune].align_loop;
-      align_loops_max_skip = processor_target_table[ix86_tune].align_loop_max_skip;
+      opts->x_str_align_loops = processor_target_table[ix86_tune].align_loop;
     }
-  if (opts->x_align_jumps == 0)
+  if (opts->x_flag_align_jumps && !opts->x_str_align_jumps)
     {
-      opts->x_align_jumps = processor_target_table[ix86_tune].align_jump;
-      align_jumps_max_skip = processor_target_table[ix86_tune].align_jump_max_skip;
+      opts->x_str_align_jumps = processor_target_table[ix86_tune].align_jump;
     }
-  if (opts->x_align_functions == 0)
+  if (opts->x_flag_align_labels && !opts->x_str_align_labels)
     {
-      opts->x_align_functions = processor_target_table[ix86_tune].align_func;
+      opts->x_str_align_labels = processor_target_table[ix86_tune].align_label;
     }
+  if (opts->x_flag_align_functions && !opts->x_str_align_functions)
+    {
+      opts->x_str_align_functions = processor_target_table[ix86_tune].align_func;
+    }
 }
 
 /* Implement TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE hook.  */
Index: gcc/config/mips/mips.c
===================================================================
--- gcc/config/mips/mips.c	(revision 246948)
+++ gcc/config/mips/mips.c	(working copy)
@@ -488,9 +488,9 @@ unsigned int mips_base_compression_flags;
 static int mips_base_schedule_insns; /* flag_schedule_insns */
 static int mips_base_reorder_blocks_and_partition; /* flag_reorder... */
 static int mips_base_move_loop_invariants; /* flag_move_loop_invariants */
-static int mips_base_align_loops; /* align_loops */
-static int mips_base_align_jumps; /* align_jumps */
-static int mips_base_align_functions; /* align_functions */
+static const char *mips_base_align_loops; /* align_loops */
+static const char *mips_base_align_jumps; /* align_jumps */
+static const char *mips_base_align_functions; /* align_functions */
 
 /* Index [M][R] is true if register R is allowed to hold a value of mode M.  */
 bool mips_hard_regno_mode_ok[(int) MAX_MACHINE_MODE][FIRST_PSEUDO_REGISTER];
@@ -19453,12 +19453,12 @@ mips_set_compression_mode (unsigned int compressio
       /* Provide default values for align_* for 64-bit targets.  */
       if (TARGET_64BIT)
 	{
-	  if (align_loops == 0)
-	    align_loops = 8;
-	  if (align_jumps == 0)
-	    align_jumps = 8;
-	  if (align_functions == 0)
-	    align_functions = 8;
+	  if (flag_align_loops && !str_align_loops)
+	    str_align_loops = "8";
+	  if (flag_align_jumps && !str_align_jumps)
+	    str_align_jumps = "8";
+	  if (flag_align_functions && !str_align_functions)
+	    str_align_functions = "8";
 	}
 
       targetm.min_anchor_offset = -32768;
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 246948)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -5218,29 +5218,25 @@ rs6000_option_override_internal (bool global_init_
 	  if (rs6000_cpu == PROCESSOR_TITAN
 	      || rs6000_cpu == PROCESSOR_CELL)
 	    {
-	      if (align_functions <= 0)
-		align_functions = 8;
-	      if (align_jumps <= 0)
-		align_jumps = 8;
-	      if (align_loops <= 0)
-		align_loops = 8;
+	      if (flag_align_functions && !str_align_functions)
+		str_align_functions = "8";
+	      if (flag_align_jumps && !str_align_jumps)
+		str_align_jumps = "8";
+	      if (flag_align_loops && !str_align_loops)
+		str_align_loops = "8";
 	    }
 	  if (rs6000_align_branch_targets)
 	    {
-	      if (align_functions <= 0)
-		align_functions = 16;
-	      if (align_jumps <= 0)
-		align_jumps = 16;
-	      if (align_loops <= 0)
+	      if (flag_align_functions && !str_align_functions)
+		str_align_functions = "16";
+	      if (flag_align_jumps && !str_align_jumps)
+		str_align_jumps = "16";
+	      if (flag_align_loops && !str_align_loops)
 		{
 		  can_override_loop_align = 1;
-		  align_loops = 16;
+		  str_align_loops = "16";
 		}
 	    }
-	  if (align_jumps_max_skip <= 0)
-	    align_jumps_max_skip = 15;
-	  if (align_loops_max_skip <= 0)
-	    align_loops_max_skip = 15;
 	}
 
       /* Arrange to save and restore machine status around nested functions.  */
Index: gcc/config/rx/rx.c
===================================================================
--- gcc/config/rx/rx.c	(revision 246948)
+++ gcc/config/rx/rx.c	(working copy)
@@ -2820,12 +2820,15 @@ rx_option_override (void)
   rx_override_options_after_change ();
 
   /* These values are bytes, not log.  */
-  if (align_jumps == 0 && ! optimize_size)
-    align_jumps = ((rx_cpu_type == RX100 || rx_cpu_type == RX200) ? 4 : 8);
-  if (align_loops == 0 && ! optimize_size)
-    align_loops = ((rx_cpu_type == RX100 || rx_cpu_type == RX200) ? 4 : 8);
-  if (align_labels == 0 && ! optimize_size)
-    align_labels = ((rx_cpu_type == RX100 || rx_cpu_type == RX200) ? 4 : 8);
+  if (! optimize_size)
+    {
+      if (flag_align_jumps && !str_align_jumps)
+	str_align_jumps = ((rx_cpu_type == RX100 || rx_cpu_type == RX200) ? "4" : "8");
+      if (flag_align_loops && !str_align_loops)
+	str_align_loops = ((rx_cpu_type == RX100 || rx_cpu_type == RX200) ? "4" : "8");
+      if (flag_align_labels && !str_align_labels)
+	str_align_labels = ((rx_cpu_type == RX100 || rx_cpu_type == RX200) ? "4" : "8");
+    }
 }
 
 \f
Index: gcc/config/rx/rx.h
===================================================================
--- gcc/config/rx/rx.h	(revision 246948)
+++ gcc/config/rx/rx.h	(working copy)
@@ -432,9 +432,9 @@ typedef unsigned int CUMULATIVE_ARGS;
 /* Compute the alignment needed for label X in various situations.
    If the user has specified an alignment then honour that, otherwise
    use rx_align_for_label.  */
-#define JUMP_ALIGN(x)				(align_jumps > 1 ? align_jumps_log : rx_align_for_label (x, 0))
-#define LABEL_ALIGN(x)				(align_labels > 1 ? align_labels_log : rx_align_for_label (x, 3))
-#define LOOP_ALIGN(x)				(align_loops > 1 ? align_loops_log : rx_align_for_label (x, 2))
+#define JUMP_ALIGN(x)				(align_jumps_log > 0 ? align_jumps_log : rx_align_for_label (x, 0))
+#define LABEL_ALIGN(x)				(align_labels_log > 0 ? align_labels_log : rx_align_for_label (x, 3))
+#define LOOP_ALIGN(x)				(align_loops_log > 0 ? align_loops_log : rx_align_for_label (x, 2))
 #define LABEL_ALIGN_AFTER_BARRIER(x)		rx_align_for_label (x, 0)
 
 #define ASM_OUTPUT_MAX_SKIP_ALIGN(STREAM, LOG, MAX_SKIP)	\
Index: gcc/config/sh/sh.c
===================================================================
--- gcc/config/sh/sh.c	(revision 246948)
+++ gcc/config/sh/sh.c	(working copy)
@@ -984,16 +984,16 @@ sh_override_options_after_change (void)
       Aligning all jumps increases the code size, even if it might
       result in slightly faster code.  Thus, it is set to the smallest 
       alignment possible if not specified by the user.  */
-  if (align_loops == 0)
-    align_loops = optimize_size ? 2 : 4;
+  if (flag_align_loops && !str_align_loops)
+    str_align_loops = optimize_size ? "2" : "4";
 
-  if (align_jumps == 0)
-    align_jumps = 2;
-  else if (align_jumps < 2)
-    align_jumps = 2;
+  if (flag_align_jumps && !str_align_jumps)
+    str_align_jumps = "2";
+  else
+    min_align_jumps_log = 1;
 
-  if (align_functions == 0)
-    align_functions = optimize_size ? 2 : 4;
+  if (flag_align_functions && !str_align_functions)
+    str_align_functions = optimize_size ? "2" : "4";
 
   /* The linker relaxation code breaks when a function contains
      alignments that are larger than that at the start of a
@@ -1000,13 +1000,13 @@ sh_override_options_after_change (void)
      compilation unit.  */
   if (TARGET_RELAX)
     {
-      int min_align = align_loops > align_jumps ? align_loops : align_jumps;
+      parse_alignment_opts ();
+      min_align_functions_log = align_loops_log > align_jumps_log ?
+				align_loops_log : align_jumps_log;
 
       /* Also take possible .long constants / mova tables into account.	*/
-      if (min_align < 4)
-	min_align = 4;
-      if (align_functions < min_align)
-	align_functions = min_align;
+      if (min_align_functions_log < 2)
+	min_align_functions_log = 2;
     }
 }
 \f
Index: gcc/config/spu/spu.c
===================================================================
--- gcc/config/spu/spu.c	(revision 246948)
+++ gcc/config/spu/spu.c	(working copy)
@@ -2767,7 +2767,8 @@ static void
 spu_sched_init (FILE *file ATTRIBUTE_UNUSED, int verbose ATTRIBUTE_UNUSED,
 		int max_ready ATTRIBUTE_UNUSED)
 {
-  if (align_labels > 4 || align_loops > 4 || align_jumps > 4)
+  parse_alignment_opts ();
+  if (align_labels_log > 2 || align_loops_log > 2 || align_jumps_log > 2)
     {
       /* When any block might be at least 8-byte aligned, assume they
          will all be at least 8-byte aligned to make sure dual issue
Index: gcc/config/visium/visium.c
===================================================================
--- gcc/config/visium/visium.c	(revision 246948)
+++ gcc/config/visium/visium.c	(working copy)
@@ -413,12 +413,12 @@ visium_option_override (void)
 
   /* Align functions on 256-byte (32-quadword) for GR5 and 64-byte (8-quadword)
      boundaries for GR6 so they start a new burst mode window.  */
-  if (align_functions == 0)
+  if (flag_align_functions && !str_align_functions)
     {
       if (visium_cpu == PROCESSOR_GR6)
-	align_functions = 64;
+	str_align_functions = "64";
       else
-	align_functions = 256;
+	str_align_functions = "256";
 
       /* Allow the size of compilation units to double because of inlining.
 	 In practice the global size of the object code is hardly affected
@@ -429,26 +429,25 @@ visium_option_override (void)
     }
 
   /* Likewise for loops.  */
-  if (align_loops == 0)
+  if (flag_align_loops && !str_align_loops)
     {
       if (visium_cpu == PROCESSOR_GR6)
-	align_loops = 64;
+	str_align_loops = "64";
       else
 	{
-	  align_loops = 256;
 	  /* But not if they are too far away from a 256-byte boundary.  */
-	  align_loops_max_skip = 31;
+	  str_align_loops = "256,32";
 	}
     }
 
   /* Align all jumps on quadword boundaries for the burst mode, and even
      on 8-quadword boundaries for GR6 so they start a new window.  */
-  if (align_jumps == 0)
+  if (flag_align_jumps && !str_align_jumps)
     {
       if (visium_cpu == PROCESSOR_GR6)
-	align_jumps = 64;
+	str_align_jumps = "64";
       else
-	align_jumps = 8;
+	str_align_jumps = "8";
     }
 
   /* We register a machine-specific pass.  This pass must be scheduled as
Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi	(revision 246948)
+++ gcc/doc/invoke.texi	(working copy)
@@ -351,9 +351,11 @@ Objective-C and Objective-C++ Dialects}.
 
 @item Optimization Options
 @xref{Optimize Options,,Options that Control Optimization}.
-@gccoptlist{-faggressive-loop-optimizations  -falign-functions[=@var{n}] @gol
--falign-jumps[=@var{n}] @gol
--falign-labels[=@var{n}]  -falign-loops[=@var{n}] @gol
+@gccoptlist{-faggressive-loop-optimizations @gol
+-falign-functions[=@var{n}[,@var{m}[,@var{n2}[,@var{m2}]]]] @gol
+-falign-jumps[=@var{n}[,@var{m}[,@var{n2}[,@var{m2}]]]] @gol
+-falign-labels[=@var{n}[,@var{m}[,@var{n2}[,@var{m2}]]]] @gol
+-falign-loops[=@var{n}[,@var{m}[,@var{n2}[,@var{m2}]]]] @gol
 -fassociative-math  -fauto-profile  -fauto-profile[=@var{path}] @gol
 -fauto-inc-dec  -fbranch-probabilities @gol
 -fbranch-target-load-optimize  -fbranch-target-load-optimize2 @gol
@@ -8672,19 +8674,36 @@ The @option{-fstrict-overflow} option is enabled a
 
 @item -falign-functions
 @itemx -falign-functions=@var{n}
+@itemx -falign-functions=@var{n},@var{m}
+@itemx -falign-functions=@var{n},@var{m},@var{n2}
+@itemx -falign-functions=@var{n},@var{m},@var{n2},@var{m2}
 @opindex falign-functions
 Align the start of functions to the next power-of-two greater than
-@var{n}, skipping up to @var{n} bytes.  For instance,
-@option{-falign-functions=32} aligns functions to the next 32-byte
-boundary, but @option{-falign-functions=24} aligns to the next
-32-byte boundary only if this can be done by skipping 23 bytes or less.
+@var{n}, skipping up to @var{m}-1 bytes.  This ensures that at least
+the first @var{m} bytes of the function can be fetched by the CPU
+without crossing an @var{n}-byte alignment boundary.
 
-@option{-fno-align-functions} and @option{-falign-functions=1} are
-equivalent and mean that functions are not aligned.
+If @var{m} is not specified, it defaults to @var{n}.
 
+Examples: @option{-falign-functions=32} aligns functions to the next
+32-byte boundary, @option{-falign-functions=24} aligns to the next
+32-byte boundary only if this can be done by skipping 23 bytes or less,
+@option{-falign-functions=32,7} aligns to the next
+32-byte boundary only if this can be done by skipping 6 bytes or less.
+
+The second pair of @var{n2},@var{m2} values allows you to specify
+a secondary alignment: @option{-falign-functions=64,7,32,3} aligns to
+the next 64-byte boundary if this can be done by skipping 6 bytes or less,
+otherwise aligns to the next 32-byte boundary if this can be done
+by skipping 2 bytes or less.
+If @var{m2} is not specified, it defaults to @var{n2}.
+
 Some assemblers only support this flag when @var{n} is a power of two;
 in that case, it is rounded up.
 
+@option{-fno-align-functions} and @option{-falign-functions=1} are
+equivalent and mean that functions are not aligned.
+
 If @var{n} is not specified or is zero, use a machine-dependent default.
 
 Enabled at levels @option{-O2}, @option{-O3}.
@@ -8697,12 +8716,13 @@ skip more bytes than the size of the function.
 
 @item -falign-labels
 @itemx -falign-labels=@var{n}
+@itemx -falign-labels=@var{n},@var{m}
+@itemx -falign-labels=@var{n},@var{m},@var{n2}
+@itemx -falign-labels=@var{n},@var{m},@var{n2},@var{m2}
 @opindex falign-labels
-Align all branch targets to a power-of-two boundary, skipping up to
-@var{n} bytes like @option{-falign-functions}.  This option can easily
-make code slower, because it must insert dummy operations for when the
-branch target is reached in the usual flow of the code.
+Align all branch targets to a power-of-two boundary.
 
+Parameters of this option are analogous to the @option{-falign-functions} option.
 @option{-fno-align-labels} and @option{-falign-labels=1} are
 equivalent and mean that labels are not aligned.
 
@@ -8716,12 +8736,15 @@ Enabled at levels @option{-O2}, @option{-O3}.
 
 @item -falign-loops
 @itemx -falign-loops=@var{n}
+@itemx -falign-loops=@var{n},@var{m}
+@itemx -falign-loops=@var{n},@var{m},@var{n2}
+@itemx -falign-loops=@var{n},@var{m},@var{n2},@var{m2}
 @opindex falign-loops
-Align loops to a power-of-two boundary, skipping up to @var{n} bytes
-like @option{-falign-functions}.  If the loops are
-executed many times, this makes up for any execution of the dummy
-operations.
+Align loops to a power-of-two boundary.  If the loops are executed
+many times, this makes up for any execution of the dummy padding
+instructions.
 
+Parameters of this option are analogous to the @option{-falign-functions} option.
 @option{-fno-align-loops} and @option{-falign-loops=1} are
 equivalent and mean that loops are not aligned.
 
@@ -8731,12 +8754,15 @@ Enabled at levels @option{-O2}, @option{-O3}.
 
 @item -falign-jumps
 @itemx -falign-jumps=@var{n}
+@itemx -falign-jumps=@var{n},@var{m}
+@itemx -falign-jumps=@var{n},@var{m},@var{n2}
+@itemx -falign-jumps=@var{n},@var{m},@var{n2},@var{m2}
 @opindex falign-jumps
 Align branch targets to a power-of-two boundary, for branch targets
-where the targets can only be reached by jumping, skipping up to @var{n}
-bytes like @option{-falign-functions}.  In this case, no dummy operations
-need be executed.
+where the targets can only be reached by jumping.  In this case,
+no dummy operations need be executed.
 
+Parameters of this option are analogous to the @option{-falign-functions} option.
 @option{-fno-align-jumps} and @option{-falign-jumps=1} are
 equivalent and mean that loops are not aligned.
 
Index: gcc/final.c
===================================================================
--- gcc/final.c	(revision 246948)
+++ gcc/final.c	(working copy)
@@ -2429,6 +2429,12 @@ final_scan_insn (rtx_insn *insn, FILE *file, int o
 	    {
 #ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
 	      ASM_OUTPUT_MAX_SKIP_ALIGN (file, align, max_skip);
+	      /* Above, we don't know whether a label, jump or loop
+		 alignment was used. Conservatively apply
+		 label subalignment, not jump or loop
+		 subalignment (they are almost always larger).  */
+	      ASM_OUTPUT_MAX_SKIP_ALIGN (file, align_labels[1].log,
+					 align_labels[1].maxskip);
 #else
 #ifdef ASM_OUTPUT_ALIGN_WITH_NOP
               ASM_OUTPUT_ALIGN_WITH_NOP (file, align);
Index: gcc/flags.h
===================================================================
--- gcc/flags.h	(revision 246948)
+++ gcc/flags.h	(working copy)
@@ -43,19 +43,22 @@ extern bool final_insns_dump_p;
 /* Other basic status info about current function.  */
 
 /* Target-dependent global state.  */
-struct target_flag_state {
+struct align_flags {
   /* Values of the -falign-* flags: how much to align labels in code.
-     0 means `use default', 1 means `don't align'.
-     For each variable, there is an _log variant which is the power
-     of two not less than the variable, for .align output.  */
-  int x_align_loops_log;
-  int x_align_loops_max_skip;
-  int x_align_jumps_log;
-  int x_align_jumps_max_skip;
-  int x_align_labels_log;
-  int x_align_labels_max_skip;
-  int x_align_functions_log;
+     log is "align to 2^log" (so 0 means no alignment).
+     maxskip is the maximum allowed amount of padding to insert. */
+  int log;
+  int maxskip;
+};
 
+struct target_flag_state {
+  /* Each falign-foo can generate up to two levels of alignment:
+     -falign-foo=N,M[,N2,M2] */
+  struct align_flags x_align_loops[2];
+  struct align_flags x_align_jumps[2];
+  struct align_flags x_align_labels[2];
+  struct align_flags x_align_functions[2];
+
   /* The excess precision currently in effect.  */
   enum excess_precision x_flag_excess_precision;
 };
@@ -67,20 +70,21 @@ extern struct target_flag_state *this_target_flag_
 #define this_target_flag_state (&default_target_flag_state)
 #endif
 
-#define align_loops_log \
-  (this_target_flag_state->x_align_loops_log)
-#define align_loops_max_skip \
-  (this_target_flag_state->x_align_loops_max_skip)
-#define align_jumps_log \
-  (this_target_flag_state->x_align_jumps_log)
-#define align_jumps_max_skip \
-  (this_target_flag_state->x_align_jumps_max_skip)
-#define align_labels_log \
-  (this_target_flag_state->x_align_labels_log)
-#define align_labels_max_skip \
-  (this_target_flag_state->x_align_labels_max_skip)
-#define align_functions_log \
-  (this_target_flag_state->x_align_functions_log)
+#define align_loops              (this_target_flag_state->x_align_loops)
+#define align_jumps              (this_target_flag_state->x_align_jumps)
+#define align_labels             (this_target_flag_state->x_align_labels)
+#define align_functions          (this_target_flag_state->x_align_functions)
+#define align_loops_log          (align_loops[0].log)
+#define align_jumps_log          (align_jumps[0].log)
+#define align_labels_log         (align_labels[0].log)
+#define align_functions_log      (align_functions[0].log)
+#define align_loops_max_skip     (align_loops[0].maxskip)
+#define align_jumps_max_skip     (align_jumps[0].maxskip)
+#define align_labels_max_skip    (align_labels[0].maxskip)
+#define align_functions_max_skip (align_functions[0].maxskip)
+/* String representations of the above options are available in
+   const char *str_align_foo. NULL if not set. */
+
 #define flag_excess_precision \
   (this_target_flag_state->x_flag_excess_precision)
 
Index: gcc/testsuite/gcc.target/i386/falign-functions.c
===================================================================
--- gcc/testsuite/gcc.target/i386/falign-functions.c	(nonexistent)
+++ gcc/testsuite/gcc.target/i386/falign-functions.c	(working copy)
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -falign-functions=64,8" } */
+/* { dg-final { scan-assembler ".p2align 6,,7" } } */
+
+void
+test_func (void)
+{
+}
Index: gcc/toplev.c
===================================================================
--- gcc/toplev.c	(revision 246948)
+++ gcc/toplev.c	(working copy)
@@ -1177,31 +1177,111 @@ target_supports_section_anchors_p (void)
   return true;
 }
 
-/* Default the align_* variables to 1 if they're still unset, and
-   set up the align_*_log variables.  */
+/* Read a decimal number from string FLAG, up to end of string or comma.
+   Emit error message if number ends with any other character.
+   Return pointer past comma, or NULL if end of string.  */
+static const char *
+read_uint (const char *flag, const char *name, int *np)
+{
+  const char *flag_start = flag;
+  int n = 0;
+  char c;
+
+  while ((c = *flag++) >= '0' && c <= '9')
+    n = n*10 + (c-'0');
+  *np = n & 0x3fffffff; /* avoid accidentally negative numbers */
+  if (c == '\0')
+    return NULL;
+  if (c == ',')
+    return flag;
+
+  error_at (UNKNOWN_LOCATION, "-falign-%s parameter is bad at '%s'",
+            name, flag_start);
+  return NULL;
+}
+
+/* Parse "N[,M][,...]" string FLAG into struct align_flags A.
+   Return pointer past second comma, or NULL if end of string.  */
+static const char *
+read_log_maxskip (const char *flag, const char *name, struct align_flags *a)
+{
+  int n, m;
+  flag = read_uint (flag, name, &a->log);
+  n = a->log;
+  if (n != 0)
+    a->log = floor_log2 (n * 2 - 1);
+  if (!flag)
+    {
+      a->maxskip = n ? n - 1 : 0;
+      return flag;
+    }
+  flag = read_uint (flag, name, &a->maxskip);
+  m = a->maxskip;
+  if (m > n) m = n;
+  if (m > 0) m--; /* -falign-foo=N,M means M-1 max bytes of padding, not M */
+  a->maxskip = m;
+  return flag;
+}
+
+/* Parse "N[,M[,N2[,M2]]]" string FLAG into a pair of struct align_flags.  */
 static void
-init_alignments (void)
+parse_N_M (const char *flag, const char *name, struct align_flags a[2],
+	   unsigned int min_align_log)
 {
-  if (align_loops <= 0)
-    align_loops = 1;
-  if (align_loops_max_skip > align_loops)
-    align_loops_max_skip = align_loops - 1;
-  align_loops_log = floor_log2 (align_loops * 2 - 1);
-  if (align_jumps <= 0)
-    align_jumps = 1;
-  if (align_jumps_max_skip > align_jumps)
-    align_jumps_max_skip = align_jumps - 1;
-  align_jumps_log = floor_log2 (align_jumps * 2 - 1);
-  if (align_labels <= 0)
-    align_labels = 1;
-  align_labels_log = floor_log2 (align_labels * 2 - 1);
-  if (align_labels_max_skip > align_labels)
-    align_labels_max_skip = align_labels - 1;
-  if (align_functions <= 0)
-    align_functions = 1;
-  align_functions_log = floor_log2 (align_functions * 2 - 1);
+  if (flag)
+    {
+      flag = read_log_maxskip (flag, name, &a[0]);
+      if (flag)
+	flag = read_log_maxskip (flag, name, &a[1]);
+#ifdef SUBALIGN_LOG
+      else
+	{
+	  /* N2[,M2] is not specified. This arch has a default for N2.
+	     Before -falign-foo=N,M,N2,M2 was introduced, x86 had a tweak.
+	     -falign-functions=N with N > 8 was adding secondary alignment.
+	     -falign-functions=10 was emitting this before every function:
+			.p2align 4,,9
+			.p2align 3
+	     Now this behavior (and more) can be explicitly requested:
+	     -falign-functions=16,10,8
+	     Retain old behavior if N2 is missing: */
+
+	  int align = 1 << a[0].log;
+	  int subalign = 1 << SUBALIGN_LOG;
+
+	  if (a[0].log > SUBALIGN_LOG && a[0].maxskip >= subalign - 1)
+	    {
+	      /* Set N2 unless subalign can never have any effect */
+	      if (align > a[0].maxskip + 1)
+		a[1].log = SUBALIGN_LOG;
+	    }
+	}
+#endif
+    }
+  if ((unsigned int)a[0].log < min_align_log)
+    {
+      a[0].log = min_align_log;
+      a[0].maxskip = (1 << min_align_log) - 1;
+    }
 }
 
+/* Minimum alignment requirements, if arch has them.  */
+unsigned int min_align_loops_log = 0;
+unsigned int min_align_jumps_log = 0;
+unsigned int min_align_labels_log = 0;
+unsigned int min_align_functions_log = 0;
+
+/* Process -falign-foo=N[,M[,N2[,M2]]] options.  */
+void
+parse_alignment_opts (void)
+{
+  parse_N_M (str_align_loops, "loops", align_loops, min_align_loops_log);
+  parse_N_M (str_align_jumps, "jumps", align_jumps, min_align_jumps_log);
+  parse_N_M (str_align_labels, "labels", align_labels, min_align_labels_log);
+  parse_N_M (str_align_functions, "functions", align_functions,
+	     min_align_functions_log);
+}
+
 /* Process the options that have been parsed.  */
 static void
 process_options (void)
@@ -1640,7 +1720,7 @@ static void
 backend_init_target (void)
 {
   /* Initialize alignment variables.  */
-  init_alignments ();
+  parse_alignment_opts ();
 
   /* This depends on stack_pointer_rtx.  */
   init_fake_stack_mems ();
Index: gcc/toplev.h
===================================================================
--- gcc/toplev.h	(revision 246948)
+++ gcc/toplev.h	(working copy)
@@ -93,6 +93,13 @@ extern bool set_src_pwd		       (const char *);
 extern HOST_WIDE_INT get_random_seed (bool);
 extern const char *set_random_seed (const char *);
 
+extern unsigned int min_align_loops_log;
+extern unsigned int min_align_jumps_log;
+extern unsigned int min_align_labels_log;
+extern unsigned int min_align_functions_log;
+
+extern void parse_alignment_opts (void);
+
 extern void initialize_rtl (void);
 
 #endif /* ! GCC_TOPLEV_H */
Index: gcc/varasm.c
===================================================================
--- gcc/varasm.c	(revision 246948)
+++ gcc/varasm.c	(working copy)
@@ -1792,9 +1792,9 @@ assemble_start_function (tree decl, const char *fn
       && optimize_function_for_speed_p (cfun))
     {
 #ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
-      int align_log = align_functions_log;
+      int align_log = align_functions[0].log;
 #endif
-      int max_skip = align_functions - 1;
+      int max_skip = align_functions[0].maxskip;
       if (flag_limit_function_alignment && crtl->max_insn_address > 0
 	  && max_skip >= crtl->max_insn_address)
 	max_skip = crtl->max_insn_address - 1;
@@ -1801,8 +1801,11 @@ assemble_start_function (tree decl, const char *fn
 
 #ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
       ASM_OUTPUT_MAX_SKIP_ALIGN (asm_out_file, align_log, max_skip);
+      if (max_skip == align_functions[0].maxskip)
+        ASM_OUTPUT_MAX_SKIP_ALIGN (asm_out_file, align_functions[1].log,
+				   align_functions[1].maxskip);
 #else
-      ASM_OUTPUT_ALIGN (asm_out_file, align_functions_log);
+      ASM_OUTPUT_ALIGN (asm_out_file, align_functions[0].log);
 #endif
     }
 

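For readers following the SUBALIGN_LOG fallback in parse_N_M above, here is a minimal standalone sketch of that decision. It is not the patch's code: the sketch_* names and the align_pair struct are hypothetical, SUBALIGN_LOG is assumed to be 3, and the "N only" parsing (log = floor_log2 (N * 2 - 1), maxskip = N - 1) is assumed from the old code the patch removes.

```c
#include <assert.h>

/* Hypothetical stand-in for one (log, maxskip) entry; not GCC's type.  */
struct align_pair
{
  int log;
  int maxskip;
};

/* Assumed value: an 8-byte secondary alignment, as in the x86 comment.  */
enum { SKETCH_SUBALIGN_LOG = 3 };

/* floor (log2 (x)) for x >= 1, written out so the sketch is standalone.  */
int
sketch_floor_log2 (unsigned int x)
{
  int log = -1;
  while (x)
    {
      x >>= 1;
      log++;
    }
  return log;
}

/* Model -falign-functions=N with M omitted: round N up to a power of two
   and allow at most N - 1 bytes of padding (assumption, see lead-in).  */
struct align_pair
sketch_primary (int n)
{
  struct align_pair a;
  assert (n >= 1);
  a.log = sketch_floor_log2 ((unsigned int) n * 2 - 1);
  a.maxskip = n - 1;
  return a;
}

/* Return the default secondary alignment log when N2 is not given,
   or 0 when the secondary alignment could never have any effect.  */
int
sketch_default_sublog (struct align_pair a0)
{
  int align = 1 << a0.log;
  int subalign = 1 << SKETCH_SUBALIGN_LOG;

  if (a0.log > SKETCH_SUBALIGN_LOG
      && a0.maxskip >= subalign - 1
      && align > a0.maxskip + 1)
    return SKETCH_SUBALIGN_LOG;
  return 0;
}
```

Under these assumptions N=10 yields log 4 and maxskip 9 with a secondary log of 3, which matches the ".p2align 4,,9" / ".p2align 3" pair quoted in the comment, while N=16 yields no secondary alignment because the primary one is always reached.
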
^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 3/3] Extend -falign-FOO=N to N[,M[,N2[,M2]]]
  2017-04-18 18:46 ` [PATCH 3/3] Extend -falign-FOO=N to N[,M[,N2[,M2]]] Denys Vlasenko
@ 2017-04-18 19:12   ` Sandra Loosemore
  0 siblings, 0 replies; 9+ messages in thread
From: Sandra Loosemore @ 2017-04-18 19:12 UTC (permalink / raw)
  To: Denys Vlasenko, gcc-patches; +Cc: Andrew Pinski, Uros Bizjak, Bernd Schmidt

On 04/18/2017 12:30 PM, Denys Vlasenko wrote:
>
> 2017-04-18  Denys Vlasenko  <dvlasenk@redhat.com>
>
>      * doc/invoke.texi: Update option documentation.
>      [snip]

The documentation part of this version is OK.

-Sandra


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/3] Extend -falign-FOO=N to N[,M[,N2[,M2]]] version 8
  2017-04-18 18:30 [PATCH 0/3] Extend -falign-FOO=N to N[,M[,N2[,M2]]] version 8 Denys Vlasenko
                   ` (2 preceding siblings ...)
  2017-04-18 18:46 ` [PATCH 3/3] Extend -falign-FOO=N to N[,M[,N2[,M2]]] Denys Vlasenko
@ 2017-05-05 14:40 ` Denys Vlasenko
  3 siblings, 0 replies; 9+ messages in thread
From: Denys Vlasenko @ 2017-05-05 14:40 UTC (permalink / raw)
  To: gcc-patches; +Cc: Andrew Pinski, Uros Bizjak, Bernd Schmidt, Sandra Loosemore

On 04/18/2017 08:30 PM, Denys Vlasenko wrote:
> These patches are for this bug:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66240
> "RFE: extend -falign-xyz syntax"

Ping.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/3] Remove support for obsolete x86 -malign-foo options
  2017-04-18 18:30 ` [PATCH 1/3] Remove support for obsolete x86 -malign-foo options Denys Vlasenko
@ 2017-05-06  7:22   ` Uros Bizjak
  2017-05-11 12:24     ` Denys Vlasenko
  2018-02-12 10:07     ` Martin Liška
  0 siblings, 2 replies; 9+ messages in thread
From: Uros Bizjak @ 2017-05-06  7:22 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: gcc-patches, Andrew Pinski, Bernd Schmidt, Sandra Loosemore

On Tue, Apr 18, 2017 at 8:30 PM, Denys Vlasenko <dvlasenk@redhat.com> wrote:
> 2017-04-18  Denys Vlasenko  <dvlasenk@redhat.com>
>
>     * config/i386/i386-common.c (ix86_handle_option): Remove support
>     for obsolete -malign-loops, -malign-jumps and -malign-functions
>     options.
>     * config/i386/i386.opt: Likewise.
> Index: gcc/common/config/i386/i386-common.c
> ===================================================================
> --- gcc/common/config/i386/i386-common.c        (revision 240663)
> +++ gcc/common/config/i386/i386-common.c        (working copy)
> @@ -998,38 +998,6 @@ ix86_handle_option (struct gcc_options *opts,
>         }
>        return true;
>
> -
> -  /* Comes from final.c -- no real reason to change it.  */
> -#define MAX_CODE_ALIGN 16
> -
> -    case OPT_malign_loops_:
> -      warning_at (loc, 0, "-malign-loops is obsolete, use -falign-loops");
> -      if (value > MAX_CODE_ALIGN)
> -       error_at (loc, "-malign-loops=%d is not between 0 and %d",
> -                 value, MAX_CODE_ALIGN);
> -      else
> -       opts->x_align_loops = 1 << value;
> -      return true;
> -
> -    case OPT_malign_jumps_:
> -      warning_at (loc, 0, "-malign-jumps is obsolete, use -falign-jumps");
> -      if (value > MAX_CODE_ALIGN)
> -       error_at (loc, "-malign-jumps=%d is not between 0 and %d",
> -                 value, MAX_CODE_ALIGN);
> -      else
> -       opts->x_align_jumps = 1 << value;
> -      return true;
> -
> -    case OPT_malign_functions_:
> -      warning_at (loc, 0,
> -                 "-malign-functions is obsolete, use -falign-functions");
> -      if (value > MAX_CODE_ALIGN)
> -       error_at (loc, "-malign-functions=%d is not between 0 and %d",
> -                 value, MAX_CODE_ALIGN);
> -      else
> -       opts->x_align_functions = 1 << value;
> -      return true;
> -
>      case OPT_mbranch_cost_:
>        if (value > 5)
>         {
> Index: gcc/config/i386/i386.opt
> ===================================================================
> --- gcc/config/i386/i386.opt    (revision 240663)
> +++ gcc/config/i386/i386.opt    (working copy)
> @@ -205,18 +205,6 @@ malign-double
>  Target Report Mask(ALIGN_DOUBLE) Save
>  Align some doubles on dword boundary.
>
> -malign-functions=
> -Target RejectNegative Joined UInteger
> -Function starts are aligned to this power of 2.
> -
> -malign-jumps=
> -Target RejectNegative Joined UInteger
> -Jump targets are aligned to this power of 2.
> -
> -malign-loops=
> -Target RejectNegative Joined UInteger
> -Loop code aligned to this power of 2.
> -
>  malign-stringops
>  Target RejectNegative Report InverseMask(NO_ALIGN_STRINGOPS, ALIGN_STRINGOPS) Save
>  Align destination of the string operations.

Instead of removing the above definitions, please redefine them the
same way -mcpu is obsoleted in i386.opt, e.g.:

malign-functions=
Target RejectNegative Joined Undocumented Alias(falign-functions=)
Warn(%<-malign-functions%> is obsolete, use %<-falign-functions%>)

This cleanup should have been done a long time ago; the patch can be
committed independently of the other patches in the series.

Uros.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/3] Remove support for obsolete x86 -malign-foo options
  2017-05-06  7:22   ` Uros Bizjak
@ 2017-05-11 12:24     ` Denys Vlasenko
  2018-02-12 10:07     ` Martin Liška
  1 sibling, 0 replies; 9+ messages in thread
From: Denys Vlasenko @ 2017-05-11 12:24 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: gcc-patches, Andrew Pinski, Bernd Schmidt, Sandra Loosemore

On 05/06/2017 09:20 AM, Uros Bizjak wrote:
> On Tue, Apr 18, 2017 at 8:30 PM, Denys Vlasenko <dvlasenk@redhat.com> wrote:
>> 2017-04-18  Denys Vlasenko  <dvlasenk@redhat.com>
>>
>>     * config/i386/i386-common.c (ix86_handle_option): Remove support
>>     for obsolete -malign-loops, -malign-jumps and -malign-functions
>>     options.
>>     * config/i386/i386.opt: Likewise.
...
>> --- gcc/config/i386/i386.opt    (revision 240663)
>> +++ gcc/config/i386/i386.opt    (working copy)
>> @@ -205,18 +205,6 @@ malign-double
>>  Target Report Mask(ALIGN_DOUBLE) Save
>>  Align some doubles on dword boundary.
>>
>> -malign-functions=
>> -Target RejectNegative Joined UInteger
>> -Function starts are aligned to this power of 2.
>> -
>> -malign-jumps=
>> -Target RejectNegative Joined UInteger
>> -Jump targets are aligned to this power of 2.
>> -
>> -malign-loops=
>> -Target RejectNegative Joined UInteger
>> -Loop code aligned to this power of 2.
>> -
>>  malign-stringops
>>  Target RejectNegative Report InverseMask(NO_ALIGN_STRINGOPS, ALIGN_STRINGOPS) Save
>>  Align destination of the string operations.
>
> Instead of removing the above definitions, please rather redefine them
> in a similar way -mcpu in i386.opt is obsoleted

They were already obsoleted sixteen years ago. The warning message
was added:

    if (ix86_align_loops_string)
      {
-      i = atoi (ix86_align_loops_string);
-      if (i < 0 || i > MAX_CODE_ALIGN)
-       error ("-malign-loops=%d is not between 0 and %d", i, MAX_CODE_ALIGN);
-      else
-       ix86_align_loops = i;
+      warning ("-malign-loops is obsolete, use -falign-loops");

in the year 2001:

commit a2b35d8705efb23182c3e4b75a5e7727b6ddfc88
Author: geoffk <geoffk@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Fri May 4 06:31:27 2001 +0000

             * invoke.texi (i386 Options): Delete references to -malign-jumps,
             -malign-loops, -malign-functions.
             * i386.c (ix86_align_funcs): Delete.
             (ix86_align_loops): Delete.
             (ix86_align_jumps): Delete.
             (override_options): Mark -malign-* as obsolete.  Emulate their
             behaviour with the -falign-* options.  Default -falign-* from
             the processor table.
             * i386.h (FUNCTION_BOUNDARY): Define to 16; revert Richard Kenner's
             patch of Wed May 2 13:09:36 2001.
             (LOOP_ALIGN): Delete.
             (LOOP_ALIGN_MAX_SKIP): Delete.
             (LABEL_ALIGN_AFTER_BARRIER): Delete.
             (LABEL_ALIGN_AFTER_BARRIER_MAX_SKIP): Delete.

     git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@41825 138bc75d-0d04-0410-961f-82ee72b054a4


I would think sixteen years of receiving these warnings should be enough
for everyone to switch to the -falign options.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/3] Remove support for obsolete x86 -malign-foo options
  2017-05-06  7:22   ` Uros Bizjak
  2017-05-11 12:24     ` Denys Vlasenko
@ 2018-02-12 10:07     ` Martin Liška
  1 sibling, 0 replies; 9+ messages in thread
From: Martin Liška @ 2018-02-12 10:07 UTC (permalink / raw)
  To: Uros Bizjak, Denys Vlasenko
  Cc: gcc-patches, Andrew Pinski, Bernd Schmidt, Sandra Loosemore

On 05/06/2017 09:20 AM, Uros Bizjak wrote:
> On Tue, Apr 18, 2017 at 8:30 PM, Denys Vlasenko <dvlasenk@redhat.com> wrote:
>> 2017-04-18  Denys Vlasenko  <dvlasenk@redhat.com>
>>
>>     * config/i386/i386-common.c (ix86_handle_option): Remove support
>>     for obsolete -malign-loops, -malign-jumps and -malign-functions
>>     options.
>>     * config/i386/i386.opt: Likewise.
>> Index: gcc/common/config/i386/i386-common.c
>> ===================================================================
>> --- gcc/common/config/i386/i386-common.c        (revision 240663)
>> +++ gcc/common/config/i386/i386-common.c        (working copy)
>> @@ -998,38 +998,6 @@ ix86_handle_option (struct gcc_options *opts,
>>         }
>>        return true;
>>
>> -
>> -  /* Comes from final.c -- no real reason to change it.  */
>> -#define MAX_CODE_ALIGN 16
>> -
>> -    case OPT_malign_loops_:
>> -      warning_at (loc, 0, "-malign-loops is obsolete, use -falign-loops");
>> -      if (value > MAX_CODE_ALIGN)
>> -       error_at (loc, "-malign-loops=%d is not between 0 and %d",
>> -                 value, MAX_CODE_ALIGN);
>> -      else
>> -       opts->x_align_loops = 1 << value;
>> -      return true;
>> -
>> -    case OPT_malign_jumps_:
>> -      warning_at (loc, 0, "-malign-jumps is obsolete, use -falign-jumps");
>> -      if (value > MAX_CODE_ALIGN)
>> -       error_at (loc, "-malign-jumps=%d is not between 0 and %d",
>> -                 value, MAX_CODE_ALIGN);
>> -      else
>> -       opts->x_align_jumps = 1 << value;
>> -      return true;
>> -
>> -    case OPT_malign_functions_:
>> -      warning_at (loc, 0,
>> -                 "-malign-functions is obsolete, use -falign-functions");
>> -      if (value > MAX_CODE_ALIGN)
>> -       error_at (loc, "-malign-functions=%d is not between 0 and %d",
>> -                 value, MAX_CODE_ALIGN);
>> -      else
>> -       opts->x_align_functions = 1 << value;
>> -      return true;
>> -
>>      case OPT_mbranch_cost_:
>>        if (value > 5)
>>         {
>> Index: gcc/config/i386/i386.opt
>> ===================================================================
>> --- gcc/config/i386/i386.opt    (revision 240663)
>> +++ gcc/config/i386/i386.opt    (working copy)
>> @@ -205,18 +205,6 @@ malign-double
>>  Target Report Mask(ALIGN_DOUBLE) Save
>>  Align some doubles on dword boundary.
>>
>> -malign-functions=
>> -Target RejectNegative Joined UInteger
>> -Function starts are aligned to this power of 2.
>> -
>> -malign-jumps=
>> -Target RejectNegative Joined UInteger
>> -Jump targets are aligned to this power of 2.
>> -
>> -malign-loops=
>> -Target RejectNegative Joined UInteger
>> -Loop code aligned to this power of 2.
>> -
>>  malign-stringops
>>  Target RejectNegative Report InverseMask(NO_ALIGN_STRINGOPS, ALIGN_STRINGOPS) Save
>>  Align destination of the string operations.
> 
> Instead of removing the above definitions, please rather redefine them
> in a similar way -mcpu in i386.opt is obsoleted, e.g.:
> 
> malign-functions=
> Target RejectNegative Joined Undocumented Alias(falign-functions=)
> Warn(%<-malign-functions%> is obsolete, use %<-falign-functions%>)

Please correct me if I'm wrong, but doing the alias is not that simple: the
value of the -malign-functions option is a power-of-2 exponent, while
-falign-functions= takes an absolute value.
Thus -malign-functions=5 == -falign-functions=32.

I believe the legacy options are not a problem for the patch series, as they
only set the value of the -falign-functions option.
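
To make the exponent-versus-value mismatch above concrete, here is a small sketch of the conversion a plain Alias() could not express. The helper name malign_to_falign is hypothetical and not part of GCC; the upper bound of 16 is taken from the MAX_CODE_ALIGN check in the removed i386-common.c code, and the sketch uses assert where GCC reported an error.

```c
#include <assert.h>

/* Hypothetical helper, not part of GCC: convert the argument K of a
   legacy -malign-functions=K option (an exponent of 2) into the
   absolute value N that -falign-functions=N expects.  */
int
malign_to_falign (int k)
{
  /* The removed code rejected K above MAX_CODE_ALIGN (16).  */
  assert (k >= 0 && k <= 16);
  return 1 << k;
}
```

With this mapping, -malign-functions=5 corresponds to -falign-functions=32, as stated above.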

Martin

> 
> This cleanup should be done a long time ago, the patch can be
> committed independently of other patches in the series.
> 
> Uros.
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2018-02-12 10:07 UTC | newest]

Thread overview: 9+ messages
2017-04-18 18:30 [PATCH 0/3] Extend -falign-FOO=N to N[,M[,N2[,M2]]] version 8 Denys Vlasenko
2017-04-18 18:30 ` [PATCH 1/3] Remove support for obsolete x86 -malign-foo options Denys Vlasenko
2017-05-06  7:22   ` Uros Bizjak
2017-05-11 12:24     ` Denys Vlasenko
2018-02-12 10:07     ` Martin Liška
2017-04-18 18:30 ` [PATCH 2/3] Temporary remove "at least 8 byte alignment" code from x86 Denys Vlasenko
2017-04-18 18:46 ` [PATCH 3/3] Extend -falign-FOO=N to N[,M[,N2[,M2]]] Denys Vlasenko
2017-04-18 19:12   ` Sandra Loosemore
2017-05-05 14:40 ` [PATCH 0/3] Extend -falign-FOO=N to N[,M[,N2[,M2]]] version 8 Denys Vlasenko
