public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* New option -flimit-function-alignment
@ 2016-10-11 14:11 Bernd Schmidt
  2016-10-11 14:24 ` Denys Vlasenko
  0 siblings, 1 reply; 8+ messages in thread
From: Bernd Schmidt @ 2016-10-11 14:11 UTC (permalink / raw)
  To: GCC Patches, Denys Vlasenko

[-- Attachment #1: Type: text/plain, Size: 617 bytes --]

Denys has submitted some patches to add more capabilities to the 
-falign-* options, but these still have some issues, and the original 
ideas seems to have been to allow for large alignments without 
over-aligning small functions. The following patch implements that idea 
by taking into account the function size as computed during 
shorten_branches: let's say we use "-falign-functions=128 
-flimit-function-alignment", then a 15-byte function will be 16-byte 
aligned rather than 128-byte aligned.

Bootstrapped and tested on x86_64-linux. Denys, does this solve your 
problem? Anyone else, ok to commit?


Bernd

[-- Attachment #2: limit-align.diff --]
[-- Type: text/x-patch, Size: 4277 bytes --]

gcc/
	* common.opt (flimit-function-alignment): New.
	* doc/invoke.texi (-flimit-function-alignment): Document.
	* emit-rtl.h (struct rtl_data): Add max_insn_address field.
	* final.c (shorten_branches): Set it.
	* varasm.c (assemble_start_function): Limit alignment to it
	if requested.

gcc/testsuite/
	* gcc.target/i386/align-limit.c: New test.

Index: gcc/common.opt
===================================================================
--- gcc/common.opt	(revision 240861)
+++ gcc/common.opt	(working copy)
@@ -906,6 +906,9 @@ Align the start of functions.
 falign-functions=
 Common RejectNegative Joined UInteger Var(align_functions)
 
+flimit-function-alignment
+Common Report Var(flag_limit_function_alignment) Optimization Init(0)
+
 falign-jumps
 Common Report Var(align_jumps,0) Optimization UInteger
 Align labels which are only reached by jumping.
Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi	(revision 240861)
+++ gcc/doc/invoke.texi	(working copy)
@@ -368,7 +368,7 @@ Objective-C and Objective-C++ Dialects}.
 -fno-ira-share-spill-slots @gol
 -fisolate-erroneous-paths-dereference -fisolate-erroneous-paths-attribute @gol
 -fivopts -fkeep-inline-functions -fkeep-static-functions @gol
--fkeep-static-consts -flive-range-shrinkage @gol
+-fkeep-static-consts -flimit-function-alignment -flive-range-shrinkage @gol
 -floop-block -floop-interchange -floop-strip-mine @gol
 -floop-unroll-and-jam -floop-nest-optimize @gol
 -floop-parallelize-all -flra-remat -flto -flto-compression-level @gol
@@ -8262,6 +8262,11 @@ If @var{n} is not specified or is zero,
 
 Enabled at levels @option{-O2}, @option{-O3}.
 
+@item -flimit-function-alignment
+When function alignment is requested by @option{-falign-functions}, use
+a size estimate to prevent functions from being over-aligned.  This
+limits the alignment to be no larger than the function itself.
+
 @item -falign-labels
 @itemx -falign-labels=@var{n}
 @opindex falign-labels
Index: gcc/emit-rtl.h
===================================================================
--- gcc/emit-rtl.h	(revision 240861)
+++ gcc/emit-rtl.h	(working copy)
@@ -284,6 +284,9 @@ struct GTY(()) rtl_data {
      to eliminable regs (like the frame pointer) are set if an asm
      sets them.  */
   HARD_REG_SET asm_clobbers;
+
+  /* The highest address seen during shorten_branches.  */
+  unsigned HOST_WIDE_INT max_insn_address;
 };
 
 #define return_label (crtl->x_return_label)
Index: gcc/final.c
===================================================================
--- gcc/final.c	(revision 240861)
+++ gcc/final.c	(working copy)
@@ -1462,7 +1462,7 @@ shorten_branches (rtx_insn *first)
       if (!increasing)
 	break;
     }
-
+  crtl->max_insn_address = insn_current_address;
   free (varying_length);
 }
 
Index: gcc/varasm.c
===================================================================
--- gcc/varasm.c	(revision 240861)
+++ gcc/varasm.c	(working copy)
@@ -1791,9 +1791,19 @@ assemble_start_function (tree decl, cons
       && align_functions_log > align
       && optimize_function_for_speed_p (cfun))
     {
+      int align_log = align_functions_log;
+      int max_skip = align_functions - 1;
+      if (flag_limit_function_alignment && crtl->max_insn_address > 0)
+	{
+	  int size_log = ceil_log2 (crtl->max_insn_address);
+	  if (size_log < align_log)
+	    align_log = size_log;
+	  int sz = 1 << size_log;
+	  if (sz < max_skip)
+	    max_skip = sz - 1;
+	}
 #ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
-      ASM_OUTPUT_MAX_SKIP_ALIGN (asm_out_file,
-				 align_functions_log, align_functions - 1);
+      ASM_OUTPUT_MAX_SKIP_ALIGN (asm_out_file, align_log, max_skip);
 #else
       ASM_OUTPUT_ALIGN (asm_out_file, align_functions_log);
 #endif
Index: gcc/testsuite/gcc.target/i386/align-limit.c
===================================================================
--- gcc/testsuite/gcc.target/i386/align-limit.c	(nonexistent)
+++ gcc/testsuite/gcc.target/i386/align-limit.c	(working copy)
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -falign-functions=64 -flimit-function-alignment" } */
+/* { dg-final { scan-assembler ".p2align 1,,1" } } */
+/* { dg-final { scan-assembler-not ".p2align 6,,63" } } */
+
+void
+test_func (void)
+{
+}

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: New option -flimit-function-alignment
  2016-10-11 14:11 New option -flimit-function-alignment Bernd Schmidt
@ 2016-10-11 14:24 ` Denys Vlasenko
  2016-10-11 20:14   ` Bernd Schmidt
  0 siblings, 1 reply; 8+ messages in thread
From: Denys Vlasenko @ 2016-10-11 14:24 UTC (permalink / raw)
  To: Bernd Schmidt, GCC Patches



On 10/11/2016 04:11 PM, Bernd Schmidt wrote:
> Denys has submitted some patches to add more capabilities to the
> -falign-* options, but these still have some issues, and the original
> ideas seems to have been to allow for large alignments without
> over-aligning small functions. The following patch implements that
> idea by taking into account the function size as computed during
> shorten_branches: let's say we use "-falign-functions=128
> -flimit-function-alignment", then a 15-byte function will be 16-byte
> aligned rather than 128-byte aligned.
>
> Bootstrapped and tested on x86_64-linux. Denys, does this solve your
> problem?

This is better than current behavior, but this is not what I want.

15-byte function does not need to be aligned to 16 bytes on a machine
with 128-byte L1I cachelines. It needs to be aligned to 128 bytes
if there are less than 15 bytes remaining; if there are MORE than 15 bytes
remaining, why align at all? It's a waste of space.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: New option -flimit-function-alignment
  2016-10-11 14:24 ` Denys Vlasenko
@ 2016-10-11 20:14   ` Bernd Schmidt
  2016-10-12 19:27     ` Denys Vlasenko
  0 siblings, 1 reply; 8+ messages in thread
From: Bernd Schmidt @ 2016-10-11 20:14 UTC (permalink / raw)
  To: Denys Vlasenko, GCC Patches

On 10/11/2016 04:23 PM, Denys Vlasenko wrote:
>
> This is better than current behavior, but this is not what I want.
>
> 15-byte function does not need to be aligned to 16 bytes on a machine
> with 128-byte L1I cachelines. It needs to be aligned to 128 bytes
> if there are less than 15 bytes remaining; if there are MORE than 15 bytes
> remaining, why align at all? It's a waste of space.

So that sounds like you'd want to modify the max_skip value based on the 
size of the function - which would be a fairly minor change. We could 
call that -falign-functions-max-skip-size or something like that.


Bernd

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: New option -flimit-function-alignment
  2016-10-11 20:14   ` Bernd Schmidt
@ 2016-10-12 19:27     ` Denys Vlasenko
  2016-10-14 18:28       ` Bernd Schmidt
  0 siblings, 1 reply; 8+ messages in thread
From: Denys Vlasenko @ 2016-10-12 19:27 UTC (permalink / raw)
  To: Bernd Schmidt, GCC Patches

On 10/11/2016 10:14 PM, Bernd Schmidt wrote:
> On 10/11/2016 04:23 PM, Denys Vlasenko wrote:
>>
>> This is better than current behavior, but this is not what I want.
>>
>> 15-byte function does not need to be aligned to 16 bytes on a
>> machine with 128-byte L1I cachelines. It needs to be aligned to 128
>> bytes if there are less than 15 bytes remaining; if there are MORE
>> than 15 bytes remaining, why align at all? It's a waste of space.
>
> So that sounds like you'd want to modify the max_skip value based on
> the size of the function - which would be a fairly minor change.

Yes, something like "if max_skip >= func_size, temporarily lower
max_skip to func_size-1" (because otherwise we can create padding
bigger-or-equal to the entire function in size, which is stupid
- it's better to just put the function in that space).

This would be a nice.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: New option -flimit-function-alignment
  2016-10-12 19:27     ` Denys Vlasenko
@ 2016-10-14 18:28       ` Bernd Schmidt
  2016-11-07 14:18         ` Bernd Schmidt
  2016-11-16 18:23         ` Jeff Law
  0 siblings, 2 replies; 8+ messages in thread
From: Bernd Schmidt @ 2016-10-14 18:28 UTC (permalink / raw)
  To: Denys Vlasenko, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 410 bytes --]

On 10/12/2016 09:27 PM, Denys Vlasenko wrote:
> Yes, something like "if max_skip >= func_size, temporarily lower
> max_skip to func_size-1" (because otherwise we can create padding
> bigger-or-equal to the entire function in size, which is stupid
> - it's better to just put the function in that space).
>
> This would be a nice.

That would be this patch. Bootstrapped and tested on x86_64-linux, ok?


Bernd

[-- Attachment #2: limit-align-v2b.diff --]
[-- Type: text/x-patch, Size: 4191 bytes --]

gcc/
	* common.opt (flimit-function-alignment): New.
	* doc/invoke.texi (-flimit-function-alignment): Document.
	* emit-rtl.h (struct rtl_data): Add max_insn_address field.
	* final.c (shorten_branches): Set it.
	* varasm.c (assemble_start_function): Limit alignment if
	requested.

gcc/testsuite/
	* gcc.target/i386/align-limit.c: New test.

Index: gcc/common.opt
===================================================================
--- gcc/common.opt	(revision 240861)
+++ gcc/common.opt	(working copy)
@@ -906,6 +906,9 @@ Align the start of functions.
 falign-functions=
 Common RejectNegative Joined UInteger Var(align_functions)
 
+flimit-function-alignment
+Common Report Var(flag_limit_function_alignment) Optimization Init(0)
+
 falign-jumps
 Common Report Var(align_jumps,0) Optimization UInteger
 Align labels which are only reached by jumping.
Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi	(revision 240861)
+++ gcc/doc/invoke.texi	(working copy)
@@ -368,7 +368,7 @@ Objective-C and Objective-C++ Dialects}.
 -fno-ira-share-spill-slots @gol
 -fisolate-erroneous-paths-dereference -fisolate-erroneous-paths-attribute @gol
 -fivopts -fkeep-inline-functions -fkeep-static-functions @gol
--fkeep-static-consts -flive-range-shrinkage @gol
+-fkeep-static-consts -flimit-function-alignment -flive-range-shrinkage @gol
 -floop-block -floop-interchange -floop-strip-mine @gol
 -floop-unroll-and-jam -floop-nest-optimize @gol
 -floop-parallelize-all -flra-remat -flto -flto-compression-level @gol
@@ -8262,6 +8262,12 @@ If @var{n} is not specified or is zero,
 
 Enabled at levels @option{-O2}, @option{-O3}.
 
+@item -flimit-function-alignment
+If this option is enabled, the compiler tries to avoid unnecessarily
+overaligning functions. It attempts to instruct the assembler to align
+by the amount specified by @option{-falign-functions}, but not to
+skip more bytes than the size of the function.
+
 @item -falign-labels
 @itemx -falign-labels=@var{n}
 @opindex falign-labels
Index: gcc/emit-rtl.h
===================================================================
--- gcc/emit-rtl.h	(revision 240861)
+++ gcc/emit-rtl.h	(working copy)
@@ -284,6 +284,9 @@ struct GTY(()) rtl_data {
      to eliminable regs (like the frame pointer) are set if an asm
      sets them.  */
   HARD_REG_SET asm_clobbers;
+
+  /* The highest address seen during shorten_branches.  */
+  int max_insn_address;
 };
 
 #define return_label (crtl->x_return_label)
Index: gcc/final.c
===================================================================
--- gcc/final.c	(revision 240861)
+++ gcc/final.c	(working copy)
@@ -1462,7 +1462,7 @@ shorten_branches (rtx_insn *first)
       if (!increasing)
 	break;
     }
-
+  crtl->max_insn_address = insn_current_address;
   free (varying_length);
 }
 
Index: gcc/testsuite/gcc.target/i386/align-limit.c
===================================================================
--- gcc/testsuite/gcc.target/i386/align-limit.c	(nonexistent)
+++ gcc/testsuite/gcc.target/i386/align-limit.c	(working copy)
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -falign-functions=64 -flimit-function-alignment" } */
+/* { dg-final { scan-assembler ".p2align 6,,1" } } */
+/* { dg-final { scan-assembler-not ".p2align 6,,63" } } */
+
+void
+test_func (void)
+{
+}
Index: gcc/varasm.c
===================================================================
--- gcc/varasm.c	(revision 240861)
+++ gcc/varasm.c	(working copy)
@@ -1791,9 +1791,14 @@ assemble_start_function (tree decl, cons
       && align_functions_log > align
       && optimize_function_for_speed_p (cfun))
     {
+      int align_log = align_functions_log;
+      int max_skip = align_functions - 1;
+      if (flag_limit_function_alignment && crtl->max_insn_address > 0
+	  && max_skip >= crtl->max_insn_address)
+	max_skip = crtl->max_insn_address - 1;
+
 #ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
-      ASM_OUTPUT_MAX_SKIP_ALIGN (asm_out_file,
-				 align_functions_log, align_functions - 1);
+      ASM_OUTPUT_MAX_SKIP_ALIGN (asm_out_file, align_log, max_skip);
 #else
       ASM_OUTPUT_ALIGN (asm_out_file, align_functions_log);
 #endif

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: New option -flimit-function-alignment
  2016-10-14 18:28       ` Bernd Schmidt
@ 2016-11-07 14:18         ` Bernd Schmidt
  2016-11-16 18:23         ` Jeff Law
  1 sibling, 0 replies; 8+ messages in thread
From: Bernd Schmidt @ 2016-11-07 14:18 UTC (permalink / raw)
  To: Denys Vlasenko, GCC Patches

On 10/14/2016 08:28 PM, Bernd Schmidt wrote:
> On 10/12/2016 09:27 PM, Denys Vlasenko wrote:
>> Yes, something like "if max_skip >= func_size, temporarily lower
>> max_skip to func_size-1" (because otherwise we can create padding
>> bigger-or-equal to the entire function in size, which is stupid
>> - it's better to just put the function in that space).
>>
>> This would be a nice.
>
> That would be this patch. Bootstrapped and tested on x86_64-linux, ok?

Ping.
https://gcc.gnu.org/ml/gcc-patches/2016-10/msg01187.html


Bernd

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: New option -flimit-function-alignment
  2016-10-14 18:28       ` Bernd Schmidt
  2016-11-07 14:18         ` Bernd Schmidt
@ 2016-11-16 18:23         ` Jeff Law
  2016-11-24 22:28           ` Rainer Orth
  1 sibling, 1 reply; 8+ messages in thread
From: Jeff Law @ 2016-11-16 18:23 UTC (permalink / raw)
  To: Bernd Schmidt, Denys Vlasenko, GCC Patches

On 10/14/2016 12:28 PM, Bernd Schmidt wrote:
> On 10/12/2016 09:27 PM, Denys Vlasenko wrote:
>> Yes, something like "if max_skip >= func_size, temporarily lower
>> max_skip to func_size-1" (because otherwise we can create padding
>> bigger-or-equal to the entire function in size, which is stupid
>> - it's better to just put the function in that space).
>>
>> This would be a nice.
>
> That would be this patch. Bootstrapped and tested on x86_64-linux, ok?
>
>
> Bernd
>
> limit-align-v2b.diff
>
>
> gcc/
> 	* common.opt (flimit-function-alignment): New.
> 	* doc/invoke.texi (-flimit-function-alignment): Document.
> 	* emit-rtl.h (struct rtl_data): Add max_insn_address field.
> 	* final.c (shorten_branches): Set it.
> 	* varasm.c (assemble_start_function): Limit alignment if
> 	requested.
>
> gcc/testsuite/
> 	* gcc.target/i386/align-limit.c: New test.
OK.  Sorry for the long delay.

jeff

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: New option -flimit-function-alignment
  2016-11-16 18:23         ` Jeff Law
@ 2016-11-24 22:28           ` Rainer Orth
  0 siblings, 0 replies; 8+ messages in thread
From: Rainer Orth @ 2016-11-24 22:28 UTC (permalink / raw)
  To: Jeff Law; +Cc: Bernd Schmidt, Denys Vlasenko, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 1178 bytes --]

Hi Jeff,

>> gcc/
>> 	* common.opt (flimit-function-alignment): New.
>> 	* doc/invoke.texi (-flimit-function-alignment): Document.
>> 	* emit-rtl.h (struct rtl_data): Add max_insn_address field.
>> 	* final.c (shorten_branches): Set it.
>> 	* varasm.c (assemble_start_function): Limit alignment if
>> 	requested.
>>
>> gcc/testsuite/
>> 	* gcc.target/i386/align-limit.c: New test.
> OK.  Sorry for the long delay.

unfortunately, this broke Solaris 12/SPARC bootstrap with /bin/as:

/vol/gcc/src/hg/trunk/local/gcc/varasm.c: In function 'void assemble_start_function(tree, const char*)':
/vol/gcc/src/hg/trunk/local/gcc/varasm.c:1794:11: error: unused variable 'align_log' [-Werror=unused-variable]
       int align_log = align_functions_log;
           ^~~~~~~~~

The following fixes it and allowed bootstrap to complete successfully.
I'm going to commit it as obvious.

	Rainer

-- 
-----------------------------------------------------------------------------
Rainer Orth, Center for Biotechnology, Bielefeld University


2016-11-24  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

	* varasm.c (assemble_start_function): Wrap align_log definition in
	ASM_OUTPUT_MAX_SKIP_ALIGN.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: va.patch --]
[-- Type: text/x-patch, Size: 478 bytes --]

diff --git a/gcc/varasm.c b/gcc/varasm.c
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -1791,7 +1791,9 @@ assemble_start_function (tree decl, cons
       && align_functions_log > align
       && optimize_function_for_speed_p (cfun))
     {
+#ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
       int align_log = align_functions_log;
+#endif
       int max_skip = align_functions - 1;
       if (flag_limit_function_alignment && crtl->max_insn_address > 0
 	  && max_skip >= crtl->max_insn_address)

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2016-11-24 22:28 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-11 14:11 New option -flimit-function-alignment Bernd Schmidt
2016-10-11 14:24 ` Denys Vlasenko
2016-10-11 20:14   ` Bernd Schmidt
2016-10-12 19:27     ` Denys Vlasenko
2016-10-14 18:28       ` Bernd Schmidt
2016-11-07 14:18         ` Bernd Schmidt
2016-11-16 18:23         ` Jeff Law
2016-11-24 22:28           ` Rainer Orth

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).