public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <rguenther@suse.de>
To: Jan Hubicka <hubicka@ucw.cz>
Cc: gcc-patches@gcc.gnu.org
Subject: Re: Add -falign-all-functions
Date: Mon, 8 Jan 2024 13:59:29 +0100 (CET)	[thread overview]
Message-ID: <93nnq110-3974-p060-sp9p-q2pn0641687p@fhfr.qr> (raw)
In-Reply-To: <ZZbHOli2kMYK3n6r@kam.mff.cuni.cz>

On Thu, 4 Jan 2024, Jan Hubicka wrote:

> Hi,
> this patch adds new option -falign-all-functions which works like
> -falign-functions, but applies to all functions including those in cold
> regions.  As discussed in the PR log, this is needed for atomically
> patching function entries in the kernel.
> 
> An option would be to make -falign-function mandatory, but I think it is not a
> good idea, since original purpose of -falign-funtions is optimization of
> instruction decode and cache size.  Having -falign-all-functions is
> backwards compatible.  Richi also suggested extending syntax of the
> -falign-functions parameters (which is already non-trivial) but it seems
> to me that having separate flag is more readable.
> 
> Bootstrapped/regtested x86_64-linux, OK for master and later
> backports to release branches?
> 
> gcc/ChangeLog:
> 	
> 	PR middle-end/88345
> 	* common.opt: Add -falign-all-functions
> 	* doc/invoke.texi: Add -falign-all-functions.
> 	(-falign-functions, -falign-labels, -falign-loops): Document
> 	that alignment is ignored in cold code.
> 	* flags.h (align_loops): Reindent.
> 	(align_jumps): Reindent.
> 	(align_labels): Reindent.
> 	(align_functions): Reindent.
> 	(align_all_functions): New macro.
> 	* opts.cc (common_handle_option): Handle -falign-all-functions.
> 	* toplev.cc (parse_alignment_opts): Likewise.
> 	* varasm.cc (assemble_start_function): Likewise.
> 
> diff --git a/gcc/common.opt b/gcc/common.opt
> index d263a959df3..fea2c855fcf 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -1033,6 +1033,13 @@ faggressive-loop-optimizations
>  Common Var(flag_aggressive_loop_optimizations) Optimization Init(1)
>  Aggressively optimize loops using language constraints.
>  
> +falign-all-functions
> +Common Var(flag_align_all_functions) Optimization
> +Align the start of functions.

all functions

or maybe "of every function."?

> +
> +falign-all-functions=
> +Common RejectNegative Joined Var(str_align_all_functions) Optimization
> +
>  falign-functions
>  Common Var(flag_align_functions) Optimization
>  Align the start of functions.
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index d272b9228dd..ad3d75d310c 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -543,6 +543,7 @@ Objective-C and Objective-C++ Dialects}.
>  @xref{Optimize Options,,Options that Control Optimization}.
>  @gccoptlist{-faggressive-loop-optimizations
>  -falign-functions[=@var{n}[:@var{m}:[@var{n2}[:@var{m2}]]]]
> +-falign-all-functions=[@var{n}]
>  -falign-jumps[=@var{n}[:@var{m}:[@var{n2}[:@var{m2}]]]]
>  -falign-labels[=@var{n}[:@var{m}:[@var{n2}[:@var{m2}]]]]
>  -falign-loops[=@var{n}[:@var{m}:[@var{n2}[:@var{m2}]]]]
> @@ -14177,6 +14178,9 @@ Align the start of functions to the next power-of-two greater than or
>  equal to @var{n}, skipping up to @var{m}-1 bytes.  This ensures that at
>  least the first @var{m} bytes of the function can be fetched by the CPU
>  without crossing an @var{n}-byte alignment boundary.
> +This is an optimization of code performance and alignment is ignored for
> +functions considered cold.  If alignment is required for all functions,
> +use @option{-falign-all-functions}.
>  
>  If @var{m} is not specified, it defaults to @var{n}.
>  
> @@ -14210,6 +14214,12 @@ overaligning functions. It attempts to instruct the assembler to align
>  by the amount specified by @option{-falign-functions}, but not to
>  skip more bytes than the size of the function.
>  
> +@opindex falign-all-functions=@var{n}
> +@item -falign-all-functions
> +Specify minimal alignment for function entry. Unlike @option{-falign-functions}
> +this alignment is applied also to all functions (even those considered cold).
> +The alignment is also not affected by @option{-flimit-function-alignment}
> +

For functions with two entries (like on powerpc), which entry does this
apply to?  I suppose the external ABI entry, not the local one?  But
how does this then help to align the patchable entry (the common
local entry should be aligned?).  Should we align _both_ entries?

>  @opindex falign-labels
>  @item -falign-labels
>  @itemx -falign-labels=@var{n}
> @@ -14240,6 +14250,8 @@ Enabled at levels @option{-O2}, @option{-O3}.
>  Align loops to a power-of-two boundary.  If the loops are executed
>  many times, this makes up for any execution of the dummy padding
>  instructions.
> +This is an optimization of code performance and alignment is ignored for
> +loops considered cold.
>  
>  If @option{-falign-labels} is greater than this value, then its value
>  is used instead.
> @@ -14262,6 +14274,8 @@ Enabled at levels @option{-O2}, @option{-O3}.
>  Align branch targets to a power-of-two boundary, for branch targets
>  where the targets can only be reached by jumping.  In this case,
>  no dummy operations need be executed.
> +This is an optimization of code performance and alignment is ignored for
> +jumps considered cold.
>  
>  If @option{-falign-labels} is greater than this value, then its value
>  is used instead.
> @@ -14371,7 +14385,7 @@ To use the link-time optimizer, @option{-flto} and optimization
>  options should be specified at compile time and during the final link.
>  It is recommended that you compile all the files participating in the
>  same link with the same options and also specify those options at
> -link time.  
> +link time.
>  For example:
>  
>  @smallexample
> diff --git a/gcc/flags.h b/gcc/flags.h
> index e4bafa310d6..ecf4fb9e846 100644
> --- a/gcc/flags.h
> +++ b/gcc/flags.h
> @@ -89,6 +89,7 @@ public:
>    align_flags x_align_jumps;
>    align_flags x_align_labels;
>    align_flags x_align_functions;
> +  align_flags x_align_all_functions;
>  };
>  
>  extern class target_flag_state default_target_flag_state;
> @@ -98,10 +99,11 @@ extern class target_flag_state *this_target_flag_state;
>  #define this_target_flag_state (&default_target_flag_state)
>  #endif
>  
> -#define align_loops	 (this_target_flag_state->x_align_loops)
> -#define align_jumps	 (this_target_flag_state->x_align_jumps)
> -#define align_labels	 (this_target_flag_state->x_align_labels)
> -#define align_functions	 (this_target_flag_state->x_align_functions)
> +#define align_loops	    (this_target_flag_state->x_align_loops)
> +#define align_jumps	    (this_target_flag_state->x_align_jumps)
> +#define align_labels	    (this_target_flag_state->x_align_labels)
> +#define align_functions	    (this_target_flag_state->x_align_functions)
> +#define align_all_functions (this_target_flag_state->x_align_all_functions)
>  
>  /* Returns TRUE if generated code should match ABI version N or
>     greater is in use.  */
> diff --git a/gcc/opts.cc b/gcc/opts.cc
> index 7a3830caaa3..3fa521501ff 100644
> --- a/gcc/opts.cc
> +++ b/gcc/opts.cc
> @@ -3342,6 +3342,12 @@ common_handle_option (struct gcc_options *opts,
>  				&opts->x_str_align_functions);
>        break;
>  
> +    case OPT_falign_all_functions_:
> +      check_alignment_argument (loc, arg, "all-functions",
> +				&opts->x_flag_align_all_functions,
> +				&opts->x_str_align_all_functions);
> +      break;
> +
>      case OPT_ftabstop_:
>        /* It is documented that we silently ignore silly values.  */
>        if (value >= 1 && value <= 100)
> diff --git a/gcc/toplev.cc b/gcc/toplev.cc
> index 85450d97a1a..3dd6f4e1ef7 100644
> --- a/gcc/toplev.cc
> +++ b/gcc/toplev.cc
> @@ -1219,6 +1219,7 @@ parse_alignment_opts (void)
>    parse_N_M (str_align_jumps, align_jumps);
>    parse_N_M (str_align_labels, align_labels);
>    parse_N_M (str_align_functions, align_functions);
> +  parse_N_M (str_align_all_functions, align_all_functions);
>  }
>  
>  /* Process the options that have been parsed.  */
> diff --git a/gcc/varasm.cc b/gcc/varasm.cc
> index 69f8f8ee018..ddb8536a337 100644
> --- a/gcc/varasm.cc
> +++ b/gcc/varasm.cc
> @@ -1919,6 +1919,37 @@ assemble_start_function (tree decl, const char *fnname)
>        ASM_OUTPUT_ALIGN (asm_out_file, align);
>      }
>  
> +  /* Handle forced alignment.  This really ought to apply to all functions,
> +     since it is used by patchable entries.  */
> +  if (align_all_functions.levels[0].log > align)
> +    {
> +#ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
> +      int align_log = align_all_functions.levels[0].log;
> +#endif
> +      int max_skip = align_all_functions.levels[0].maxskip;
> +      if (flag_limit_function_alignment && crtl->max_insn_address > 0
> +	  && max_skip >= crtl->max_insn_address)
> +	max_skip = crtl->max_insn_address - 1;
> +
> +#ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
> +      ASM_OUTPUT_MAX_SKIP_ALIGN (asm_out_file, align_log, max_skip);
> +      if (max_skip >= (1 << align_log) - 1)
> +	align = align_functions.levels[0].log;
> +      if (max_skip == align_all_functions.levels[0].maxskip)
> +	{
> +	  ASM_OUTPUT_MAX_SKIP_ALIGN (asm_out_file,
> +				     align_all_functions.levels[1].log,
> +				     align_all_functions.levels[1].maxskip);
> +	  if (align_all_functions.levels[1].maxskip
> +	      >= (1 << align_all_functions.levels[1].log) - 1)
> +	    align = align_all_functions.levels[1].log;
> +	}
> +#else
> +      ASM_OUTPUT_ALIGN (asm_out_file, align_all_functions.levels[0].log);
> +      align = align_all_functions.levels[0].log;
> +#endif

This looks close to the handling of user-specified alignment
(factor sth out?), but I wonder if for -falign-all-functions
we should only allow a hard alignment (no max_skip) and also
not allow (but diagnose?) conflicts with limit-function-alignment?

The interaction with the other flags also doesn't seem to be
well documented?  The main docs suggest it should be
-fmin-function-alignment= which to me then suggests
-flimit-function-alignment should not have an effect on it
and even very small functions should be aligned.

Richard.

> +    }
> +
>    /* Handle a user-specified function alignment.
>       Note that we still need to align to DECL_ALIGN, as above,
>       because ASM_OUTPUT_MAX_SKIP_ALIGN might not do any alignment at all.  */
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

  reply	other threads:[~2024-01-08 13:04 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-04 14:56 Jan Hubicka
2024-01-08 12:59 ` Richard Biener [this message]
2024-01-17 13:12   ` Jan Hubicka
2024-01-17 13:22     ` Richard Biener
2024-01-17 14:00       ` Jan Hubicka
2024-01-17 14:25         ` Richard Biener
2024-01-17 15:53           ` Jan Hubicka
2024-01-18 10:33             ` Richard Biener

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=93nnq110-3974-p060-sp9p-q2pn0641687p@fhfr.qr \
    --to=rguenther@suse.de \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=hubicka@ucw.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).