From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from nikam.ms.mff.cuni.cz (nikam.ms.mff.cuni.cz [195.113.20.16]) by sourceware.org (Postfix) with ESMTPS id 57A1D3858C42 for ; Wed, 17 Jan 2024 13:12:41 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 57A1D3858C42 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=ucw.cz Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=kam.mff.cuni.cz ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 57A1D3858C42 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=195.113.20.16 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1705497163; cv=none; b=UaO/WyKZe9mz9l3pAXMbHhkc8jEnVM1aAVv6lcqT3M6lKeyjQd7yAycnieIVohUE72RM4Yb/Ju8XbNH709k+7MoVCNOhdIomYP3/SeYMOGh9IOKmPc/e634vuMMMKnBdcJZNTCIDE2y8MwcPxii8xzMnHstgB3a5R/CDdFJHbOE= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1705497163; c=relaxed/simple; bh=b/Vg7S6O6zZFSjYsv1/yd8xXQnPeMyBBPh+ee5GWVg4=; h=DKIM-Signature:Date:From:To:Subject:Message-ID:MIME-Version; b=vJxLXxHUWTnTRV4NgXmFmPg13srNz+sXqfVZ7OUHMODhOQ649BgaMoIHloOkywWN5FBlyevdt3QhB6f7ZYPztXhrk1BJh2Px7tZ6w6hom0Sd8b2maPBvQ/ZLV6ZZBf6Y4SZqKkadO+BHByKfYE/Y8UwE50IEuTK+nVvoA1E6O1o= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by nikam.ms.mff.cuni.cz (Postfix, from userid 16202) id 75981283B52; Wed, 17 Jan 2024 14:12:40 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ucw.cz; s=gen1; t=1705497160; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=3LgPtEkNFXcUT9R4+/kC1GdDLLVgHMKD/r82nB5wlBc=; b=Ag6atK9Jq9zhsHTXgooQ2MNkSzZqCyu7KOOuYNHoaTKnWpDJdas7eIcDCA+nxgftN4G4u2 S0JyAxjLzJGzq/HLoSIIKf1M6uElUrs82ZuNlpaQ5OQdI6YeHaycjqx8G7WWcAvi944K0H t6jKfkmzQ/RE09MY8RHD9rq3s0MwDUY= Date: Wed, 17 Jan 2024 14:12:40 +0100 From: Jan Hubicka To: Richard Biener Cc: gcc-patches@gcc.gnu.org Subject: Re: Add -falign-all-functions Message-ID: References: <93nnq110-3974-p060-sp9p-q2pn0641687p@fhfr.qr> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <93nnq110-3974-p060-sp9p-q2pn0641687p@fhfr.qr> X-Spam-Status: No, score=-10.9 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,GIT_PATCH_0,HEADER_FROM_DIFFERENT_DOMAINS,JMQ_SPF_NEUTRAL,RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: > > +falign-all-functions > > +Common Var(flag_align_all_functions) Optimization > > +Align the start of functions. > > all functions > > or maybe "of every function."? Fixed, thanks! > > +@opindex falign-all-functions=@var{n} > > +@item -falign-all-functions > > +Specify minimal alignment for function entry. Unlike @option{-falign-functions} > > +this alignment is applied also to all functions (even those considered cold). > > +The alignment is also not affected by @option{-flimit-function-alignment} > > + > > For functions with two entries (like on powerpc), which entry does this > apply to? I suppose the external ABI entry, not the local one? But > how does this then help to align the patchable entry (the common > local entry should be aligned?). Should we align _both_ entries? To be honest I did not really know we actually would like to patch alternative entry points. The function alignent is always produced before the start of function, so the first entry point wins and the other entry point is not aligned. Aligning later labels needs to go using the label align code, since theoretically some targets need to do relaxation over it. In final.cc we do no function alignments on labels those labels. I guess this makes sense because if we align for performance, we probably do not want the altenrate entry point to be aligned since it appears close to original one. I can add that to compute_alignment: test if label is alternative entry point and add alignment. I wonder if that is a desired behaviour though and is this code path even used? I know this was originally added to support i386 register passing conventions and stack alignment via alternative entry point, but it was never really used that way. Also there was plan to support Fortran alternative entry point. Looking at what rs6000 does, it seems to not use the RTL representation of alternative entry points. it seems that: 1) call assemble_start_functions which a) outputs function alignment b) outputs start label c) calls print_patchable_function_entry 2) call final_start_functions which calls output_function_prologue. In rs6000 there is second call to rs6000_print_patchable_function_entry So there is no target-independent place where alignment can be added, so I would say it is up to rs6000 maintainers to decide what is right here :) > > > @opindex falign-labels > > @item -falign-labels > > @itemx -falign-labels=@var{n} > > @@ -14240,6 +14250,8 @@ Enabled at levels @option{-O2}, @option{-O3}. > > Align loops to a power-of-two boundary. If the loops are executed > > many times, this makes up for any execution of the dummy padding > > instructions. > > +This is an optimization of code performance and alignment is ignored for > > +loops considered cold. > > > > If @option{-falign-labels} is greater than this value, then its value > > is used instead. > > @@ -14262,6 +14274,8 @@ Enabled at levels @option{-O2}, @option{-O3}. > > Align branch targets to a power-of-two boundary, for branch targets > > where the targets can only be reached by jumping. In this case, > > no dummy operations need be executed. > > +This is an optimization of code performance and alignment is ignored for > > +jumps considered cold. > > > > If @option{-falign-labels} is greater than this value, then its value > > is used instead. > > @@ -14371,7 +14385,7 @@ To use the link-time optimizer, @option{-flto} and optimization > > options should be specified at compile time and during the final link. > > It is recommended that you compile all the files participating in the > > same link with the same options and also specify those options at > > -link time. > > +link time. > > For example: > > > > @smallexample > > diff --git a/gcc/flags.h b/gcc/flags.h > > index e4bafa310d6..ecf4fb9e846 100644 > > --- a/gcc/flags.h > > +++ b/gcc/flags.h > > @@ -89,6 +89,7 @@ public: > > align_flags x_align_jumps; > > align_flags x_align_labels; > > align_flags x_align_functions; > > + align_flags x_align_all_functions; > > }; > > > > extern class target_flag_state default_target_flag_state; > > @@ -98,10 +99,11 @@ extern class target_flag_state *this_target_flag_state; > > #define this_target_flag_state (&default_target_flag_state) > > #endif > > > > -#define align_loops (this_target_flag_state->x_align_loops) > > -#define align_jumps (this_target_flag_state->x_align_jumps) > > -#define align_labels (this_target_flag_state->x_align_labels) > > -#define align_functions (this_target_flag_state->x_align_functions) > > +#define align_loops (this_target_flag_state->x_align_loops) > > +#define align_jumps (this_target_flag_state->x_align_jumps) > > +#define align_labels (this_target_flag_state->x_align_labels) > > +#define align_functions (this_target_flag_state->x_align_functions) > > +#define align_all_functions (this_target_flag_state->x_align_all_functions) > > > > /* Returns TRUE if generated code should match ABI version N or > > greater is in use. */ > > diff --git a/gcc/opts.cc b/gcc/opts.cc > > index 7a3830caaa3..3fa521501ff 100644 > > --- a/gcc/opts.cc > > +++ b/gcc/opts.cc > > @@ -3342,6 +3342,12 @@ common_handle_option (struct gcc_options *opts, > > &opts->x_str_align_functions); > > break; > > > > + case OPT_falign_all_functions_: > > + check_alignment_argument (loc, arg, "all-functions", > > + &opts->x_flag_align_all_functions, > > + &opts->x_str_align_all_functions); > > + break; > > + > > case OPT_ftabstop_: > > /* It is documented that we silently ignore silly values. */ > > if (value >= 1 && value <= 100) > > diff --git a/gcc/toplev.cc b/gcc/toplev.cc > > index 85450d97a1a..3dd6f4e1ef7 100644 > > --- a/gcc/toplev.cc > > +++ b/gcc/toplev.cc > > @@ -1219,6 +1219,7 @@ parse_alignment_opts (void) > > parse_N_M (str_align_jumps, align_jumps); > > parse_N_M (str_align_labels, align_labels); > > parse_N_M (str_align_functions, align_functions); > > + parse_N_M (str_align_all_functions, align_all_functions); > > } > > > > /* Process the options that have been parsed. */ > > diff --git a/gcc/varasm.cc b/gcc/varasm.cc > > index 69f8f8ee018..ddb8536a337 100644 > > --- a/gcc/varasm.cc > > +++ b/gcc/varasm.cc > > @@ -1919,6 +1919,37 @@ assemble_start_function (tree decl, const char *fnname) > > ASM_OUTPUT_ALIGN (asm_out_file, align); > > } > > > > + /* Handle forced alignment. This really ought to apply to all functions, > > + since it is used by patchable entries. */ > > + if (align_all_functions.levels[0].log > align) > > + { > > +#ifdef ASM_OUTPUT_MAX_SKIP_ALIGN > > + int align_log = align_all_functions.levels[0].log; > > +#endif > > + int max_skip = align_all_functions.levels[0].maxskip; > > + if (flag_limit_function_alignment && crtl->max_insn_address > 0 > > + && max_skip >= crtl->max_insn_address) > > + max_skip = crtl->max_insn_address - 1; > > + > > +#ifdef ASM_OUTPUT_MAX_SKIP_ALIGN > > + ASM_OUTPUT_MAX_SKIP_ALIGN (asm_out_file, align_log, max_skip); > > + if (max_skip >= (1 << align_log) - 1) > > + align = align_functions.levels[0].log; > > + if (max_skip == align_all_functions.levels[0].maxskip) > > + { > > + ASM_OUTPUT_MAX_SKIP_ALIGN (asm_out_file, > > + align_all_functions.levels[1].log, > > + align_all_functions.levels[1].maxskip); > > + if (align_all_functions.levels[1].maxskip > > + >= (1 << align_all_functions.levels[1].log) - 1) > > + align = align_all_functions.levels[1].log; > > + } > > +#else > > + ASM_OUTPUT_ALIGN (asm_out_file, align_all_functions.levels[0].log); > > + align = align_all_functions.levels[0].log; > > +#endif > > This looks close to the handling of user-specified alignment > (factor sth out?), but I wonder if for -falign-all-functions > we should only allow a hard alignment (no max_skip) and also > not allow (but diagnose?) conflicts with limit-function-alignment? Well, I do not see much use of the max-skip feature, but it seemed to me that perhaps it is better to keep the command line options symmetric. Actually my first iteration of the patch supported only one value, but then I kind of convinced myself that symmetricity is better. I am definitely fine with supporting only one alignment with no max-skip. > > The interaction with the other flags also doesn't seem to be > well documented? The main docs suggest it should be > -fmin-function-alignment= which to me then suggests Hmm, I do not see any mention of min-function-algnment > -flimit-function-alignment should not have an effect on it > and even very small functions should be aligned. I write that it is not affected by limit-function-alignment @opindex falign-all-functions=@var{n} @item -falign-all-functions Specify minimal alignment for function entry. Unlike @option{-falign-functions} this alignment is applied also to all functions (even those considered cold). The alignment is also not affected by @option{-flimit-function-alignment} Because indeed that would break the atomicity of updates. Honza > > Richard. > > > + } > > + > > /* Handle a user-specified function alignment. > > Note that we still need to align to DECL_ALIGN, as above, > > because ASM_OUTPUT_MAX_SKIP_ALIGN might not do any alignment at all. */ > > > > -- > Richard Biener > SUSE Software Solutions Germany GmbH, > Frankenstrasse 146, 90461 Nuernberg, Germany; > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)