From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out1.suse.de (smtp-out1.suse.de [IPv6:2a07:de40:b251:101:10:150:64:1]) by sourceware.org (Postfix) with ESMTPS id 61783385840C for ; Wed, 17 Jan 2024 13:27:27 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 61783385840C Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.de ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 61783385840C Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a07:de40:b251:101:10:150:64:1 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1705498053; cv=none; b=w6bha0LyQrv5LWuJkrBKJi7nPogf/AkQMyYwCyx9W7eQUTGAs80jn2dtgb249sb9la9c0ZXMeNZJ8jjpjrP6j+RC8EMuDiW4/TPi2uhhgk7LsCbq5O0GFkQB58HLyIuSvo2jmTIbNpPsV3S5lD/XJELHQ/nVMbKLwI6N0nYGf2k= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1705498053; c=relaxed/simple; bh=jBVdIDCOio8H3Zz7aWg4nDR86enk2Ys0cvkSh1Jddqg=; h=DKIM-Signature:DKIM-Signature:DKIM-Signature:DKIM-Signature:Date: From:To:Subject:Message-ID:MIME-Version; b=JJFK1VDq+e5kvkRONlPX45oOefvmh6YpKMcEEKMwVTtv1LTvVK4Hmal8aUzLbKBOk3e6x1JUbZCgY/72konU2TLrP3bR0JT+5Gv/PzlHTJlDzHZnJS3tjhDFqBHPV/55ZyrdTva3uzscdtwa4vCqZFMxwJ1Oi4P9qwiSVrygyEk= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from [10.168.4.150] (unknown [10.168.4.150]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 0177721FD0; Wed, 17 Jan 2024 13:27:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1705498046; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=zTHI0NS/Q0mb5rI/FrTVbS0iPwTmopJMDxoLRhAJqv0=; b=cvEEcV8K9027eZZ4Vv/fNN7LbOCv24wpC65A6/kwQbvtFNA7zTUgXvH9HV3AwImdaPV25c VeSgp+MdBvl5a7Mvd3Xmsl7YicTIa7hcOsWVQijvV+qqt4TAuuvyKiP2+9Uo87sihCTYCN ixKSTLOwB5DcOlB/HTG5tJnumZQT4Mo= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1705498046; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=zTHI0NS/Q0mb5rI/FrTVbS0iPwTmopJMDxoLRhAJqv0=; b=s2s4dT6PU/GilZYmU+qBk8Kt7EyPQ3yQGufLOCUijtNdWhCIEFhtVfA86hZva6eNLEaNcF gxpTIqZCAIkMSjDw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1705498046; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=zTHI0NS/Q0mb5rI/FrTVbS0iPwTmopJMDxoLRhAJqv0=; b=cvEEcV8K9027eZZ4Vv/fNN7LbOCv24wpC65A6/kwQbvtFNA7zTUgXvH9HV3AwImdaPV25c VeSgp+MdBvl5a7Mvd3Xmsl7YicTIa7hcOsWVQijvV+qqt4TAuuvyKiP2+9Uo87sihCTYCN ixKSTLOwB5DcOlB/HTG5tJnumZQT4Mo= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1705498046; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=zTHI0NS/Q0mb5rI/FrTVbS0iPwTmopJMDxoLRhAJqv0=; b=s2s4dT6PU/GilZYmU+qBk8Kt7EyPQ3yQGufLOCUijtNdWhCIEFhtVfA86hZva6eNLEaNcF gxpTIqZCAIkMSjDw== Date: Wed, 17 Jan 2024 14:22:18 +0100 (CET) From: Richard Biener To: Jan Hubicka cc: gcc-patches@gcc.gnu.org Subject: Re: Add -falign-all-functions In-Reply-To: Message-ID: <72ssp99q-2447-6685-8307-n06499051o12@fhfr.qr> References: <93nnq110-3974-p060-sp9p-q2pn0641687p@fhfr.qr> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Authentication-Results: smtp-out1.suse.de; none X-Spam-Level: X-Spam-Score: -4.27 X-Spamd-Result: default: False [-4.27 / 50.00]; ARC_NA(0.00)[]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; NEURAL_HAM_LONG(-0.99)[-0.987]; MIME_GOOD(-0.10)[text/plain]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; NEURAL_HAM_SHORT(-0.19)[-0.930]; RCPT_COUNT_TWO(0.00)[2]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:email]; FUZZY_BLOCKED(0.00)[rspamd.com]; RCVD_COUNT_ZERO(0.00)[0]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; BAYES_HAM(-3.00)[100.00%] X-Spam-Status: No, score=-11.2 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Wed, 17 Jan 2024, Jan Hubicka wrote: > > > +falign-all-functions > > > +Common Var(flag_align_all_functions) Optimization > > > +Align the start of functions. > > > > all functions > > > > or maybe "of every function."? > > Fixed, thanks! > > > +@opindex falign-all-functions=@var{n} > > > +@item -falign-all-functions > > > +Specify minimal alignment for function entry. Unlike @option{-falign-functions} > > > +this alignment is applied also to all functions (even those considered cold). > > > +The alignment is also not affected by @option{-flimit-function-alignment} > > > + > > > > For functions with two entries (like on powerpc), which entry does this > > apply to? I suppose the external ABI entry, not the local one? But > > how does this then help to align the patchable entry (the common > > local entry should be aligned?). Should we align _both_ entries? > > To be honest I did not really know we actually would like to patch > alternative entry points. > The function alignent is always produced before the start of function, > so the first entry point wins and the other entry point is not aligned. > > Aligning later labels needs to go using the label align code, since > theoretically some targets need to do relaxation over it. > > In final.cc we do no function alignments on labels those labels. > I guess this makes sense because if we align for performance, we > probably do not want the altenrate entry point to be aligned since it > appears close to original one. I can add that to compute_alignment: > test if label is alternative entry point and add alignment. > I wonder if that is a desired behaviour though and is this code > path even used? > > I know this was originally added to support i386 register passing > conventions and stack alignment via alternative entry point, but it was > never really used that way. Also there was plan to support Fortran > alternative entry point. > > Looking at what rs6000 does, it seems to not use the RTL representation > of alternative entry points. it seems that: > 1) call assemble_start_functions which > a) outputs function alignment > b) outputs start label > c) calls print_patchable_function_entry > 2) call final_start_functions which calls output_function_prologue. > In rs6000 there is second call to > rs6000_print_patchable_function_entry > So there is no target-independent place where alignment can be added, > so I would say it is up to rs6000 maintainers to decide what is right > here :) Fair enough ... > > > > > @opindex falign-labels > > > @item -falign-labels > > > @itemx -falign-labels=@var{n} > > > @@ -14240,6 +14250,8 @@ Enabled at levels @option{-O2}, @option{-O3}. > > > Align loops to a power-of-two boundary. If the loops are executed > > > many times, this makes up for any execution of the dummy padding > > > instructions. > > > +This is an optimization of code performance and alignment is ignored for > > > +loops considered cold. > > > > > > If @option{-falign-labels} is greater than this value, then its value > > > is used instead. > > > @@ -14262,6 +14274,8 @@ Enabled at levels @option{-O2}, @option{-O3}. > > > Align branch targets to a power-of-two boundary, for branch targets > > > where the targets can only be reached by jumping. In this case, > > > no dummy operations need be executed. > > > +This is an optimization of code performance and alignment is ignored for > > > +jumps considered cold. > > > > > > If @option{-falign-labels} is greater than this value, then its value > > > is used instead. > > > @@ -14371,7 +14385,7 @@ To use the link-time optimizer, @option{-flto} and optimization > > > options should be specified at compile time and during the final link. > > > It is recommended that you compile all the files participating in the > > > same link with the same options and also specify those options at > > > -link time. > > > +link time. > > > For example: > > > > > > @smallexample > > > diff --git a/gcc/flags.h b/gcc/flags.h > > > index e4bafa310d6..ecf4fb9e846 100644 > > > --- a/gcc/flags.h > > > +++ b/gcc/flags.h > > > @@ -89,6 +89,7 @@ public: > > > align_flags x_align_jumps; > > > align_flags x_align_labels; > > > align_flags x_align_functions; > > > + align_flags x_align_all_functions; > > > }; > > > > > > extern class target_flag_state default_target_flag_state; > > > @@ -98,10 +99,11 @@ extern class target_flag_state *this_target_flag_state; > > > #define this_target_flag_state (&default_target_flag_state) > > > #endif > > > > > > -#define align_loops (this_target_flag_state->x_align_loops) > > > -#define align_jumps (this_target_flag_state->x_align_jumps) > > > -#define align_labels (this_target_flag_state->x_align_labels) > > > -#define align_functions (this_target_flag_state->x_align_functions) > > > +#define align_loops (this_target_flag_state->x_align_loops) > > > +#define align_jumps (this_target_flag_state->x_align_jumps) > > > +#define align_labels (this_target_flag_state->x_align_labels) > > > +#define align_functions (this_target_flag_state->x_align_functions) > > > +#define align_all_functions (this_target_flag_state->x_align_all_functions) > > > > > > /* Returns TRUE if generated code should match ABI version N or > > > greater is in use. */ > > > diff --git a/gcc/opts.cc b/gcc/opts.cc > > > index 7a3830caaa3..3fa521501ff 100644 > > > --- a/gcc/opts.cc > > > +++ b/gcc/opts.cc > > > @@ -3342,6 +3342,12 @@ common_handle_option (struct gcc_options *opts, > > > &opts->x_str_align_functions); > > > break; > > > > > > + case OPT_falign_all_functions_: > > > + check_alignment_argument (loc, arg, "all-functions", > > > + &opts->x_flag_align_all_functions, > > > + &opts->x_str_align_all_functions); > > > + break; > > > + > > > case OPT_ftabstop_: > > > /* It is documented that we silently ignore silly values. */ > > > if (value >= 1 && value <= 100) > > > diff --git a/gcc/toplev.cc b/gcc/toplev.cc > > > index 85450d97a1a..3dd6f4e1ef7 100644 > > > --- a/gcc/toplev.cc > > > +++ b/gcc/toplev.cc > > > @@ -1219,6 +1219,7 @@ parse_alignment_opts (void) > > > parse_N_M (str_align_jumps, align_jumps); > > > parse_N_M (str_align_labels, align_labels); > > > parse_N_M (str_align_functions, align_functions); > > > + parse_N_M (str_align_all_functions, align_all_functions); > > > } > > > > > > /* Process the options that have been parsed. */ > > > diff --git a/gcc/varasm.cc b/gcc/varasm.cc > > > index 69f8f8ee018..ddb8536a337 100644 > > > --- a/gcc/varasm.cc > > > +++ b/gcc/varasm.cc > > > @@ -1919,6 +1919,37 @@ assemble_start_function (tree decl, const char *fnname) > > > ASM_OUTPUT_ALIGN (asm_out_file, align); > > > } > > > > > > + /* Handle forced alignment. This really ought to apply to all functions, > > > + since it is used by patchable entries. */ > > > + if (align_all_functions.levels[0].log > align) > > > + { > > > +#ifdef ASM_OUTPUT_MAX_SKIP_ALIGN > > > + int align_log = align_all_functions.levels[0].log; > > > +#endif > > > + int max_skip = align_all_functions.levels[0].maxskip; > > > + if (flag_limit_function_alignment && crtl->max_insn_address > 0 > > > + && max_skip >= crtl->max_insn_address) > > > + max_skip = crtl->max_insn_address - 1; > > > + > > > +#ifdef ASM_OUTPUT_MAX_SKIP_ALIGN > > > + ASM_OUTPUT_MAX_SKIP_ALIGN (asm_out_file, align_log, max_skip); > > > + if (max_skip >= (1 << align_log) - 1) > > > + align = align_functions.levels[0].log; > > > + if (max_skip == align_all_functions.levels[0].maxskip) > > > + { > > > + ASM_OUTPUT_MAX_SKIP_ALIGN (asm_out_file, > > > + align_all_functions.levels[1].log, > > > + align_all_functions.levels[1].maxskip); > > > + if (align_all_functions.levels[1].maxskip > > > + >= (1 << align_all_functions.levels[1].log) - 1) > > > + align = align_all_functions.levels[1].log; > > > + } > > > +#else > > > + ASM_OUTPUT_ALIGN (asm_out_file, align_all_functions.levels[0].log); > > > + align = align_all_functions.levels[0].log; > > > +#endif > > > > This looks close to the handling of user-specified alignment > > (factor sth out?), but I wonder if for -falign-all-functions > > we should only allow a hard alignment (no max_skip) and also > > not allow (but diagnose?) conflicts with limit-function-alignment? > > Well, I do not see much use of the max-skip feature, but it seemed to me > that perhaps it is better to keep the command line options symmetric. > > Actually my first iteration of the patch supported only one value, but > then I kind of convinced myself that symmetricity is better. > I am definitely fine with supporting only one alignment with no > max-skip. > > > > The interaction with the other flags also doesn't seem to be > > well documented? The main docs suggest it should be > > -fmin-function-alignment= which to me then suggests > Hmm, I do not see any mention of min-function-algnment I meant the new option might be named -fmin-function-alignment= rather than -falign-all-functions because of how it should override all other options. Otherwise is there an updated patch to look at? Richard. > > -flimit-function-alignment should not have an effect on it > > and even very small functions should be aligned. > > I write that it is not affected by limit-function-alignment > @opindex falign-all-functions=@var{n} > @item -falign-all-functions > Specify minimal alignment for function entry. Unlike @option{-falign-functions} > this alignment is applied also to all functions (even those considered cold). > The alignment is also not affected by @option{-flimit-function-alignment} > > Because indeed that would break the atomicity of updates. > Honza > > > > Richard. > > > > > + } > > > + > > > /* Handle a user-specified function alignment. > > > Note that we still need to align to DECL_ALIGN, as above, > > > because ASM_OUTPUT_MAX_SKIP_ALIGN might not do any alignment at all. */ > > > > > > > -- > > Richard Biener > > SUSE Software Solutions Germany GmbH, > > Frankenstrasse 146, 90461 Nuernberg, Germany; > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg) > -- Richard Biener SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)