Date: Thu, 18 Jan 2024 11:33:20 +0100 (CET)
From: Richard Biener
To: Jan Hubicka
cc: gcc-patches@gcc.gnu.org
Subject: Re: Add -falign-all-functions
Message-ID: <10nqoq23-98s9-p9n4-rnp9-01n5np9051q8@fhfr.qr>
References: <93nnq110-3974-p060-sp9p-q2pn0641687p@fhfr.qr> <72ssp99q-2447-6685-8307-n06499051o12@fhfr.qr> <78r5nsrq-opq6-0spr-5043-n1n092q52995@fhfr.qr>

On Wed, 17 Jan 2024, Jan Hubicka wrote:

> > On Wed, 17 Jan 2024, Jan Hubicka wrote:
> >
> > > >
> > > > I meant the new option might be named -fmin-function-alignment=
> > > > rather than -falign-all-functions because of how it should
> > > > override all other options.
> > >
> > > I was also pondering about both names.  -falign-all-functions has the
> > > advantage that it is similar to all the other alignment flags that are
> > > all called -falign-XXX
> > >
> > > but both options are fine for me.
> > > >
> > > > Otherwise is there an updated patch to look at?
> > >
> > > I will prepare one.  So shall I drop the max-skip support for alignment
> > > and rename the flag?
> >
> > Yes.
> OK, here is updated version.
> Bootstrapped/regtested on x86_64-linux, OK?
>
> gcc/ChangeLog:
>
> 	* common.opt (flimit-function-alignment): Reorder so file is
> 	alphabetically ordered.
> 	(flimit-function-alignment): New flag.

fmin-function-alignment

OK with that change.

Thanks,
Richard.

> 	* doc/invoke.texi (-fmin-function-alignment): Document
> 	(-falign-jumps,-falign-labels): Document that this is an optimization
> 	bypassed in cold code.
> 	* varasm.cc (assemble_start_function): Honor -fmin-function-alignment.
>
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 5f0a101bccb..6e85853f086 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -1040,9 +1040,6 @@ Align the start of functions.
>  falign-functions=
>  Common RejectNegative Joined Var(str_align_functions) Optimization
>  
> -flimit-function-alignment
> -Common Var(flag_limit_function_alignment) Optimization Init(0)
> -
>  falign-jumps
>  Common Var(flag_align_jumps) Optimization
>  Align labels which are only reached by jumping.
> @@ -2277,6 +2274,10 @@ fmessage-length=
>  Common RejectNegative Joined UInteger
>  -fmessage-length=<number>	Limit diagnostics to <number> characters per line.  0 suppresses line-wrapping.
>  
> +fmin-function-alignment=
> +Common Joined RejectNegative UInteger Var(flag_min_function_alignment) Optimization
> +Align the start of every function.
> +
>  fmodulo-sched
>  Common Var(flag_modulo_sched) Optimization
>  Perform SMS based modulo scheduling before the first scheduling pass.
> @@ -2601,6 +2602,9 @@ starts and when the destructor finishes.
>  flifetime-dse=
>  Common Joined RejectNegative UInteger Var(flag_lifetime_dse) Optimization IntegerRange(0, 2)
>  
> +flimit-function-alignment
> +Common Var(flag_limit_function_alignment) Optimization Init(0)
> +
>  flive-patching
>  Common RejectNegative Alias(flive-patching=,inline-clone) Optimization
>  
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 43fd3c3a3cd..456374d9446 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -546,6 +546,7 @@ Objective-C and Objective-C++ Dialects}.
>  -falign-jumps[=@var{n}[:@var{m}:[@var{n2}[:@var{m2}]]]]
>  -falign-labels[=@var{n}[:@var{m}:[@var{n2}[:@var{m2}]]]]
>  -falign-loops[=@var{n}[:@var{m}:[@var{n2}[:@var{m2}]]]]
> +-fmin-function-alignment=[@var{n}]
>  -fno-allocation-dce  -fallow-store-data-races
>  -fassociative-math  -fauto-profile  -fauto-profile[=@var{path}]
>  -fauto-inc-dec  -fbranch-probabilities
> @@ -14177,6 +14178,9 @@ Align the start of functions to the next power-of-two greater than or
>  equal to @var{n}, skipping up to @var{m}-1 bytes.  This ensures that at
>  least the first @var{m} bytes of the function can be fetched by the CPU
>  without crossing an @var{n}-byte alignment boundary.
> +This is an optimization of code performance and alignment is ignored for
> +functions considered cold.  If alignment is required for all functions,
> +use @option{-fmin-function-alignment}.
>  
>  If @var{m} is not specified, it defaults to @var{n}.
>  
> @@ -14240,6 +14244,8 @@ Enabled at levels @option{-O2}, @option{-O3}.
>  Align loops to a power-of-two boundary.  If the loops are executed
>  many times, this makes up for any execution of the dummy padding
>  instructions.
> +This is an optimization of code performance and alignment is ignored for
> +loops considered cold.
>  
>  If @option{-falign-labels} is greater than this value, then its value
>  is used instead.
> @@ -14262,6 +14268,8 @@ Enabled at levels @option{-O2}, @option{-O3}.
>  Align branch targets to a power-of-two boundary, for branch targets
>  where the targets can only be reached by jumping.  In this case,
>  no dummy operations need be executed.
> +This is an optimization of code performance and alignment is ignored for
> +jumps considered cold.
>  
>  If @option{-falign-labels} is greater than this value, then its value
>  is used instead.
> @@ -14275,6 +14283,14 @@ The maximum allowed @var{n} option value is 65536.
>  
>  Enabled at levels @option{-O2}, @option{-O3}.
>  
> +@opindex fmin-function-alignment=@var{n}
> +@item -fmin-function-alignment
> +Specify minimal alignment of functions to the next power-of-two greater than or
> +equal to @var{n}. Unlike @option{-falign-functions} this alignment is applied
> +also to all functions (even those considered cold).  The alignment is also not
> +affected by @option{-flimit-function-alignment}
> +
> +
>  @opindex fno-allocation-dce
>  @item -fno-allocation-dce
>  Do not remove unused C++ allocations in dead code elimination.
> @@ -14371,7 +14387,7 @@ To use the link-time optimizer, @option{-flto} and optimization
>  options should be specified at compile time and during the final link.
>  It is recommended that you compile all the files participating in the
>  same link with the same options and also specify those options at
> -link time.
> +link time.
>  For example:
>  
>  @smallexample
> diff --git a/gcc/varasm.cc b/gcc/varasm.cc
> index d2c879b7da4..ccf97a5a496 100644
> --- a/gcc/varasm.cc
> +++ b/gcc/varasm.cc
> @@ -1939,11 +1939,16 @@ assemble_start_function (tree decl, const char *fnname)
>  
>    /* Tell assembler to move to target machine's alignment for functions.  */
>    align = floor_log2 (align / BITS_PER_UNIT);
> +  /* Handle forced alignment.  This really ought to apply to all functions,
> +     since it is used by patchable entries.  */
> +  if (flag_min_function_alignment && align < flag_min_function_alignment)
> +    align = flag_min_function_alignment;
> +
>    if (align > 0)
>      {
>        ASM_OUTPUT_ALIGN (asm_out_file, align);
>      }
>  
>    /* Handle a user-specified function alignment.
>       Note that we still need to align to DECL_ALIGN, as above,
>       because ASM_OUTPUT_MAX_SKIP_ALIGN might not do any alignment at all.  */
>

-- 
Richard Biener
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
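
For readers following the varasm.cc hunk: at that point `align` holds a log2 value (it was just passed through floor_log2), while the documentation hunk describes the option argument as a byte count rounded up to the next power of two. The quoted hunk compares the two values directly. The standalone sketch below shows one possible reading in which the option value is converted to a log2 before the comparison. It is an illustration only and not the patch's code; the helper names `ceil_log2_u` and `apply_min_function_alignment` are hypothetical and do not exist in GCC.

```c
#include <assert.h>

/* Hypothetical helper: smallest k such that (1 << k) >= n, for n >= 1.
   This mirrors the documented "next power-of-two greater than or equal
   to n" rounding.  */
static int
ceil_log2_u (unsigned n)
{
  assert (n >= 1 && n <= (1u << 30));
  int log = 0;
  unsigned pow = 1;
  while (pow < n)
    {
      pow <<= 1;
      log++;
    }
  return log;
}

/* Hypothetical helper.  ALIGN_LOG2 is the alignment already chosen for
   the function, as a log2 of bytes.  MIN_ALIGN_BYTES is the assumed
   meaning of the option argument: a byte count, 0 when the option is
   not given.  Returns the log2 alignment to emit.  */
static int
apply_min_function_alignment (int align_log2, unsigned min_align_bytes)
{
  if (min_align_bytes == 0)
    return align_log2;
  int min_log2 = ceil_log2_u (min_align_bytes);
  return align_log2 > min_log2 ? align_log2 : min_log2;
}
```

Under these assumptions, a function whose natural alignment is 4 bytes (log2 of 2) compiled with a minimum of 16 would get `apply_min_function_alignment (2, 16) == 4`, that is, a 16-byte boundary, while a function already aligned more strictly keeps its larger alignment.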