From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from nikam.ms.mff.cuni.cz (nikam.ms.mff.cuni.cz [195.113.20.16]) by sourceware.org (Postfix) with ESMTPS id 2ED3F3858D3C; Fri, 7 Oct 2022 12:56:41 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 2ED3F3858D3C Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=ucw.cz Authentication-Results: sourceware.org; spf=none smtp.mailfrom=kam.mff.cuni.cz Received: by nikam.ms.mff.cuni.cz (Postfix, from userid 16202) id 4A5F52804A2; Fri, 7 Oct 2022 14:56:39 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ucw.cz; s=gen1; t=1665147399; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=2Io33gVWkeiyrbEslbj+DRygwVqEp/q9xUgBCwapIHs=; b=QMBtJW6/CzGVXzHVeCymho5Vsw/W92JYXUBCE4aFrdIwvT0AZnp/XdJZ5i8oKb2Nc4b67u MHMgzEsJod818dDrr/uvYy17Gdds06zYBcqsDLFpviQ4L0VJ4r5KOTQPHHl6jIKjJ1iVko u7our63eL6u6qu0OcaFGSYUmyeBP4cg= Date: Fri, 7 Oct 2022 14:56:39 +0200 From: Jan Hubicka To: Richard Biener Cc: Kito Cheng , gcc-patches@gcc.gnu.org, kito.cheng@gmail.com, msebor@gcc.gnu.org, jakub@gcc.gnu.org, rguenth@gcc.gnu.org, koen.zandberg@inria.fr, andy.chiu@sifive.com, guoren@kernel.org, greentime.hu@sifive.com, palmer@dabbelt.com, Monk Chiang Subject: Re: [PATCH] PR middle-end/88345: Honor -falign-functions=N even optimized for size. Message-ID: References: <20221007040325.21276-1-kito.cheng@sifive.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,GIT_PATCH_0,HEADER_FROM_DIFFERENT_DOMAINS,KAM_SHORT,RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: > On Fri, Oct 7, 2022 at 6:04 AM Kito Cheng wrote: > > > > From: Monk Chiang > > > > Currnetly setting of -falign-functions=N will be ignored if the function > > is optimized for size or marked as cold function. > > > > However function alignment requirement is needed even optimized for > > size in some situations, RISC-V target is an example, RISC-V kernel implement > > patchable function that require function must be align to at least 4 bytes for > > atomicly patch the function prologue. > > > > Here is 4 way to fix that: > > 1. Fix -falign-functions=N, let it work as expect on -Os and all cold > > functions, which is this patch. > > 2. Force align to 4 byte if -fpatchable-function-entry is given by adjust > > RISC-V's FUNCTION_BOUNDARY. > > 3. Adjust RISC-V's FUNCTION_BOUNDARY to let it honor to -falign-functions=N. > > 4. Adding a -malign-functions=N for RISC-V...which x86 already deprecated that. > > > > And this patch is the first approach. > > The behavior changed with r0-42853-g194734e9e5501f but documentation > wasn't changed to reflect that -falign-functions=N is now only a hint. > > I'm not sure in what circumstances users are expected to use -falign-functions > and whether if it is for ABI reasons (as in this case) we should instead have > a -malign-functions or other magic. > > Honza - any comment on your change? This was done a long time ago and -falign-functions was/is about performance. The basic idea of the patch was to use minimal required alignment for -Os and cold functions while use -falign-functions for functions optimized for speed. This is not different what we do for other optimization options. i386 targets define quite large alignments (especially for older ones that needed function entry to be separated by given number of instructions from cache page boundary) so enabling it for all functions unconditionally is expensive (in code size). We have FUNCTION_BOUNDARY to specify minimal function alignment required by the ABI. So I think if live patches requires bigger alignment, I would go with adjusting FUNCTION_BOUNDARY with -flive-patching. Alternatively we could introduce -malign-all-function or -falign-all-functions to adjust minimal alignment if this is specific to a particular implementation of live patching in the kernel. Honza > > Thanks, > Richard. > > > gcc/ChangeLog: > > > > PR middle-end/88345 > > * varasm.cc (assemble_start_function): Adjust function align > > even optimized for size. > > * doc/invoke.texi (Os): Remove -falign-functions= from the exclusion > > list of -Os. > > > > gcc/testsuite/ChangeLog: > > > > PR middle-end/88345 > > * gcc.target/i386/pr88345-1.c: New. > > * gcc.target/i386/pr88345-2.c: Ditto. > > * gcc.target/riscv/pr88345-1.c: Ditto. > > * gcc.target/riscv/pr88345-2.c: Ditto. > > --- > > gcc/doc/invoke.texi | 2 +- > > gcc/testsuite/gcc.target/i386/pr88345-1.c | 5 +++++ > > gcc/testsuite/gcc.target/i386/pr88345-2.c | 5 +++++ > > gcc/testsuite/gcc.target/riscv/pr88345-1.c | 5 +++++ > > gcc/testsuite/gcc.target/riscv/pr88345-2.c | 5 +++++ > > gcc/varasm.cc | 3 +-- > > 6 files changed, 22 insertions(+), 3 deletions(-) > > create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-1.c > > create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-2.c > > create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-1.c > > create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-2.c > > > > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > > index a2b0b9636f0..acf98c68825 100644 > > --- a/gcc/doc/invoke.texi > > +++ b/gcc/doc/invoke.texi > > @@ -11381,7 +11381,7 @@ results. This is the default. > > Optimize for size. @option{-Os} enables all @option{-O2} optimizations > > except those that often increase code size: > > > > -@gccoptlist{-falign-functions -falign-jumps @gol > > +@gccoptlist{-falign-jumps @gol > > -falign-labels -falign-loops @gol > > -fprefetch-loop-arrays -freorder-blocks-algorithm=stc} > > > > diff --git a/gcc/testsuite/gcc.target/i386/pr88345-1.c b/gcc/testsuite/gcc.target/i386/pr88345-1.c > > new file mode 100644 > > index 00000000000..226bb9ffc82 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/i386/pr88345-1.c > > @@ -0,0 +1,5 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-falign-functions=128" } */ > > +/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */ > > + > > +__attribute__((__cold__)) void a() {} > > diff --git a/gcc/testsuite/gcc.target/i386/pr88345-2.c b/gcc/testsuite/gcc.target/i386/pr88345-2.c > > new file mode 100644 > > index 00000000000..a7fc3b162dd > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/i386/pr88345-2.c > > @@ -0,0 +1,5 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-falign-functions=128 -Os" } */ > > +/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */ > > + > > +void a() {} > > diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-1.c b/gcc/testsuite/gcc.target/riscv/pr88345-1.c > > new file mode 100644 > > index 00000000000..7d5afe683eb > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/riscv/pr88345-1.c > > @@ -0,0 +1,5 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-falign-functions=128" } */ > > +/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */ > > + > > +__attribute__((__cold__)) void a() {} > > diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-2.c b/gcc/testsuite/gcc.target/riscv/pr88345-2.c > > new file mode 100644 > > index 00000000000..d4fc89d86d8 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/riscv/pr88345-2.c > > @@ -0,0 +1,5 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-falign-functions=128 -Os" } */ > > +/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */ > > + > > +void a() {} > > diff --git a/gcc/varasm.cc b/gcc/varasm.cc > > index 423f3f91af8..4001648b214 100644 > > --- a/gcc/varasm.cc > > +++ b/gcc/varasm.cc > > @@ -1923,8 +1923,7 @@ assemble_start_function (tree decl, const char *fnname) > > Note that we still need to align to DECL_ALIGN, as above, > > because ASM_OUTPUT_MAX_SKIP_ALIGN might not do any alignment at all. */ > > if (! DECL_USER_ALIGN (decl) > > - && align_functions.levels[0].log > align > > - && optimize_function_for_speed_p (cfun)) > > + && align_functions.levels[0].log > align) > > { > > #ifdef ASM_OUTPUT_MAX_SKIP_ALIGN > > int align_log = align_functions.levels[0].log; > > -- > > 2.37.2 > >