public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] PR middle-end/88345: Honor -falign-functions=N even optimized for size.
@ 2022-10-07  4:03 Kito Cheng
  2022-10-07  4:52 ` Palmer Dabbelt
  2022-10-07  6:32 ` Richard Biener
  0 siblings, 2 replies; 5+ messages in thread
From: Kito Cheng @ 2022-10-07  4:03 UTC (permalink / raw)
  To: gcc-patches, kito.cheng, msebor, jakub, rguenth, koen.zandberg,
	andy.chiu, guoren, greentime.hu, palmer
  Cc: Monk Chiang

From: Monk Chiang <monk.chiang@sifive.com>

Currnetly setting of -falign-functions=N will be ignored if the function
is optimized for size or marked as cold function.

However function alignment requirement is needed even optimized for
size in some situations, RISC-V target is an example, RISC-V kernel implement
patchable function that require function must be align to at least 4 bytes for
atomicly patch the function prologue.

Here is 4 way to fix that:
1. Fix -falign-functions=N, let it work as expect on -Os and all cold
functions, which is this patch.
2. Force align to 4 byte if -fpatchable-function-entry is given by adjust
RISC-V's FUNCTION_BOUNDARY.
3. Adjust RISC-V's FUNCTION_BOUNDARY to let it honor to -falign-functions=N.
4. Adding a -malign-functions=N for RISC-V...which x86 already deprecated that.

And this patch is the first approach.

gcc/ChangeLog:

	PR middle-end/88345
	* varasm.cc (assemble_start_function): Adjust function align
	even optimized for size.
	* doc/invoke.texi (Os): Remove -falign-functions= from the exclusion
	list of -Os.

gcc/testsuite/ChangeLog:

	PR middle-end/88345
	* gcc.target/i386/pr88345-1.c: New.
	* gcc.target/i386/pr88345-2.c: Ditto.
	* gcc.target/riscv/pr88345-1.c: Ditto.
	* gcc.target/riscv/pr88345-2.c: Ditto.
---
 gcc/doc/invoke.texi                        | 2 +-
 gcc/testsuite/gcc.target/i386/pr88345-1.c  | 5 +++++
 gcc/testsuite/gcc.target/i386/pr88345-2.c  | 5 +++++
 gcc/testsuite/gcc.target/riscv/pr88345-1.c | 5 +++++
 gcc/testsuite/gcc.target/riscv/pr88345-2.c | 5 +++++
 gcc/varasm.cc                              | 3 +--
 6 files changed, 22 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-2.c

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index a2b0b9636f0..acf98c68825 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -11381,7 +11381,7 @@ results.  This is the default.
 Optimize for size.  @option{-Os} enables all @option{-O2} optimizations 
 except those that often increase code size:
 
-@gccoptlist{-falign-functions  -falign-jumps @gol
+@gccoptlist{-falign-jumps @gol
 -falign-labels  -falign-loops @gol
 -fprefetch-loop-arrays  -freorder-blocks-algorithm=stc}
 
diff --git a/gcc/testsuite/gcc.target/i386/pr88345-1.c b/gcc/testsuite/gcc.target/i386/pr88345-1.c
new file mode 100644
index 00000000000..226bb9ffc82
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr88345-1.c
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+/* { dg-options "-falign-functions=128" } */
+/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */
+
+__attribute__((__cold__)) void a() {}
diff --git a/gcc/testsuite/gcc.target/i386/pr88345-2.c b/gcc/testsuite/gcc.target/i386/pr88345-2.c
new file mode 100644
index 00000000000..a7fc3b162dd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr88345-2.c
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+/* { dg-options "-falign-functions=128 -Os" } */
+/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */
+
+void a() {}
diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-1.c b/gcc/testsuite/gcc.target/riscv/pr88345-1.c
new file mode 100644
index 00000000000..7d5afe683eb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr88345-1.c
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+/* { dg-options "-falign-functions=128" } */
+/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */
+
+__attribute__((__cold__)) void a() {}
diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-2.c b/gcc/testsuite/gcc.target/riscv/pr88345-2.c
new file mode 100644
index 00000000000..d4fc89d86d8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr88345-2.c
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+/* { dg-options "-falign-functions=128 -Os" } */
+/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */
+
+void a() {}
diff --git a/gcc/varasm.cc b/gcc/varasm.cc
index 423f3f91af8..4001648b214 100644
--- a/gcc/varasm.cc
+++ b/gcc/varasm.cc
@@ -1923,8 +1923,7 @@ assemble_start_function (tree decl, const char *fnname)
      Note that we still need to align to DECL_ALIGN, as above,
      because ASM_OUTPUT_MAX_SKIP_ALIGN might not do any alignment at all.  */
   if (! DECL_USER_ALIGN (decl)
-      && align_functions.levels[0].log > align
-      && optimize_function_for_speed_p (cfun))
+      && align_functions.levels[0].log > align)
     {
 #ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
       int align_log = align_functions.levels[0].log;
-- 
2.37.2


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] PR middle-end/88345: Honor -falign-functions=N even optimized for size.
  2022-10-07  4:03 [PATCH] PR middle-end/88345: Honor -falign-functions=N even optimized for size Kito Cheng
@ 2022-10-07  4:52 ` Palmer Dabbelt
  2022-10-07  6:32 ` Richard Biener
  1 sibling, 0 replies; 5+ messages in thread
From: Palmer Dabbelt @ 2022-10-07  4:52 UTC (permalink / raw)
  To: kito.cheng
  Cc: gcc-patches, Kito Cheng, msebor, jakub, rguenth, koen.zandberg,
	andy.chiu, guoren, greentime.hu, monk.chiang

On Thu, 06 Oct 2022 21:03:25 PDT (-0700), kito.cheng@sifive.com wrote:
> From: Monk Chiang <monk.chiang@sifive.com>
>
> Currnetly setting of -falign-functions=N will be ignored if the function
> is optimized for size or marked as cold function.
>
> However function alignment requirement is needed even optimized for
> size in some situations, RISC-V target is an example, RISC-V kernel implement
> patchable function that require function must be align to at least 4 bytes for
> atomicly patch the function prologue.
>
> Here is 4 way to fix that:
> 1. Fix -falign-functions=N, let it work as expect on -Os and all cold
> functions, which is this patch.
> 2. Force align to 4 byte if -fpatchable-function-entry is given by adjust
> RISC-V's FUNCTION_BOUNDARY.
> 3. Adjust RISC-V's FUNCTION_BOUNDARY to let it honor to -falign-functions=N.
> 4. Adding a -malign-functions=N for RISC-V...which x86 already deprecated that.
>
> And this patch is the first approach.
>
> gcc/ChangeLog:
>
> 	PR middle-end/88345
> 	* varasm.cc (assemble_start_function): Adjust function align
> 	even optimized for size.
> 	* doc/invoke.texi (Os): Remove -falign-functions= from the exclusion
> 	list of -Os.
>
> gcc/testsuite/ChangeLog:
>
> 	PR middle-end/88345
> 	* gcc.target/i386/pr88345-1.c: New.
> 	* gcc.target/i386/pr88345-2.c: Ditto.
> 	* gcc.target/riscv/pr88345-1.c: Ditto.
> 	* gcc.target/riscv/pr88345-2.c: Ditto.
> ---
>  gcc/doc/invoke.texi                        | 2 +-
>  gcc/testsuite/gcc.target/i386/pr88345-1.c  | 5 +++++
>  gcc/testsuite/gcc.target/i386/pr88345-2.c  | 5 +++++
>  gcc/testsuite/gcc.target/riscv/pr88345-1.c | 5 +++++
>  gcc/testsuite/gcc.target/riscv/pr88345-2.c | 5 +++++
>  gcc/varasm.cc                              | 3 +--
>  6 files changed, 22 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-2.c
>
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index a2b0b9636f0..acf98c68825 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -11381,7 +11381,7 @@ results.  This is the default.
>  Optimize for size.  @option{-Os} enables all @option{-O2} optimizations
>  except those that often increase code size:
>
> -@gccoptlist{-falign-functions  -falign-jumps @gol
> +@gccoptlist{-falign-jumps @gol
>  -falign-labels  -falign-loops @gol
>  -fprefetch-loop-arrays  -freorder-blocks-algorithm=stc}
>
> diff --git a/gcc/testsuite/gcc.target/i386/pr88345-1.c b/gcc/testsuite/gcc.target/i386/pr88345-1.c
> new file mode 100644
> index 00000000000..226bb9ffc82
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr88345-1.c
> @@ -0,0 +1,5 @@
> +/* { dg-do compile } */
> +/* { dg-options "-falign-functions=128" } */
> +/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */
> +
> +__attribute__((__cold__)) void a() {}
> diff --git a/gcc/testsuite/gcc.target/i386/pr88345-2.c b/gcc/testsuite/gcc.target/i386/pr88345-2.c
> new file mode 100644
> index 00000000000..a7fc3b162dd
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr88345-2.c
> @@ -0,0 +1,5 @@
> +/* { dg-do compile } */
> +/* { dg-options "-falign-functions=128 -Os" } */
> +/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */
> +
> +void a() {}
> diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-1.c b/gcc/testsuite/gcc.target/riscv/pr88345-1.c
> new file mode 100644
> index 00000000000..7d5afe683eb
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/pr88345-1.c
> @@ -0,0 +1,5 @@
> +/* { dg-do compile } */
> +/* { dg-options "-falign-functions=128" } */
> +/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */
> +
> +__attribute__((__cold__)) void a() {}
> diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-2.c b/gcc/testsuite/gcc.target/riscv/pr88345-2.c
> new file mode 100644
> index 00000000000..d4fc89d86d8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/pr88345-2.c
> @@ -0,0 +1,5 @@
> +/* { dg-do compile } */
> +/* { dg-options "-falign-functions=128 -Os" } */
> +/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */
> +
> +void a() {}
> diff --git a/gcc/varasm.cc b/gcc/varasm.cc
> index 423f3f91af8..4001648b214 100644
> --- a/gcc/varasm.cc
> +++ b/gcc/varasm.cc
> @@ -1923,8 +1923,7 @@ assemble_start_function (tree decl, const char *fnname)
>       Note that we still need to align to DECL_ALIGN, as above,
>       because ASM_OUTPUT_MAX_SKIP_ALIGN might not do any alignment at all.  */
>    if (! DECL_USER_ALIGN (decl)
> -      && align_functions.levels[0].log > align
> -      && optimize_function_for_speed_p (cfun))
> +      && align_functions.levels[0].log > align)
>      {
>  #ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
>        int align_log = align_functions.levels[0].log;

Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>

Though I'm not a global reviewer, so not sure how much that helps...

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] PR middle-end/88345: Honor -falign-functions=N even optimized for size.
  2022-10-07  4:03 [PATCH] PR middle-end/88345: Honor -falign-functions=N even optimized for size Kito Cheng
  2022-10-07  4:52 ` Palmer Dabbelt
@ 2022-10-07  6:32 ` Richard Biener
  2022-10-07 12:56   ` Jan Hubicka
  1 sibling, 1 reply; 5+ messages in thread
From: Richard Biener @ 2022-10-07  6:32 UTC (permalink / raw)
  To: Kito Cheng
  Cc: gcc-patches, kito.cheng, msebor, jakub, rguenth, koen.zandberg,
	andy.chiu, guoren, greentime.hu, palmer, Monk Chiang,
	Jan Hubicka

On Fri, Oct 7, 2022 at 6:04 AM Kito Cheng <kito.cheng@sifive.com> wrote:
>
> From: Monk Chiang <monk.chiang@sifive.com>
>
> Currnetly setting of -falign-functions=N will be ignored if the function
> is optimized for size or marked as cold function.
>
> However function alignment requirement is needed even optimized for
> size in some situations, RISC-V target is an example, RISC-V kernel implement
> patchable function that require function must be align to at least 4 bytes for
> atomicly patch the function prologue.
>
> Here is 4 way to fix that:
> 1. Fix -falign-functions=N, let it work as expect on -Os and all cold
> functions, which is this patch.
> 2. Force align to 4 byte if -fpatchable-function-entry is given by adjust
> RISC-V's FUNCTION_BOUNDARY.
> 3. Adjust RISC-V's FUNCTION_BOUNDARY to let it honor to -falign-functions=N.
> 4. Adding a -malign-functions=N for RISC-V...which x86 already deprecated that.
>
> And this patch is the first approach.

The behavior changed with r0-42853-g194734e9e5501f but documentation
wasn't changed to reflect that -falign-functions=N is now only a hint.

I'm not sure in what circumstances users are expected to use -falign-functions
and whether if it is for ABI reasons (as in this case) we should instead have
a -malign-functions or other magic.

Honza - any comment on your change?

Thanks,
Richard.

> gcc/ChangeLog:
>
>         PR middle-end/88345
>         * varasm.cc (assemble_start_function): Adjust function align
>         even optimized for size.
>         * doc/invoke.texi (Os): Remove -falign-functions= from the exclusion
>         list of -Os.
>
> gcc/testsuite/ChangeLog:
>
>         PR middle-end/88345
>         * gcc.target/i386/pr88345-1.c: New.
>         * gcc.target/i386/pr88345-2.c: Ditto.
>         * gcc.target/riscv/pr88345-1.c: Ditto.
>         * gcc.target/riscv/pr88345-2.c: Ditto.
> ---
>  gcc/doc/invoke.texi                        | 2 +-
>  gcc/testsuite/gcc.target/i386/pr88345-1.c  | 5 +++++
>  gcc/testsuite/gcc.target/i386/pr88345-2.c  | 5 +++++
>  gcc/testsuite/gcc.target/riscv/pr88345-1.c | 5 +++++
>  gcc/testsuite/gcc.target/riscv/pr88345-2.c | 5 +++++
>  gcc/varasm.cc                              | 3 +--
>  6 files changed, 22 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-2.c
>
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index a2b0b9636f0..acf98c68825 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -11381,7 +11381,7 @@ results.  This is the default.
>  Optimize for size.  @option{-Os} enables all @option{-O2} optimizations
>  except those that often increase code size:
>
> -@gccoptlist{-falign-functions  -falign-jumps @gol
> +@gccoptlist{-falign-jumps @gol
>  -falign-labels  -falign-loops @gol
>  -fprefetch-loop-arrays  -freorder-blocks-algorithm=stc}
>
> diff --git a/gcc/testsuite/gcc.target/i386/pr88345-1.c b/gcc/testsuite/gcc.target/i386/pr88345-1.c
> new file mode 100644
> index 00000000000..226bb9ffc82
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr88345-1.c
> @@ -0,0 +1,5 @@
> +/* { dg-do compile } */
> +/* { dg-options "-falign-functions=128" } */
> +/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */
> +
> +__attribute__((__cold__)) void a() {}
> diff --git a/gcc/testsuite/gcc.target/i386/pr88345-2.c b/gcc/testsuite/gcc.target/i386/pr88345-2.c
> new file mode 100644
> index 00000000000..a7fc3b162dd
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr88345-2.c
> @@ -0,0 +1,5 @@
> +/* { dg-do compile } */
> +/* { dg-options "-falign-functions=128 -Os" } */
> +/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */
> +
> +void a() {}
> diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-1.c b/gcc/testsuite/gcc.target/riscv/pr88345-1.c
> new file mode 100644
> index 00000000000..7d5afe683eb
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/pr88345-1.c
> @@ -0,0 +1,5 @@
> +/* { dg-do compile } */
> +/* { dg-options "-falign-functions=128" } */
> +/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */
> +
> +__attribute__((__cold__)) void a() {}
> diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-2.c b/gcc/testsuite/gcc.target/riscv/pr88345-2.c
> new file mode 100644
> index 00000000000..d4fc89d86d8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/pr88345-2.c
> @@ -0,0 +1,5 @@
> +/* { dg-do compile } */
> +/* { dg-options "-falign-functions=128 -Os" } */
> +/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */
> +
> +void a() {}
> diff --git a/gcc/varasm.cc b/gcc/varasm.cc
> index 423f3f91af8..4001648b214 100644
> --- a/gcc/varasm.cc
> +++ b/gcc/varasm.cc
> @@ -1923,8 +1923,7 @@ assemble_start_function (tree decl, const char *fnname)
>       Note that we still need to align to DECL_ALIGN, as above,
>       because ASM_OUTPUT_MAX_SKIP_ALIGN might not do any alignment at all.  */
>    if (! DECL_USER_ALIGN (decl)
> -      && align_functions.levels[0].log > align
> -      && optimize_function_for_speed_p (cfun))
> +      && align_functions.levels[0].log > align)
>      {
>  #ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
>        int align_log = align_functions.levels[0].log;
> --
> 2.37.2
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] PR middle-end/88345: Honor -falign-functions=N even optimized for size.
  2022-10-07  6:32 ` Richard Biener
@ 2022-10-07 12:56   ` Jan Hubicka
  2022-10-07 13:50     ` Palmer Dabbelt
  0 siblings, 1 reply; 5+ messages in thread
From: Jan Hubicka @ 2022-10-07 12:56 UTC (permalink / raw)
  To: Richard Biener
  Cc: Kito Cheng, gcc-patches, kito.cheng, msebor, jakub, rguenth,
	koen.zandberg, andy.chiu, guoren, greentime.hu, palmer,
	Monk Chiang

> On Fri, Oct 7, 2022 at 6:04 AM Kito Cheng <kito.cheng@sifive.com> wrote:
> >
> > From: Monk Chiang <monk.chiang@sifive.com>
> >
> > Currnetly setting of -falign-functions=N will be ignored if the function
> > is optimized for size or marked as cold function.
> >
> > However function alignment requirement is needed even optimized for
> > size in some situations, RISC-V target is an example, RISC-V kernel implement
> > patchable function that require function must be align to at least 4 bytes for
> > atomicly patch the function prologue.
> >
> > Here is 4 way to fix that:
> > 1. Fix -falign-functions=N, let it work as expect on -Os and all cold
> > functions, which is this patch.
> > 2. Force align to 4 byte if -fpatchable-function-entry is given by adjust
> > RISC-V's FUNCTION_BOUNDARY.
> > 3. Adjust RISC-V's FUNCTION_BOUNDARY to let it honor to -falign-functions=N.
> > 4. Adding a -malign-functions=N for RISC-V...which x86 already deprecated that.
> >
> > And this patch is the first approach.
> 
> The behavior changed with r0-42853-g194734e9e5501f but documentation
> wasn't changed to reflect that -falign-functions=N is now only a hint.
> 
> I'm not sure in what circumstances users are expected to use -falign-functions
> and whether if it is for ABI reasons (as in this case) we should instead have
> a -malign-functions or other magic.
> 
> Honza - any comment on your change?

This was done a long time ago and -falign-functions was/is about
performance.

The basic idea of the patch was to use minimal required alignment for
-Os and cold functions while use -falign-functions for functions
optimized for speed.  This is not different what we do for other
optimization options.

i386 targets define quite large alignments (especially for older ones
that needed function entry to be separated by given number of
instructions from cache page boundary) so enabling it for all functions
unconditionally is expensive (in code size).

We have FUNCTION_BOUNDARY to specify minimal function alignment required
by the ABI.  So I think if live patches requires bigger alignment, I
would go with adjusting FUNCTION_BOUNDARY with -flive-patching.
Alternatively we could introduce -malign-all-function or
-falign-all-functions to adjust minimal alignment if this is specific to
a particular implementation of live patching in the kernel.

Honza
> 
> Thanks,
> Richard.
> 
> > gcc/ChangeLog:
> >
> >         PR middle-end/88345
> >         * varasm.cc (assemble_start_function): Adjust function align
> >         even optimized for size.
> >         * doc/invoke.texi (Os): Remove -falign-functions= from the exclusion
> >         list of -Os.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         PR middle-end/88345
> >         * gcc.target/i386/pr88345-1.c: New.
> >         * gcc.target/i386/pr88345-2.c: Ditto.
> >         * gcc.target/riscv/pr88345-1.c: Ditto.
> >         * gcc.target/riscv/pr88345-2.c: Ditto.
> > ---
> >  gcc/doc/invoke.texi                        | 2 +-
> >  gcc/testsuite/gcc.target/i386/pr88345-1.c  | 5 +++++
> >  gcc/testsuite/gcc.target/i386/pr88345-2.c  | 5 +++++
> >  gcc/testsuite/gcc.target/riscv/pr88345-1.c | 5 +++++
> >  gcc/testsuite/gcc.target/riscv/pr88345-2.c | 5 +++++
> >  gcc/varasm.cc                              | 3 +--
> >  6 files changed, 22 insertions(+), 3 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-2.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-2.c
> >
> > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > index a2b0b9636f0..acf98c68825 100644
> > --- a/gcc/doc/invoke.texi
> > +++ b/gcc/doc/invoke.texi
> > @@ -11381,7 +11381,7 @@ results.  This is the default.
> >  Optimize for size.  @option{-Os} enables all @option{-O2} optimizations
> >  except those that often increase code size:
> >
> > -@gccoptlist{-falign-functions  -falign-jumps @gol
> > +@gccoptlist{-falign-jumps @gol
> >  -falign-labels  -falign-loops @gol
> >  -fprefetch-loop-arrays  -freorder-blocks-algorithm=stc}
> >
> > diff --git a/gcc/testsuite/gcc.target/i386/pr88345-1.c b/gcc/testsuite/gcc.target/i386/pr88345-1.c
> > new file mode 100644
> > index 00000000000..226bb9ffc82
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/pr88345-1.c
> > @@ -0,0 +1,5 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-falign-functions=128" } */
> > +/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */
> > +
> > +__attribute__((__cold__)) void a() {}
> > diff --git a/gcc/testsuite/gcc.target/i386/pr88345-2.c b/gcc/testsuite/gcc.target/i386/pr88345-2.c
> > new file mode 100644
> > index 00000000000..a7fc3b162dd
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/pr88345-2.c
> > @@ -0,0 +1,5 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-falign-functions=128 -Os" } */
> > +/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */
> > +
> > +void a() {}
> > diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-1.c b/gcc/testsuite/gcc.target/riscv/pr88345-1.c
> > new file mode 100644
> > index 00000000000..7d5afe683eb
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/pr88345-1.c
> > @@ -0,0 +1,5 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-falign-functions=128" } */
> > +/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */
> > +
> > +__attribute__((__cold__)) void a() {}
> > diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-2.c b/gcc/testsuite/gcc.target/riscv/pr88345-2.c
> > new file mode 100644
> > index 00000000000..d4fc89d86d8
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/pr88345-2.c
> > @@ -0,0 +1,5 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-falign-functions=128 -Os" } */
> > +/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */
> > +
> > +void a() {}
> > diff --git a/gcc/varasm.cc b/gcc/varasm.cc
> > index 423f3f91af8..4001648b214 100644
> > --- a/gcc/varasm.cc
> > +++ b/gcc/varasm.cc
> > @@ -1923,8 +1923,7 @@ assemble_start_function (tree decl, const char *fnname)
> >       Note that we still need to align to DECL_ALIGN, as above,
> >       because ASM_OUTPUT_MAX_SKIP_ALIGN might not do any alignment at all.  */
> >    if (! DECL_USER_ALIGN (decl)
> > -      && align_functions.levels[0].log > align
> > -      && optimize_function_for_speed_p (cfun))
> > +      && align_functions.levels[0].log > align)
> >      {
> >  #ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
> >        int align_log = align_functions.levels[0].log;
> > --
> > 2.37.2
> >

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] PR middle-end/88345: Honor -falign-functions=N even optimized for size.
  2022-10-07 12:56   ` Jan Hubicka
@ 2022-10-07 13:50     ` Palmer Dabbelt
  0 siblings, 0 replies; 5+ messages in thread
From: Palmer Dabbelt @ 2022-10-07 13:50 UTC (permalink / raw)
  To: hubicka
  Cc: richard.guenther, kito.cheng, gcc-patches, Kito Cheng, msebor,
	jakub, rguenth, koen.zandberg, andy.chiu, guoren, greentime.hu,
	monk.chiang

On Fri, 07 Oct 2022 05:56:39 PDT (-0700), hubicka@ucw.cz wrote:
>> On Fri, Oct 7, 2022 at 6:04 AM Kito Cheng <kito.cheng@sifive.com> wrote:
>> >
>> > From: Monk Chiang <monk.chiang@sifive.com>
>> >
>> > Currnetly setting of -falign-functions=N will be ignored if the function
>> > is optimized for size or marked as cold function.
>> >
>> > However function alignment requirement is needed even optimized for
>> > size in some situations, RISC-V target is an example, RISC-V kernel implement
>> > patchable function that require function must be align to at least 4 bytes for
>> > atomicly patch the function prologue.
>> >
>> > Here is 4 way to fix that:
>> > 1. Fix -falign-functions=N, let it work as expect on -Os and all cold
>> > functions, which is this patch.
>> > 2. Force align to 4 byte if -fpatchable-function-entry is given by adjust
>> > RISC-V's FUNCTION_BOUNDARY.
>> > 3. Adjust RISC-V's FUNCTION_BOUNDARY to let it honor to -falign-functions=N.
>> > 4. Adding a -malign-functions=N for RISC-V...which x86 already deprecated that.
>> >
>> > And this patch is the first approach.
>>
>> The behavior changed with r0-42853-g194734e9e5501f but documentation
>> wasn't changed to reflect that -falign-functions=N is now only a hint.
>>
>> I'm not sure in what circumstances users are expected to use -falign-functions
>> and whether if it is for ABI reasons (as in this case) we should instead have
>> a -malign-functions or other magic.
>>
>> Honza - any comment on your change?
>
> This was done a long time ago and -falign-functions was/is about
> performance.
>
> The basic idea of the patch was to use minimal required alignment for
> -Os and cold functions while use -falign-functions for functions
> optimized for speed.  This is not different what we do for other
> optimization options.

I'd also noticed last night that -falign-functions doesn't override 
__attribute__((align(N))), but got too tired to write up the patch.  I 
wasn't sure it was the desired behavior at the time, but sounds like it 
is?  I just send a doc patch to mention that.

> i386 targets define quite large alignments (especially for older ones
> that needed function entry to be separated by given number of
> instructions from cache page boundary) so enabling it for all functions
> unconditionally is expensive (in code size).
>
> We have FUNCTION_BOUNDARY to specify minimal function alignment required
> by the ABI.  So I think if live patches requires bigger alignment, I
> would go with adjusting FUNCTION_BOUNDARY with -flive-patching.
> Alternatively we could introduce -malign-all-function or
> -falign-all-functions to adjust minimal alignment if this is specific to
> a particular implementation of live patching in the kernel.
>
> Honza
>>
>> Thanks,
>> Richard.
>>
>> > gcc/ChangeLog:
>> >
>> >         PR middle-end/88345
>> >         * varasm.cc (assemble_start_function): Adjust function align
>> >         even optimized for size.
>> >         * doc/invoke.texi (Os): Remove -falign-functions= from the exclusion
>> >         list of -Os.
>> >
>> > gcc/testsuite/ChangeLog:
>> >
>> >         PR middle-end/88345
>> >         * gcc.target/i386/pr88345-1.c: New.
>> >         * gcc.target/i386/pr88345-2.c: Ditto.
>> >         * gcc.target/riscv/pr88345-1.c: Ditto.
>> >         * gcc.target/riscv/pr88345-2.c: Ditto.
>> > ---
>> >  gcc/doc/invoke.texi                        | 2 +-
>> >  gcc/testsuite/gcc.target/i386/pr88345-1.c  | 5 +++++
>> >  gcc/testsuite/gcc.target/i386/pr88345-2.c  | 5 +++++
>> >  gcc/testsuite/gcc.target/riscv/pr88345-1.c | 5 +++++
>> >  gcc/testsuite/gcc.target/riscv/pr88345-2.c | 5 +++++
>> >  gcc/varasm.cc                              | 3 +--
>> >  6 files changed, 22 insertions(+), 3 deletions(-)
>> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-1.c
>> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-2.c
>> >  create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-1.c
>> >  create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-2.c
>> >
>> > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>> > index a2b0b9636f0..acf98c68825 100644
>> > --- a/gcc/doc/invoke.texi
>> > +++ b/gcc/doc/invoke.texi
>> > @@ -11381,7 +11381,7 @@ results.  This is the default.
>> >  Optimize for size.  @option{-Os} enables all @option{-O2} optimizations
>> >  except those that often increase code size:
>> >
>> > -@gccoptlist{-falign-functions  -falign-jumps @gol
>> > +@gccoptlist{-falign-jumps @gol
>> >  -falign-labels  -falign-loops @gol
>> >  -fprefetch-loop-arrays  -freorder-blocks-algorithm=stc}
>> >
>> > diff --git a/gcc/testsuite/gcc.target/i386/pr88345-1.c b/gcc/testsuite/gcc.target/i386/pr88345-1.c
>> > new file mode 100644
>> > index 00000000000..226bb9ffc82
>> > --- /dev/null
>> > +++ b/gcc/testsuite/gcc.target/i386/pr88345-1.c
>> > @@ -0,0 +1,5 @@
>> > +/* { dg-do compile } */
>> > +/* { dg-options "-falign-functions=128" } */
>> > +/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */
>> > +
>> > +__attribute__((__cold__)) void a() {}
>> > diff --git a/gcc/testsuite/gcc.target/i386/pr88345-2.c b/gcc/testsuite/gcc.target/i386/pr88345-2.c
>> > new file mode 100644
>> > index 00000000000..a7fc3b162dd
>> > --- /dev/null
>> > +++ b/gcc/testsuite/gcc.target/i386/pr88345-2.c
>> > @@ -0,0 +1,5 @@
>> > +/* { dg-do compile } */
>> > +/* { dg-options "-falign-functions=128 -Os" } */
>> > +/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */
>> > +
>> > +void a() {}
>> > diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-1.c b/gcc/testsuite/gcc.target/riscv/pr88345-1.c
>> > new file mode 100644
>> > index 00000000000..7d5afe683eb
>> > --- /dev/null
>> > +++ b/gcc/testsuite/gcc.target/riscv/pr88345-1.c
>> > @@ -0,0 +1,5 @@
>> > +/* { dg-do compile } */
>> > +/* { dg-options "-falign-functions=128" } */
>> > +/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */
>> > +
>> > +__attribute__((__cold__)) void a() {}
>> > diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-2.c b/gcc/testsuite/gcc.target/riscv/pr88345-2.c
>> > new file mode 100644
>> > index 00000000000..d4fc89d86d8
>> > --- /dev/null
>> > +++ b/gcc/testsuite/gcc.target/riscv/pr88345-2.c
>> > @@ -0,0 +1,5 @@
>> > +/* { dg-do compile } */
>> > +/* { dg-options "-falign-functions=128 -Os" } */
>> > +/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */
>> > +
>> > +void a() {}
>> > diff --git a/gcc/varasm.cc b/gcc/varasm.cc
>> > index 423f3f91af8..4001648b214 100644
>> > --- a/gcc/varasm.cc
>> > +++ b/gcc/varasm.cc
>> > @@ -1923,8 +1923,7 @@ assemble_start_function (tree decl, const char *fnname)
>> >       Note that we still need to align to DECL_ALIGN, as above,
>> >       because ASM_OUTPUT_MAX_SKIP_ALIGN might not do any alignment at all.  */
>> >    if (! DECL_USER_ALIGN (decl)
>> > -      && align_functions.levels[0].log > align
>> > -      && optimize_function_for_speed_p (cfun))
>> > +      && align_functions.levels[0].log > align)
>> >      {
>> >  #ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
>> >        int align_log = align_functions.levels[0].log;
>> > --
>> > 2.37.2
>> >

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-10-07 13:50 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-07  4:03 [PATCH] PR middle-end/88345: Honor -falign-functions=N even optimized for size Kito Cheng
2022-10-07  4:52 ` Palmer Dabbelt
2022-10-07  6:32 ` Richard Biener
2022-10-07 12:56   ` Jan Hubicka
2022-10-07 13:50     ` Palmer Dabbelt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).