public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: "Kewen.Lin" <linkw@linux.ibm.com>
To: GCC Patches <gcc-patches@gcc.gnu.org>
Cc: David Edelsohn <dje.gcc@gmail.com>,
	Segher Boessenkool <segher@kernel.crashing.org>,
	AlanM <amodra@gmail.com>
Subject: PING^1 [PATCH v4] rs6000: Rework ELFv2 support for -fpatchable-function-entry* [PR99888]
Date: Wed, 28 Sep 2022 13:37:48 +0800	[thread overview]
Message-ID: <adb660dd-1a3d-61aa-cb56-3ca3de8df19d@linux.ibm.com> (raw)
In-Reply-To: <c2778c95-439c-ac11-8335-f791ce92be25@linux.ibm.com>

Hi,

Gentle ping: https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600277.html

BR,
Kewen

on 2022/8/25 13:50, Kewen.Lin via Gcc-patches wrote:
> Hi,
> 
> As PR99888 and its related show, the current support for
> -fpatchable-function-entry on powerpc ELFv2 doesn't work
> well with global entry existence.  For example, with one
> command line option -fpatchable-function-entry=3,2, it got
> below w/o this patch:
> 
>   .LPFE1:
> 	  nop
> 	  nop
> 	  .type   foo, @function
>   foo:
> 	  nop
>   .LFB0:
> 	  .cfi_startproc
>   .LCF0:
>   0:      addis 2,12,.TOC.-.LCF0@ha
> 	  addi 2,2,.TOC.-.LCF0@l
> 	  .localentry     foo,.-foo
> 
> , the assembly is unexpected since the patched nops have
> no effects when being entered from local entry.
> 
> This patch is to update the nops patched before and after
> local entry, it looks like:
> 
> 	  .type   foo, @function
>   foo:
>   .LFB0:
> 	  .cfi_startproc
>   .LCF0:
>   0:      addis 2,12,.TOC.-.LCF0@ha
> 	  addi 2,2,.TOC.-.LCF0@l
> 	  nop
> 	  nop
> 	  .localentry     foo,.-foo
> 	  nop
> 
> Bootstrapped and regtested on powerpc64-linux-gnu P7 & P8,
> and powerpc64le-linux-gnu P9 & P10.
> 
> v4: Change the remaining NOP to nop and update documentation of option
>     -fpatchable-function-entry for PowerPC ELFv2 ABI dual entry points
>     as Segher suggested.
> 
> v3: https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599925.html
> 
> v2: https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599617.html
> 
> v1: https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599461.html
> 
> Is it ok for trunk?
> 
> BR,
> Kewen
> ----
> 	PR target/99888
> 	PR target/105649
> 
> gcc/ChangeLog:
> 
> 	* doc/invoke.texi (option -fpatchable-function-entry): Adjust the
> 	documentation for PowerPC ELFv2 ABI dual entry points.
> 	* config/rs6000/rs6000-internal.h
> 	(rs6000_print_patchable_function_entry): New function declaration.
> 	* config/rs6000/rs6000-logue.cc (rs6000_output_function_prologue):
> 	Support patchable-function-entry by emitting nops before and after
> 	local entry for the function that needs global entry.
> 	* config/rs6000/rs6000.cc (rs6000_print_patchable_function_entry): Skip
> 	the function that needs global entry till global entry has been
> 	emitted.
> 	* config/rs6000/rs6000.h (struct machine_function): New bool member
> 	global_entry_emitted.
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gcc.target/powerpc/pr99888-1.c: New test.
> 	* gcc.target/powerpc/pr99888-2.c: New test.
> 	* gcc.target/powerpc/pr99888-3.c: New test.
> 	* gcc.target/powerpc/pr99888-4.c: New test.
> 	* gcc.target/powerpc/pr99888-5.c: New test.
> 	* gcc.target/powerpc/pr99888-6.c: New test.
> 	* c-c++-common/patchable_function_entry-default.c: Adjust for
> 	powerpc_elfv2 to avoid compilation error.
> ---
>  gcc/config/rs6000/rs6000-internal.h           |  5 +++
>  gcc/config/rs6000/rs6000-logue.cc             | 32 ++++++++++++++
>  gcc/config/rs6000/rs6000.cc                   | 10 ++++-
>  gcc/config/rs6000/rs6000.h                    |  4 ++
>  gcc/doc/invoke.texi                           |  8 +++-
>  .../patchable_function_entry-default.c        |  3 ++
>  gcc/testsuite/gcc.target/powerpc/pr99888-1.c  | 43 +++++++++++++++++++
>  gcc/testsuite/gcc.target/powerpc/pr99888-2.c  | 43 +++++++++++++++++++
>  gcc/testsuite/gcc.target/powerpc/pr99888-3.c  | 11 +++++
>  gcc/testsuite/gcc.target/powerpc/pr99888-4.c  | 13 ++++++
>  gcc/testsuite/gcc.target/powerpc/pr99888-5.c  | 13 ++++++
>  gcc/testsuite/gcc.target/powerpc/pr99888-6.c  | 14 ++++++
>  12 files changed, 195 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr99888-1.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr99888-2.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr99888-3.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr99888-4.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr99888-5.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr99888-6.c
> 
> diff --git a/gcc/config/rs6000/rs6000-internal.h b/gcc/config/rs6000/rs6000-internal.h
> index 8ee8c987b81..d80c04b5ae5 100644
> --- a/gcc/config/rs6000/rs6000-internal.h
> +++ b/gcc/config/rs6000/rs6000-internal.h
> @@ -183,10 +183,15 @@ extern tree rs6000_fold_builtin (tree fndecl ATTRIBUTE_UNUSED,
>  			         tree *args ATTRIBUTE_UNUSED,
>  			         bool ignore ATTRIBUTE_UNUSED);
> 
> +extern void rs6000_print_patchable_function_entry (FILE *,
> +						   unsigned HOST_WIDE_INT,
> +						   bool);
> +
>  extern bool rs6000_passes_float;
>  extern bool rs6000_passes_long_double;
>  extern bool rs6000_passes_vector;
>  extern bool rs6000_returns_struct;
>  extern bool cpu_builtin_p;
> 
> +
>  #endif
> diff --git a/gcc/config/rs6000/rs6000-logue.cc b/gcc/config/rs6000/rs6000-logue.cc
> index 59fe1c8cb8b..51f55d1d527 100644
> --- a/gcc/config/rs6000/rs6000-logue.cc
> +++ b/gcc/config/rs6000/rs6000-logue.cc
> @@ -4013,11 +4013,43 @@ rs6000_output_function_prologue (FILE *file)
>  	  fprintf (file, "\tadd 2,2,12\n");
>  	}
> 
> +      unsigned short patch_area_size = crtl->patch_area_size;
> +      unsigned short patch_area_entry = crtl->patch_area_entry;
> +      /* Need to emit the patching area.  */
> +      if (patch_area_size > 0)
> +	{
> +	  cfun->machine->global_entry_emitted = true;
> +	  /* As ELFv2 ABI shows, the allowable bytes between the global
> +	     and local entry points are 0, 4, 8, 16, 32 and 64 when
> +	     there is a local entry point.  Considering there are two
> +	     non-prefixed instructions for global entry point prologue
> +	     (8 bytes), the count for patchable nops before local entry
> +	     point would be 2, 6 and 14.  It's possible to support those
> +	     other counts of nops by not making a local entry point, but
> +	     we don't have clear user cases for them, so leave them
> +	     unsupported for now.  */
> +	  if (patch_area_entry > 0)
> +	    {
> +	      if (patch_area_entry != 2
> +		  && patch_area_entry != 6
> +		  && patch_area_entry != 14)
> +		error ("unsupported number of nops before function entry (%u)",
> +		       patch_area_entry);
> +	      rs6000_print_patchable_function_entry (file, patch_area_entry,
> +						     true);
> +	      patch_area_size -= patch_area_entry;
> +	    }
> +	}
> +
>        fputs ("\t.localentry\t", file);
>        assemble_name (file, name);
>        fputs (",.-", file);
>        assemble_name (file, name);
>        fputs ("\n", file);
> +      /* Emit the nops after local entry.  */
> +      if (patch_area_size > 0)
> +	rs6000_print_patchable_function_entry (file, patch_area_size,
> +					       patch_area_entry == 0);
>      }
> 
>    else if (rs6000_pcrel_p ())
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index 777a06599c3..c79a0ee7a49 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -14898,8 +14898,14 @@ rs6000_print_patchable_function_entry (FILE *file,
>    if (!(TARGET_64BIT && DEFAULT_ABI != ABI_ELFv2)
>        && HAVE_GAS_SECTION_LINK_ORDER)
>      flags |= SECTION_LINK_ORDER;
> -  default_print_patchable_function_entry_1 (file, patch_area_size, record_p,
> -					    flags);
> +  bool global_entry_needed_p = rs6000_global_entry_point_prologue_needed_p ();
> +  /* For a function which needs global entry point, we will emit the
> +     patchable area before and after local entry point under the control of
> +     cfun->machine->global_entry_emitted, see the handling in function
> +     rs6000_output_function_prologue.  */
> +  if (!global_entry_needed_p || cfun->machine->global_entry_emitted)
> +    default_print_patchable_function_entry_1 (file, patch_area_size, record_p,
> +					      flags);
>  }
> 
> 
>  enum rtx_code
> diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
> index ad9bf0f7358..c352421e87c 100644
> --- a/gcc/config/rs6000/rs6000.h
> +++ b/gcc/config/rs6000/rs6000.h
> @@ -2439,6 +2439,10 @@ typedef struct GTY(()) machine_function
>    bool lr_is_wrapped_separately;
>    bool toc_is_wrapped_separately;
>    bool mma_return_type_error;
> +  /* Indicate global entry is emitted, only useful when the function requires
> +     global entry.  It helps to control the patchable area before and after
> +     local entry.  */
> +  bool global_entry_emitted;
>  } machine_function;
>  #endif
> 
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index b7c75028789..c98f5ceb0b9 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -16717,9 +16717,13 @@ the area size or to remove it completely on a single function.
>  If @code{N=0}, no pad location is recorded.
> 
>  The NOP instructions are inserted at---and maybe before, depending on
> -@var{M}---the function entry address, even before the prologue.
> +@var{M}---the function entry address, even before the prologue.  On
> +PowerPC with the ELFv2 ABI, for one function with dual entry points,
> +the local entry point is taken as the function entry for generation.
> 
> -The maximum value of @var{N} and @var{M} is 65535.
> +The maximum value of @var{N} and @var{M} is 65535.  On PowerPC with the
> +ELFv2 ABI, for one function with dual entry points, the supported values
> +for @var{M} are 0, 2, 6 and 14.
>  @end table
> 
> 
> diff --git a/gcc/testsuite/c-c++-common/patchable_function_entry-default.c b/gcc/testsuite/c-c++-common/patchable_function_entry-default.c
> index 7036f7bfbea..a501efccb19 100644
> --- a/gcc/testsuite/c-c++-common/patchable_function_entry-default.c
> +++ b/gcc/testsuite/c-c++-common/patchable_function_entry-default.c
> @@ -1,6 +1,9 @@
>  /* { dg-do compile { target { ! { nvptx*-*-* visium-*-* } } } } */
>  /* { dg-options "-O2 -fpatchable-function-entry=3,1" } */
>  /* { dg-additional-options "-fno-pie" { target sparc*-*-* } } */
> +/* See PR99888, one single preceding nop isn't allowed on powerpc_elfv2,
> +   so overriding with two preceding nops to make it pass there.  */
> +/* { dg-additional-options "-fpatchable-function-entry=3,2" { target powerpc_elfv2 } } */
>  /* { dg-final { scan-assembler-times "nop|NOP|SWYM" 3 { target { ! { alpha*-*-* } } } } } */
>  /* { dg-final { scan-assembler-times "bis" 3 { target alpha*-*-* } } } */
> 
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr99888-1.c b/gcc/testsuite/gcc.target/powerpc/pr99888-1.c
> new file mode 100644
> index 00000000000..9370b4e7438
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr99888-1.c
> @@ -0,0 +1,43 @@
> +/* Verify no errors for different nops after local entry on ELFv2.  */
> +
> +extern int a;
> +
> +__attribute__ ((noipa, patchable_function_entry (1, 0)))
> +int test1 (int b) {
> +  return a + b;
> +}
> +
> +__attribute__ ((noipa, patchable_function_entry (2, 0)))
> +int test2 (int b) {
> +  return a + b;
> +}
> +
> +__attribute__ ((noipa, patchable_function_entry (3, 0)))
> +int test3 (int b) {
> +  return a + b;
> +}
> +
> +__attribute__ ((noipa, patchable_function_entry (4, 0)))
> +int test4 (int b) {
> +  return a + b;
> +}
> +
> +__attribute__ ((noipa, patchable_function_entry (5, 0)))
> +int test5 (int b) {
> +  return a + b;
> +}
> +
> +__attribute__ ((noipa, patchable_function_entry (6, 0)))
> +int test6 (int b) {
> +  return a + b;
> +}
> +
> +__attribute__ ((noipa, patchable_function_entry (7, 0)))
> +int test7 (int b) {
> +  return a + b;
> +}
> +
> +__attribute__ ((noipa, patchable_function_entry (8, 0)))
> +int test8 (int b) {
> +  return a + b;
> +}
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr99888-2.c b/gcc/testsuite/gcc.target/powerpc/pr99888-2.c
> new file mode 100644
> index 00000000000..45061712602
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr99888-2.c
> @@ -0,0 +1,43 @@
> +/* Verify no errors for 2, 6 and 14 nops before local entry on ELFv2.  */
> +
> +extern int a;
> +
> +__attribute__ ((noipa, patchable_function_entry (2, 2)))
> +int test1 (int b) {
> +  return a + b;
> +}
> +
> +__attribute__ ((noipa, patchable_function_entry (4, 2)))
> +int test2 (int b) {
> +  return a + b;
> +}
> +
> +__attribute__ ((noipa, patchable_function_entry (6, 6)))
> +int test3 (int b) {
> +  return a + b;
> +}
> +
> +__attribute__ ((noipa, patchable_function_entry (8, 6)))
> +int test4 (int b) {
> +  return a + b;
> +}
> +
> +__attribute__ ((noipa, patchable_function_entry (16, 6)))
> +int test5 (int b) {
> +  return a + b;
> +}
> +
> +__attribute__ ((noipa, patchable_function_entry (14, 14)))
> +int test6 (int b) {
> +  return a + b;
> +}
> +
> +__attribute__ ((noipa, patchable_function_entry (28, 14)))
> +int test7 (int b) {
> +  return a + b;
> +}
> +
> +__attribute__ ((noipa, patchable_function_entry (64, 14)))
> +int test8 (int b) {
> +  return a + b;
> +}
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr99888-3.c b/gcc/testsuite/gcc.target/powerpc/pr99888-3.c
> new file mode 100644
> index 00000000000..4531ae32036
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr99888-3.c
> @@ -0,0 +1,11 @@
> +/* { dg-options "-fpatchable-function-entry=1" } */
> +
> +/* Verify no errors on ELFv2, using command line option instead of
> +   function attribute.  */
> +
> +extern int a;
> +
> +int test (int b) {
> +  return a + b;
> +}
> +
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr99888-4.c b/gcc/testsuite/gcc.target/powerpc/pr99888-4.c
> new file mode 100644
> index 00000000000..00a8d4d316e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr99888-4.c
> @@ -0,0 +1,13 @@
> +/* { dg-require-effective-target powerpc_elfv2 } */
> +/* There is no global entry point prologue with pcrel.  */
> +/* { dg-options "-mno-pcrel -fpatchable-function-entry=1,1" } */
> +
> +/* Verify one error emitted for unexpected 1 nop before local
> +   entry.  */
> +
> +extern int a;
> +
> +int test (int b) {
> +  return a + b;
> +}
> +/* { dg-error "unsupported number of nops before function entry \\(1\\)" "" { target *-*-* } .-1 } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr99888-5.c b/gcc/testsuite/gcc.target/powerpc/pr99888-5.c
> new file mode 100644
> index 00000000000..39d3b4465f1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr99888-5.c
> @@ -0,0 +1,13 @@
> +/* { dg-require-effective-target powerpc_elfv2 } */
> +/* There is no global entry point prologue with pcrel.  */
> +/* { dg-options "-mno-pcrel -fpatchable-function-entry=7,3" } */
> +
> +/* Verify one error emitted for unexpected 3 nops before local
> +   entry.  */
> +
> +extern int a;
> +
> +int test (int b) {
> +  return a + b;
> +}
> +/* { dg-error "unsupported number of nops before function entry \\(3\\)" "" { target *-*-* } .-1 } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr99888-6.c b/gcc/testsuite/gcc.target/powerpc/pr99888-6.c
> new file mode 100644
> index 00000000000..c6c18dcc7ac
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr99888-6.c
> @@ -0,0 +1,14 @@
> +/* { dg-require-effective-target powerpc_elfv2 } */
> +/* There is no global entry point prologue with pcrel.  */
> +/* { dg-options "-mno-pcrel" } */
> +
> +/* Verify one error emitted for unexpected 4 nops before local
> +   entry.  */
> +
> +extern int a;
> +
> +__attribute__ ((patchable_function_entry (20, 4)))
> +int test (int b) {
> +  return a + b;
> +}
> +/* { dg-error "unsupported number of nops before function entry \\(4\\)" "" { target *-*-* } .-1 } */
> --
> 2.27.0
> 
> 
> 

  reply	other threads:[~2022-09-28  5:37 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-25  5:50 Kewen.Lin
2022-09-28  5:37 ` Kewen.Lin [this message]
2022-09-28 15:22 ` Segher Boessenkool
2022-09-30 12:38   ` Kewen.Lin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=adb660dd-1a3d-61aa-cb56-3ca3de8df19d@linux.ibm.com \
    --to=linkw@linux.ibm.com \
    --cc=amodra@gmail.com \
    --cc=dje.gcc@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=segher@kernel.crashing.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).