public inbox for gcc-patches@gcc.gnu.org
* [PATCH] core: Support heap-based trampolines
@ 2023-07-16 10:38 FX Coudert
  2023-07-17  6:31 ` Richard Biener
From: FX Coudert @ 2023-07-16 10:38 UTC
  To: gcc-patches; +Cc: Iain Sandoe, maxim.blinov, ebotcazou, Jeff Law

[-- Attachment #1: Type: text/plain, Size: 1542 bytes --]

Hi,

This is a reworked version (following review) of the patch by Maxim Blinov and Iain Sandoe enabling heap-based trampolines. What has changed since the last version:

- wording changes, preferring to use “heap-based trampolines” rather than “off-stack trampolines”
- the option triggering generation of these new trampolines is now a binary choice: -ftrampoline-impl=[stack|heap]
- some adjustments due to changes in the macOS build flags in GCC since last year
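
For readers less familiar with the feature, here is a minimal sketch (not part of the patch) of the kind of code affected. It assumes GNU C nested functions; for_each and sum_below are names made up for this illustration. Taking the address of the nested function add is what makes GCC build a trampoline, and -ftrampoline-impl=[stack|heap] only selects where that trampoline lives:

```c
/* Illustration only, not part of the patch.  Requires GCC, since nested
   functions are a GNU C extension.  */

/* Hypothetical helper: calls FN on each integer in [0, n).  */
static void
for_each (int n, void (*fn) (int))
{
  for (int i = 0; i < n; i++)
    fn (i);
}

/* Returns 0 + 1 + ... + (n - 1).  */
int
sum_below (int n)
{
  int total = 0;

  /* Nested function: it updates `total' in the enclosing frame through
     the static chain, so a plain code address is not enough to call it.  */
  void add (int i) { total += i; }

  /* Passing `add' as an ordinary function pointer is what requires a
     trampoline.  */
  for_each (n, add);
  return total;
}
```

The result is the same with either implementation (sum_below (5) is 10). With a compiler carrying this patch, building with -ftrampoline-impl=heap should only work if libgcc was configured with --enable-heap-trampolines, otherwise the link fails on the missing symbols.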

Regarding testing, this patch has had excellent exposure on Darwin (both x86_64 and aarch64) because it was part of Iain’s branch, which has been distributed by many macOS distros/vendors (including Homebrew) for more than a year, with no bug reports against the feature or its implementation. On x86_64-linux, I have regression-tested it in three different configurations:
- a default build
- a build with --enable-heap-trampolines
- a build with --enable-heap-trampolines and heap trampolines forced by default (forcing HEAP_TRAMPOLINES_INIT = 1)

There are no regressions in any of these settings (apart from an expected missing warning in gcc.dg/Wtrampolines.c).

----------

From the original review, one question (asked by Jeff Law) was whether the two Linux implementations should be dropped, along with the configure-time selectability. Regardless of the answer to the first question, I think we probably want to retain the latter, even if only for Darwin, as we want to implement this only on recent Darwin versions.


OK to commit?

FX



[-- Attachment #2: 0001-core-Support-heap-based-trampolines.patch --]
[-- Type: application/octet-stream, Size: 37849 bytes --]

From d52627ab9aad754d872874401ec8de623ca775f1 Mon Sep 17 00:00:00 2001
From: Maxim Blinov <maxim.blinov@embecosm.com>
Date: Sat, 13 Nov 2021 04:39:53 +0000
Subject: [PATCH] core: Support heap-based trampolines

1. Generate heap-based nested function trampolines

Add support for allocating nested function trampolines on an
executable heap rather than on the stack. This is motivated by targets
such as AArch64 Darwin, which globally prohibit executing code on the
stack.

The target-specific routines for allocating and writing trampolines are
to be provided in libgcc, and are by default _not_ compiled in unless
the target specifically requires it, or you manually provide
--enable-heap-trampolines when configuring gcc/libgcc.

The gcc flag -ftrampoline-impl controls whether to generate code
that instantiates trampolines on the stack, or to emit calls to
__builtin_nested_func_ptr_created and
__builtin_nested_func_ptr_deleted. Note that this flag is completely
independent of libgcc: If libgcc is for any reason missing those
symbols, you will get a link failure.

This implementation imposes some implicit restrictions as compared to
stack trampolines. longjmp'ing back to a state before a trampoline was
created will cause us to skip over the corresponding
__builtin_nested_func_ptr_deleted, which will leak trampolines
starting from the beginning of the linked list of allocated
trampolines. There may be scope for instrumenting longjmp/setjmp to
trigger cleanups of trampolines.
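
The restriction above can be sketched with a small model. This is a simplification under stated assumptions: model_created and model_deleted are hypothetical stand-ins for the libgcc entry points, and they only count slots instead of allocating executable memory.

```c
#include <setjmp.h>

/* Hypothetical stand-ins for __builtin_nested_func_ptr_{created,deleted}:
   they only track how many trampoline slots are in use.  */
static int live_slots;
static void model_created (void) { live_slots++; }
static void model_deleted (void) { live_slots--; }

static jmp_buf env;

/* Models a function that contains a nested function: the slot is taken
   on entry and released only on the normal exit path.  */
static void
frame_with_trampoline (int bail_out)
{
  model_created ();
  if (bail_out)
    longjmp (env, 1);	/* Skips the model_deleted () call below.  */
  model_deleted ();
}

/* Returns the number of slots still marked in use afterwards.  */
int
leaked_after (int bail_out)
{
  live_slots = 0;
  if (setjmp (env) == 0)
    frame_with_trampoline (bail_out);
  return live_slots;
}
```

In this model leaked_after (0) is 0 and leaked_after (1) is 1: the longjmp path never reaches the release call, so the slot stays allocated. The model does not reproduce the page list or the order in which the real implementation reuses slots.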

2. Add x86_64-linux support for heap-based trampolines

Implement the __builtin_nested_func_ptr_{created,deleted} functions
for the x86_64-linux platform. This serves to exercise the
infrastructure added in libgcc (--enable-heap-trampolines) and
gcc (-ftrampoline-impl=heap) in supporting heap-based trampoline
generation, and is intended primarily for demonstration and debugging
purposes.

3. Add aarch64-linux support for heap-based trampolines

Implement the __builtin_nested_func_ptr_{created,deleted} functions
for the aarch64-linux platform. This serves to exercise the
infrastructure added in libgcc (--enable-heap-trampolines) and
gcc (-ftrampoline-impl=heap) in supporting heap-based trampoline
generation, and is intended primarily for demonstration and debugging
purposes.

4. Darwin, aarch64, x86_64: Support heap trampolines.

Implement the __builtin_nested_func_ptr_{created,deleted} functions for
x86_64 and aarch64 Darwin.

For aarch64 --enable-heap-trampolines is enabled by default, and
-ftrampoline-impl=heap is enabled by default if we are on host macOS
version 11.x or greater.

For x86_64 this is configure-time opt-in (and can be applied from 10.10
onwards)

Co-Authored-By: Andrew Burgess <andrew.burgess@embecosm.com>
Co-Authored-By: Iain Sandoe <iain@sandoe.co.uk>

gcc/ChangeLog:

	* builtins.def (BUILT_IN_NESTED_PTR_CREATED): Define.
	(BUILT_IN_NESTED_PTR_DELETED): Ditto.
	* common.opt (ftrampoline-impl): Add option to control
	generation of trampoline instantiation (heap or stack).
	* config.gcc: Default to heap trampolines on macOS 11 and above.
	* config.in: Regenerate.
	* config/i386/darwin.h: Define X86_CUSTOM_FUNCTION_TEST.
	* config/i386/i386.h: Define X86_CUSTOM_FUNCTION_TEST.
	* config/i386/i386.cc: Use X86_CUSTOM_FUNCTION_TEST.
	* coretypes.h: Define enum trampoline_impl.
	* tree-nested.cc (convert_tramp_reference_op): Don't bother calling
	__builtin_adjust_trampoline for heap trampolines.
	(finalize_nesting_tree_1): Emit calls to
	__builtin_nested_...{created,deleted} if we're generating with
	-ftrampoline-impl=heap.
	* tree.cc (build_common_builtin_nodes): Build
	__builtin_nested_...{created,deleted}.
	* doc/invoke.texi (-ftrampoline-impl): Document.

libgcc/ChangeLog:

	* configure.ac: Add configure parameter
	--enable-heap-trampolines, and do error checking if we are
	trying to enable heap-based trampolines for a platform that doesn't
	provide any such implementation.
	* libgcc-std.ver.in: Ditto.
	* libgcc2.h (__builtin_nested_func_ptr_created): Declare.
	(__builtin_nested_func_ptr_deleted): Ditto.
	* config/aarch64/heap-trampoline.c: New file: Implement heap-based
	trampolines for aarch64.
	* config/aarch64/t-heap-trampoline: Add rule to build
	config/aarch64/heap-trampoline.c
	* config/i386/heap-trampoline.c: New file: Implement heap-based
	trampolines for x86_64.
	* config/i386/t-heap-trampoline: Add rule to build
	config/i386/heap-trampoline.c
	* config.host: Handle --enable-heap-trampolines on
	x86_64-*-linux*, aarch64-*-linux*, aarch64*-*darwin*.
	* configure: Regenerate.
---
 gcc/builtins.def                        |   2 +
 gcc/common.opt                          |  17 ++-
 gcc/config.gcc                          |  11 ++
 gcc/config.in                           |   3 +-
 gcc/config/i386/darwin.h                |   6 +
 gcc/config/i386/i386.cc                 |   2 +-
 gcc/config/i386/i386.h                  |   6 +
 gcc/coretypes.h                         |   6 +
 gcc/doc/invoke.texi                     |  17 ++-
 gcc/tree-nested.cc                      | 121 ++++++++++++++---
 gcc/tree.cc                             |  17 +++
 libgcc/config.host                      |   9 ++
 libgcc/config/aarch64/heap-trampoline.c | 172 ++++++++++++++++++++++++
 libgcc/config/aarch64/t-heap-trampoline |  19 +++
 libgcc/config/i386/heap-trampoline.c    | 172 ++++++++++++++++++++++++
 libgcc/config/i386/t-heap-trampoline    |  19 +++
 libgcc/configure                        |  38 ++++++
 libgcc/configure.ac                     |  29 ++++
 libgcc/libgcc-std.ver.in                |   3 +
 libgcc/libgcc2.h                        |   3 +
 20 files changed, 651 insertions(+), 21 deletions(-)
 create mode 100644 libgcc/config/aarch64/heap-trampoline.c
 create mode 100644 libgcc/config/aarch64/t-heap-trampoline
 create mode 100644 libgcc/config/i386/heap-trampoline.c
 create mode 100644 libgcc/config/i386/t-heap-trampoline

diff --git a/gcc/builtins.def b/gcc/builtins.def
index 76e7200e772..918389d863d 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -1073,6 +1073,8 @@ DEF_BUILTIN_STUB (BUILT_IN_ADJUST_TRAMPOLINE, "__builtin_adjust_trampoline")
 DEF_BUILTIN_STUB (BUILT_IN_INIT_DESCRIPTOR, "__builtin_init_descriptor")
 DEF_BUILTIN_STUB (BUILT_IN_ADJUST_DESCRIPTOR, "__builtin_adjust_descriptor")
 DEF_BUILTIN_STUB (BUILT_IN_NONLOCAL_GOTO, "__builtin_nonlocal_goto")
+DEF_BUILTIN_STUB (BUILT_IN_NESTED_PTR_CREATED, "__builtin_nested_func_ptr_created")
+DEF_BUILTIN_STUB (BUILT_IN_NESTED_PTR_DELETED, "__builtin_nested_func_ptr_deleted")
 
 /* Implementing __builtin_setjmp.  */
 DEF_BUILTIN_STUB (BUILT_IN_SETJMP_SETUP, "__builtin_setjmp_setup")
diff --git a/gcc/common.opt b/gcc/common.opt
index 25f650e2dae..4511930fe58 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2884,10 +2884,25 @@ Common Var(flag_tracer) Optimization
 Perform superblock formation via tail duplication.
 
 ftrampolines
-Common Var(flag_trampolines) Init(0)
+Common Var(flag_trampolines) Init(HEAP_TRAMPOLINES_INIT)
 For targets that normally need trampolines for nested functions, always
 generate them instead of using descriptors.
 
+ftrampoline-impl=
+Common Joined RejectNegative Enum(trampoline_impl) Var(flag_trampoline_impl) Init(HEAP_TRAMPOLINES_INIT ? TRAMPOLINE_IMPL_HEAP : TRAMPOLINE_IMPL_STACK)
+Whether trampolines are generated in executable memory rather than
+executable stack.
+
+Enum
+Name(trampoline_impl) Type(enum trampoline_impl) UnknownError(unknown trampoline implementation %qs)
+
+EnumValue
+Enum(trampoline_impl) String(stack) Value(TRAMPOLINE_IMPL_STACK)
+
+EnumValue
+Enum(trampoline_impl) String(heap) Value(TRAMPOLINE_IMPL_HEAP)
+
+
 ; Zero means that floating-point math operations cannot generate a
 ; (user-visible) trap.  This is the case, for example, in nonstop
 ; IEEE 754 arithmetic.
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 1446eb2b3ca..a94d86f85e7 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1125,6 +1125,17 @@ case ${target} in
   ;;
 esac
 
+# Figure out if we need to default to -ftrampoline-impl=heap
+case ${target} in
+*-*-darwin2*)
+  # Currently, we do this for macOS 11 and above.
+  tm_defines="$tm_defines HEAP_TRAMPOLINES_INIT=1"
+  ;;
+*)
+  tm_defines="$tm_defines HEAP_TRAMPOLINES_INIT=0"
+  ;;
+esac
+
 case ${target} in
 aarch64*-*-elf | aarch64*-*-fuchsia* | aarch64*-*-rtems*)
 	tm_file="${tm_file} elfos.h newlib-stdint.h"
diff --git a/gcc/config.in b/gcc/config.in
index 0e62b9fbfc9..4cad077bfbe 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -2239,7 +2239,8 @@
 #endif
 
 
-/* Define to the sub-directory where libtool stores uninstalled libraries. */
+/* Define to the sub-directory in which libtool stores uninstalled libraries.
+   */
 #ifndef USED_FOR_TARGET
 #undef LT_OBJDIR
 #endif
diff --git a/gcc/config/i386/darwin.h b/gcc/config/i386/darwin.h
index 588bd669bdd..036eefbbb95 100644
--- a/gcc/config/i386/darwin.h
+++ b/gcc/config/i386/darwin.h
@@ -308,3 +308,9 @@ along with GCC; see the file COPYING3.  If not see
 #define CLEAR_INSN_CACHE(beg, end)				\
   extern void sys_icache_invalidate(void *start, size_t len);	\
   sys_icache_invalidate ((beg), (size_t)((end)-(beg)))
+
+/* Disable custom function descriptors for Darwin when we have off-stack
+   trampolines.  */
+#undef X86_CUSTOM_FUNCTION_TEST
+#define X86_CUSTOM_FUNCTION_TEST \
+  (flag_trampolines && flag_trampoline_impl == TRAMPOLINE_IMPL_HEAP) ? 0 : 1
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index f0d6167e667..ec80c71200c 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -25565,7 +25565,7 @@ ix86_libgcc_floating_mode_supported_p
 #define TARGET_HARD_REGNO_SCRATCH_OK ix86_hard_regno_scratch_ok
 
 #undef TARGET_CUSTOM_FUNCTION_DESCRIPTORS
-#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 1
+#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS X86_CUSTOM_FUNCTION_TEST
 
 #undef TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID
 #define TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID ix86_addr_space_zero_address_valid
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index aea3209d5a3..19b535edf05 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -755,6 +755,12 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
 /* Minimum allocation boundary for the code of a function.  */
 #define FUNCTION_BOUNDARY 8
 
+/* We AND with this value to test if a custom function descriptor needs
+   a static chain.  The function boundary must be adjusted so that the bit
+   this represents is no longer part of the address.  0 disables the custom
+   function descriptors.  */
+#define X86_CUSTOM_FUNCTION_TEST 1
+
 /* C++ stores the virtual bit in the lowest bit of function pointers.  */
 #define TARGET_PTRMEMFUNC_VBIT_LOCATION ptrmemfunc_vbit_in_pfn
 
diff --git a/gcc/coretypes.h b/gcc/coretypes.h
index ca8837cef67..7e022a427c4 100644
--- a/gcc/coretypes.h
+++ b/gcc/coretypes.h
@@ -199,6 +199,12 @@ enum tls_model {
   TLS_MODEL_LOCAL_EXEC
 };
 
+/* Types of trampoline implementation.  */
+enum trampoline_impl {
+  TRAMPOLINE_IMPL_STACK,
+  TRAMPOLINE_IMPL_HEAP
+};
+
 /* Types of ABI for an offload compiler.  */
 enum offload_abi {
   OFFLOAD_ABI_UNSET,
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index cbc1282c274..6cb3b24221b 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -710,7 +710,8 @@ Objective-C and Objective-C++ Dialects}.
 -fverbose-asm  -fpack-struct[=@var{n}]
 -fleading-underscore  -ftls-model=@var{model}
 -fstack-reuse=@var{reuse_level}
--ftrampolines  -ftrapv  -fwrapv
+-ftrampolines -ftrampoline-impl=@r{[}stack@r{|}heap@r{]}
+-ftrapv  -fwrapv
 -fvisibility=@r{[}default@r{|}internal@r{|}hidden@r{|}protected@r{]}
 -fstrict-volatile-bitfields  -fsync-libcalls}
 
@@ -18801,6 +18802,20 @@ For languages other than Ada, the @code{-ftrampolines} and
 trampolines are always generated on platforms that need them
 for nested functions.
 
+@opindex ftrampoline-impl
+@item -ftrampoline-impl=@r{[}stack@r{|}heap@r{]}
+By default, trampolines are generated on the stack.  However, certain platforms
+(such as the Apple M1) do not permit an executable stack.  Compiling with
+@option{-ftrampoline-impl=heap} generates calls to @code{__builtin_nested_func_ptr_created}
+and @code{__builtin_nested_func_ptr_deleted} in order to allocate and
+deallocate trampoline space on the executable heap. Please note that
+these functions are implemented in libgcc, and will not be compiled in
+unless you provide @option{--enable-heap-trampolines} when
+building gcc.  @emph{PLEASE NOTE}: Heap trampolines are @emph{not}
+guaranteed to be correctly deallocated if you @code{setjmp},
+instantiate nested functions, and then @code{longjmp} back to a state
+prior to having allocated those nested functions.
+
 @opindex fvisibility
 @item -fvisibility=@r{[}default@r{|}internal@r{|}hidden@r{|}protected@r{]}
 Set the default ELF image symbol visibility to the specified option---all
diff --git a/gcc/tree-nested.cc b/gcc/tree-nested.cc
index ae7d1f1f6a8..84ee9962485 100644
--- a/gcc/tree-nested.cc
+++ b/gcc/tree-nested.cc
@@ -611,6 +611,14 @@ get_trampoline_type (struct nesting_info *info)
   if (trampoline_type)
     return trampoline_type;
 
+  /* When trampolines are created off-stack then the only thing we need in the
+     local frame is a single pointer.  */
+  if (flag_trampoline_impl == TRAMPOLINE_IMPL_HEAP)
+    {
+      trampoline_type = build_pointer_type (void_type_node);
+      return trampoline_type;
+    }
+
   align = TRAMPOLINE_ALIGNMENT;
   size = TRAMPOLINE_SIZE;
 
@@ -2788,17 +2796,27 @@ convert_tramp_reference_op (tree *tp, int *walk_subtrees, void *data)
 
       /* Compute the address of the field holding the trampoline.  */
       x = get_frame_field (info, target_context, x, &wi->gsi);
-      x = build_addr (x);
-      x = gsi_gimplify_val (info, x, &wi->gsi);
 
-      /* Do machine-specific ugliness.  Normally this will involve
-	 computing extra alignment, but it can really be anything.  */
-      if (descr)
-	builtin = builtin_decl_implicit (BUILT_IN_ADJUST_DESCRIPTOR);
+      /* APB: We don't need to do the adjustment calls when using off-stack
+	 trampolines, any such adjustment will be done when the off-stack
+	 trampoline is created.  */
+      if (!descr && flag_trampoline_impl == TRAMPOLINE_IMPL_HEAP)
+	x = gsi_gimplify_val (info, x, &wi->gsi);
       else
-	builtin = builtin_decl_implicit (BUILT_IN_ADJUST_TRAMPOLINE);
-      call = gimple_build_call (builtin, 1, x);
-      x = init_tmp_var_with_call (info, &wi->gsi, call);
+	{
+	  x = build_addr (x);
+
+	  x = gsi_gimplify_val (info, x, &wi->gsi);
+
+	  /* Do machine-specific ugliness.  Normally this will involve
+	     computing extra alignment, but it can really be anything.  */
+	  if (descr)
+	    builtin = builtin_decl_implicit (BUILT_IN_ADJUST_DESCRIPTOR);
+	  else
+	    builtin = builtin_decl_implicit (BUILT_IN_ADJUST_TRAMPOLINE);
+	  call = gimple_build_call (builtin, 1, x);
+	  x = init_tmp_var_with_call (info, &wi->gsi, call);
+	}
 
       /* Cast back to the proper function type.  */
       x = build1 (NOP_EXPR, TREE_TYPE (t), x);
@@ -3377,6 +3395,7 @@ build_init_call_stmt (struct nesting_info *info, tree decl, tree field,
 static void
 finalize_nesting_tree_1 (struct nesting_info *root)
 {
+  gimple_seq cleanup_list = NULL;
   gimple_seq stmt_list = NULL;
   gimple *stmt;
   tree context = root->context;
@@ -3508,9 +3527,48 @@ finalize_nesting_tree_1 (struct nesting_info *root)
 	  if (!field)
 	    continue;
 
-	  x = builtin_decl_implicit (BUILT_IN_INIT_TRAMPOLINE);
-	  stmt = build_init_call_stmt (root, i->context, field, x);
-	  gimple_seq_add_stmt (&stmt_list, stmt);
+	  if (flag_trampoline_impl == TRAMPOLINE_IMPL_HEAP)
+	    {
+	      /* We pass a whole bunch of arguments to the builtin function that
+		 creates the off-stack trampoline, these are
+		 1. The nested function chain value (that must be passed to the
+		 nested function so it can find the function arguments).
+		 2. A pointer to the nested function implementation,
+		 3. The address in the local stack frame where we should write
+		 the address of the trampoline.
+
+		 When this code was originally written I just kind of threw
+		 everything at the builtin, figuring I'd work out what was
+		 actually needed later, I think, the stack pointer could
+		 certainly be dropped, arguments #2 and #4 are based off the
+		 stack pointer anyway, so #1 doesn't seem to add much value.  */
+	      tree arg1, arg2, arg3;
+
+	      gcc_assert (DECL_STATIC_CHAIN (i->context));
+	      arg1 = build_addr (root->frame_decl);
+	      arg2 = build_addr (i->context);
+
+	      x = build3 (COMPONENT_REF, TREE_TYPE (field),
+			  root->frame_decl, field, NULL_TREE);
+	      arg3 = build_addr (x);
+
+	      x = builtin_decl_implicit (BUILT_IN_NESTED_PTR_CREATED);
+	      stmt = gimple_build_call (x, 3, arg1, arg2, arg3);
+	      gimple_seq_add_stmt (&stmt_list, stmt);
+
+	      /* This call to delete the nested function trampoline is added to
+		 the cleanup list, and called when we exit the current scope.  */
+	      x = builtin_decl_implicit (BUILT_IN_NESTED_PTR_DELETED);
+	      stmt = gimple_build_call (x, 0);
+	      gimple_seq_add_stmt (&cleanup_list, stmt);
+	    }
+	  else
+	    {
+	      /* Original code to initialise the on stack trampoline.  */
+	      x = builtin_decl_implicit (BUILT_IN_INIT_TRAMPOLINE);
+	      stmt = build_init_call_stmt (root, i->context, field, x);
+	      gimple_seq_add_stmt (&stmt_list, stmt);
+	    }
 	}
     }
 
@@ -3535,11 +3593,40 @@ finalize_nesting_tree_1 (struct nesting_info *root)
   /* If we created initialization statements, insert them.  */
   if (stmt_list)
     {
-      gbind *bind;
-      annotate_all_with_location (stmt_list, DECL_SOURCE_LOCATION (context));
-      bind = gimple_seq_first_stmt_as_a_bind (gimple_body (context));
-      gimple_seq_add_seq (&stmt_list, gimple_bind_body (bind));
-      gimple_bind_set_body (bind, stmt_list);
+      if (flag_trampoline_impl == TRAMPOLINE_IMPL_HEAP)
+	{
+	  /* Handle off-stack trampolines.  */
+	  gbind *bind;
+	  annotate_all_with_location (stmt_list, DECL_SOURCE_LOCATION (context));
+	  annotate_all_with_location (cleanup_list, DECL_SOURCE_LOCATION (context));
+	  bind = gimple_seq_first_stmt_as_a_bind (gimple_body (context));
+	  gimple_seq_add_seq (&stmt_list, gimple_bind_body (bind));
+
+	  gimple_seq xxx_list = NULL;
+
+	  if (cleanup_list != NULL)
+	    {
+	      /* Maybe we shouldn't be creating this try/finally if -fno-exceptions is
+		 in use.  If this is the case, then maybe we should, instead, be
+		 inserting the cleanup code onto every path out of this function?  Not
+		 yet figured out how we would do this.  */
+	      gtry *t = gimple_build_try (stmt_list, cleanup_list, GIMPLE_TRY_FINALLY);
+	      gimple_seq_add_stmt (&xxx_list, t);
+	    }
+	  else
+	    xxx_list = stmt_list;
+
+	  gimple_bind_set_body (bind, xxx_list);
+	}
+      else
+	{
+	  /* The traditional, on stack trampolines.  */
+	  gbind *bind;
+	  annotate_all_with_location (stmt_list, DECL_SOURCE_LOCATION (context));
+	  bind = gimple_seq_first_stmt_as_a_bind (gimple_body (context));
+	  gimple_seq_add_seq (&stmt_list, gimple_bind_body (bind));
+	  gimple_bind_set_body (bind, stmt_list);
+	}
     }
 
   /* If a chain_decl was created, then it needs to be registered with
diff --git a/gcc/tree.cc b/gcc/tree.cc
index 420857b110c..3e7beba8744 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -9870,6 +9870,23 @@ build_common_builtin_nodes (void)
 			"__builtin_nonlocal_goto",
 			ECF_NORETURN | ECF_NOTHROW);
 
+  tree ptr_ptr_type_node = build_pointer_type (ptr_type_node);
+
+  ftype = build_function_type_list (void_type_node,
+				    ptr_type_node, // void *chain
+				    ptr_type_node, // void *func
+				    ptr_ptr_type_node, // void **dst
+				    NULL_TREE);
+  local_define_builtin ("__builtin_nested_func_ptr_created", ftype,
+			BUILT_IN_NESTED_PTR_CREATED,
+			"__builtin_nested_func_ptr_created", ECF_NOTHROW);
+
+  ftype = build_function_type_list (void_type_node,
+				    NULL_TREE);
+  local_define_builtin ("__builtin_nested_func_ptr_deleted", ftype,
+			BUILT_IN_NESTED_PTR_DELETED,
+			"__builtin_nested_func_ptr_deleted", ECF_NOTHROW);
+
   ftype = build_function_type_list (void_type_node,
 				    ptr_type_node, ptr_type_node, NULL_TREE);
   local_define_builtin ("__builtin_setjmp_setup", ftype,
diff --git a/libgcc/config.host b/libgcc/config.host
index 9d7212028d0..e3e311b75a4 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -423,6 +423,9 @@ aarch64*-*-linux*)
 	tmake_file="${tmake_file} ${cpu_type}/t-lse t-slibgcc-libgcc"
 	tmake_file="${tmake_file} ${cpu_type}/t-softfp t-softfp t-crtfm"
 	tmake_file="${tmake_file} t-dfprules"
+	if test x$heap_trampolines = xyes; then
+	    tmake_file="${tmake_file} ${cpu_type}/t-heap-trampoline"
+	fi
 	;;
 aarch64*-*-vxworks7*)
 	extra_parts="$extra_parts crtfastmath.o"
@@ -697,6 +700,9 @@ x86_64-*-darwin*)
 	tmake_file="$tmake_file i386/t-crtpc t-crtfm i386/t-msabi"
 	tm_file="$tm_file i386/darwin-lib.h"
 	extra_parts="$extra_parts crtprec32.o crtprec64.o crtprec80.o crtfastmath.o"
+	if test x$heap_trampolines = xyes; then
+	    tmake_file="${tmake_file} i386/t-heap-trampoline"
+	fi
 	;;
 i[34567]86-*-elfiamcu)
 	tmake_file="$tmake_file i386/t-crtstuff t-softfp-sfdftf i386/32/t-softfp i386/32/t-iamcu i386/t-softfp t-softfp t-dfprules"
@@ -763,6 +769,9 @@ x86_64-*-linux*)
 	tmake_file="${tmake_file} i386/t-crtpc t-crtfm i386/t-crtstuff t-dfprules"
 	tm_file="${tm_file} i386/elf-lib.h"
 	md_unwind_header=i386/linux-unwind.h
+	if test x$heap_trampolines = xyes; then
+	    tmake_file="${tmake_file} i386/t-heap-trampoline"
+	fi
 	;;
 x86_64-*-kfreebsd*-gnu)
 	extra_parts="$extra_parts crtprec32.o crtprec64.o crtprec80.o crtfastmath.o"
diff --git a/libgcc/config/aarch64/heap-trampoline.c b/libgcc/config/aarch64/heap-trampoline.c
new file mode 100644
index 00000000000..c8b83681ed7
--- /dev/null
+++ b/libgcc/config/aarch64/heap-trampoline.c
@@ -0,0 +1,172 @@
+/* Copyright The GNU Toolchain Authors. */
+
+#include <unistd.h>
+#include <sys/mman.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+
+#if __APPLE__
+/* For pthread_jit_write_protect_np */
+#include <pthread.h>
+#endif
+
+void *allocate_trampoline_page (void);
+int get_trampolines_per_page (void);
+struct tramp_ctrl_data *allocate_tramp_ctrl (struct tramp_ctrl_data *parent);
+void *allocate_trampoline_page (void);
+
+void __builtin_nested_func_ptr_created (void *chain, void *func, void **dst);
+void __builtin_nested_func_ptr_deleted (void);
+
+#if defined(__gnu_linux__)
+static const uint32_t aarch64_trampoline_insns[] = {
+  0xd503245f, /* hint    34 */
+  0x580000b1, /* ldr     x17, .+20 */
+  0x580000d2, /* ldr     x18, .+24 */
+  0xd61f0220, /* br      x17 */
+  0xd5033f9f, /* dsb     sy */
+  0xd5033fdf /* isb */
+};
+
+#elif __APPLE__
+static const uint32_t aarch64_trampoline_insns[] = {
+  0xd503245f, /* hint    34 */
+  0x580000b1, /* ldr     x17, .+20 */
+  0x580000d0, /* ldr     x16, .+24 */
+  0xd61f0220, /* br      x17 */
+  0xd5033f9f, /* dsb     sy */
+  0xd5033fdf /* isb */
+};
+
+#else
+#error "Unsupported AArch64 platform for heap trampolines"
+#endif
+
+struct aarch64_trampoline {
+  uint32_t insns[6];
+  void *func_ptr;
+  void *chain_ptr;
+};
+
+struct tramp_ctrl_data
+{
+  struct tramp_ctrl_data *prev;
+
+  int free_trampolines;
+
+  /* This will be pointing to an executable mmap'ed page.  */
+  struct aarch64_trampoline *trampolines;
+};
+
+int
+get_trampolines_per_page (void)
+{
+  return getpagesize() / sizeof(struct aarch64_trampoline);
+}
+
+static _Thread_local struct tramp_ctrl_data *tramp_ctrl_curr = NULL;
+
+void *
+allocate_trampoline_page (void)
+{
+  void *page;
+
+#if defined(__gnu_linux__)
+  page = mmap (0, getpagesize (), PROT_WRITE | PROT_EXEC,
+	       MAP_ANON | MAP_PRIVATE, 0, 0);
+#elif __APPLE__
+  page = mmap (0, getpagesize (), PROT_WRITE | PROT_EXEC,
+	       MAP_ANON | MAP_PRIVATE | MAP_JIT, 0, 0);
+#else
+  page = MAP_FAILED;
+#endif
+
+  return page;
+}
+
+struct tramp_ctrl_data *
+allocate_tramp_ctrl (struct tramp_ctrl_data *parent)
+{
+  struct tramp_ctrl_data *p = malloc (sizeof (struct tramp_ctrl_data));
+  if (p == NULL)
+    return NULL;
+
+  p->trampolines = allocate_trampoline_page ();
+
+  if (p->trampolines == MAP_FAILED)
+    return NULL;
+
+  p->prev = parent;
+  p->free_trampolines = get_trampolines_per_page();
+
+  return p;
+}
+
+void
+__builtin_nested_func_ptr_created (void *chain, void *func, void **dst)
+{
+  if (tramp_ctrl_curr == NULL)
+    {
+      tramp_ctrl_curr = allocate_tramp_ctrl (NULL);
+      if (tramp_ctrl_curr == NULL)
+	abort ();
+    }
+
+  if (tramp_ctrl_curr->free_trampolines == 0)
+    {
+      void *tramp_ctrl = allocate_tramp_ctrl (tramp_ctrl_curr);
+      if (!tramp_ctrl)
+	abort ();
+
+      tramp_ctrl_curr = tramp_ctrl;
+    }
+
+  struct aarch64_trampoline *trampoline
+    = &tramp_ctrl_curr->trampolines[get_trampolines_per_page ()
+				    - tramp_ctrl_curr->free_trampolines];
+
+#if __APPLE__
+  /* Disable write protection for the MAP_JIT regions in this thread (see
+     https://developer.apple.com/documentation/apple-silicon/porting-just-in-time-compilers-to-apple-silicon) */
+  pthread_jit_write_protect_np (0);
+#endif
+
+  memcpy (trampoline->insns, aarch64_trampoline_insns,
+	  sizeof(aarch64_trampoline_insns));
+  trampoline->func_ptr = func;
+  trampoline->chain_ptr = chain;
+
+#if __APPLE__
+  /* Re-enable write protection.  */
+  pthread_jit_write_protect_np (1);
+#endif
+
+  tramp_ctrl_curr->free_trampolines -= 1;
+
+  __builtin___clear_cache ((void *)trampoline->insns,
+			   ((void *)trampoline->insns + sizeof(trampoline->insns)));
+
+  *dst = &trampoline->insns;
+}
+
+void
+__builtin_nested_func_ptr_deleted (void)
+{
+  if (tramp_ctrl_curr == NULL)
+    abort ();
+
+  tramp_ctrl_curr->free_trampolines += 1;
+
+  if (tramp_ctrl_curr->free_trampolines == get_trampolines_per_page ())
+    {
+      if (tramp_ctrl_curr->prev == NULL)
+	return;
+
+      munmap (tramp_ctrl_curr->trampolines, getpagesize());
+      struct tramp_ctrl_data *prev = tramp_ctrl_curr->prev;
+      free (tramp_ctrl_curr);
+      tramp_ctrl_curr = prev;
+    }
+}
diff --git a/libgcc/config/aarch64/t-heap-trampoline b/libgcc/config/aarch64/t-heap-trampoline
new file mode 100644
index 00000000000..b22480800b2
--- /dev/null
+++ b/libgcc/config/aarch64/t-heap-trampoline
@@ -0,0 +1,19 @@
+# Copyright The GNU Toolchain Authors.
+
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+LIB2ADD += $(srcdir)/config/aarch64/heap-trampoline.c
diff --git a/libgcc/config/i386/heap-trampoline.c b/libgcc/config/i386/heap-trampoline.c
new file mode 100644
index 00000000000..96e13bf828e
--- /dev/null
+++ b/libgcc/config/i386/heap-trampoline.c
@@ -0,0 +1,172 @@
+/* Copyright The GNU Toolchain Authors. */
+
+#include <unistd.h>
+#include <sys/mman.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+
+#if __APPLE__ && __ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__ >= 101400
+/* For pthread_jit_write_protect_np */
+#include <pthread.h>
+#endif
+
+void *allocate_trampoline_page (void);
+int get_trampolines_per_page (void);
+struct tramp_ctrl_data *allocate_tramp_ctrl (struct tramp_ctrl_data *parent);
+void *allocate_trampoline_page (void);
+
+void __builtin_nested_func_ptr_created (void *chain, void *func, void **dst);
+void __builtin_nested_func_ptr_deleted (void);
+
+static const uint8_t trampoline_insns[] = {
+  /* movabs $<func>,%r11  */
+  0x49, 0xbb,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+
+  /* movabs $<chain>,%r10  */
+  0x49, 0xba,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+
+  /* jmpq *%r11  */
+  0x41, 0xff, 0xe3
+};
+
+union ix86_trampoline {
+  uint8_t insns[sizeof(trampoline_insns)];
+
+  struct __attribute__((packed)) fields {
+    uint8_t insn_0[2];
+    void *func_ptr;
+    uint8_t insn_1[2];
+    void *chain_ptr;
+    uint8_t insn_2[3];
+  } fields;
+};
+
+struct tramp_ctrl_data
+{
+  struct tramp_ctrl_data *prev;
+
+  int free_trampolines;
+
+  /* This will be pointing to an executable mmap'ed page.  */
+  union ix86_trampoline *trampolines;
+};
+
+int
+get_trampolines_per_page (void)
+{
+  return getpagesize() / sizeof(union ix86_trampoline);
+}
+
+static _Thread_local struct tramp_ctrl_data *tramp_ctrl_curr = NULL;
+
+void *
+allocate_trampoline_page (void)
+{
+  void *page;
+
+#if defined(__gnu_linux__)
+  page = mmap (0, getpagesize (), PROT_WRITE | PROT_EXEC,
+	       MAP_ANON | MAP_PRIVATE, 0, 0);
+#elif __APPLE__
+# if  __ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__ >= 101400
+  page = mmap (0, getpagesize (), PROT_WRITE | PROT_EXEC,
+	       MAP_ANON | MAP_PRIVATE | MAP_JIT, 0, 0);
+# else
+  page = mmap (0, getpagesize (), PROT_WRITE | PROT_EXEC,
+	       MAP_ANON | MAP_PRIVATE, 0, 0);
+# endif
+#else
+  page = MAP_FAILED;
+#endif
+
+  return page;
+}
+
+struct tramp_ctrl_data *
+allocate_tramp_ctrl (struct tramp_ctrl_data *parent)
+{
+  struct tramp_ctrl_data *p = malloc (sizeof (struct tramp_ctrl_data));
+  if (p == NULL)
+    return NULL;
+
+  p->trampolines = allocate_trampoline_page ();
+
+  if (p->trampolines == MAP_FAILED)
+    return NULL;
+
+  p->prev = parent;
+  p->free_trampolines = get_trampolines_per_page();
+
+  return p;
+}
+
+void
+__builtin_nested_func_ptr_created (void *chain, void *func, void **dst)
+{
+  if (tramp_ctrl_curr == NULL)
+    {
+      tramp_ctrl_curr = allocate_tramp_ctrl (NULL);
+      if (tramp_ctrl_curr == NULL)
+	abort ();
+    }
+
+  if (tramp_ctrl_curr->free_trampolines == 0)
+    {
+      void *tramp_ctrl = allocate_tramp_ctrl (tramp_ctrl_curr);
+      if (!tramp_ctrl)
+	abort ();
+
+      tramp_ctrl_curr = tramp_ctrl;
+    }
+
+  union ix86_trampoline *trampoline
+    = &tramp_ctrl_curr->trampolines[get_trampolines_per_page ()
+				    - tramp_ctrl_curr->free_trampolines];
+
+#if __APPLE__ && __ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__ >= 101400
+  /* Disable write protection for the MAP_JIT regions in this thread (see
+     https://developer.apple.com/documentation/apple-silicon/porting-just-in-time-compilers-to-apple-silicon) */
+  pthread_jit_write_protect_np (0);
+#endif
+
+  memcpy (trampoline->insns, trampoline_insns,
+	  sizeof (trampoline_insns));
+  trampoline->fields.func_ptr = func;
+  trampoline->fields.chain_ptr = chain;
+
+#if __APPLE__ && __ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__ >= 101400
+  /* Re-enable write protection.  */
+  pthread_jit_write_protect_np (1);
+#endif
+
+  tramp_ctrl_curr->free_trampolines -= 1;
+
+  __builtin___clear_cache ((void *)trampoline->insns,
+			   ((void *)trampoline->insns + sizeof(trampoline->insns)));
+
+  *dst = &trampoline->insns;
+}
+
+void
+__builtin_nested_func_ptr_deleted (void)
+{
+  if (tramp_ctrl_curr == NULL)
+    abort ();
+
+  tramp_ctrl_curr->free_trampolines += 1;
+
+  if (tramp_ctrl_curr->free_trampolines == get_trampolines_per_page ())
+    {
+      if (tramp_ctrl_curr->prev == NULL)
+	return;
+
+      munmap (tramp_ctrl_curr->trampolines, getpagesize ());
+      struct tramp_ctrl_data *prev = tramp_ctrl_curr->prev;
+      free (tramp_ctrl_curr);
+      tramp_ctrl_curr = prev;
+    }
+}
diff --git a/libgcc/config/i386/t-heap-trampoline b/libgcc/config/i386/t-heap-trampoline
new file mode 100644
index 00000000000..613f635b1f6
--- /dev/null
+++ b/libgcc/config/i386/t-heap-trampoline
@@ -0,0 +1,19 @@
+# Copyright The GNU Toolchain Authors.
+
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+LIB2ADD += $(srcdir)/config/i386/heap-trampoline.c
diff --git a/libgcc/configure b/libgcc/configure
index be5d45f1755..f607f592a90 100755
--- a/libgcc/configure
+++ b/libgcc/configure
@@ -654,6 +654,7 @@ build_cpu
 build
 with_aix_soname
 enable_vtable_verify
+heap_trampolines
 enable_shared
 libgcc_topdir
 target_alias
@@ -701,6 +702,7 @@ with_target_subdir
 with_cross_host
 with_ld
 enable_shared
+enable_heap_trampolines
 enable_vtable_verify
 with_aix_soname
 enable_version_specific_runtime_libs
@@ -1342,6 +1344,9 @@ Optional Features:
   --disable-FEATURE       do not include FEATURE (same as --enable-FEATURE=no)
   --enable-FEATURE[=ARG]  include FEATURE [ARG=yes]
   --disable-shared        don't provide a shared libgcc
+  --enable-heap-trampolines
+                  Specify whether to support generating heap trampolines
+
   --enable-vtable-verify    Enable vtable verification feature
   --enable-version-specific-runtime-libs    Specify that runtime libraries should be installed in a compiler-specific directory
   --enable-maintainer-mode
@@ -2252,6 +2257,39 @@ fi
 
 
 
+# Check whether --enable-heap-trampolines was given.
+if test "${enable_heap_trampolines+set}" = set; then :
+  enableval=$enable_heap_trampolines;
+case "$target" in
+  x86_64-*-linux* | x86_64-*-darwin1[4-9]* | x86_64-*-darwin2*)
+    heap_trampolines=$enableval
+    ;;
+  aarch64*-*-linux* )
+    heap_trampolines=$enableval
+    ;;
+  aarch64*-*darwin* )
+    heap_trampolines=$enableval
+    ;;
+  *)
+    as_fn_error $? "Configure option --enable-heap-trampolines is not supported \
+for this platform" "$LINENO" 5
+    heap_trampolines=no
+    ;;
+esac
+else
+
+case "$target" in
+  *-*-darwin2*)
+    heap_trampolines=yes
+    ;;
+  *)
+    heap_trampolines=no
+    ;;
+esac
+fi
+
+
+
 # Check whether --enable-vtable-verify was given.
 if test "${enable_vtable_verify+set}" = set; then :
   enableval=$enable_vtable_verify; case "$enableval" in
diff --git a/libgcc/configure.ac b/libgcc/configure.ac
index 2fc9d5d7c93..459657838e0 100644
--- a/libgcc/configure.ac
+++ b/libgcc/configure.ac
@@ -68,6 +68,35 @@ AC_ARG_ENABLE(shared,
 ], [enable_shared=yes])
 AC_SUBST(enable_shared)
 
+AC_ARG_ENABLE([heap-trampolines],
+  [AS_HELP_STRING([--enable-heap-trampolines],
+                  [Specify whether to support generating heap trampolines])],[
+case "$target" in
+  x86_64-*-linux* | x86_64-*-darwin1[[4-9]]* | x86_64-*-darwin2*)
+    heap_trampolines=$enableval
+    ;;
+  aarch64*-*-linux* )
+    heap_trampolines=$enableval
+    ;;
+  aarch64*-*darwin* )
+    heap_trampolines=$enableval
+    ;;
+  *)
+    AC_MSG_ERROR([Configure option --enable-heap-trampolines is not supported \
+for this platform])
+    heap_trampolines=no
+    ;;
+esac],[
+case "$target" in
+  *-*-darwin2*)
+    heap_trampolines=yes
+    ;;
+  *)
+    heap_trampolines=no
+    ;;
+esac])
+AC_SUBST(heap_trampolines)
+
 AC_ARG_ENABLE(vtable-verify,
 [  --enable-vtable-verify    Enable vtable verification feature ],
 [case "$enableval" in
diff --git a/libgcc/libgcc-std.ver.in b/libgcc/libgcc-std.ver.in
index c4f87a50e70..a48f4899eb6 100644
--- a/libgcc/libgcc-std.ver.in
+++ b/libgcc/libgcc-std.ver.in
@@ -1943,4 +1943,7 @@ GCC_4.8.0 {
 GCC_7.0.0 {
   __PFX__divmoddi4
   __PFX__divmodti4
+
+  __builtin_nested_func_ptr_created
+  __builtin_nested_func_ptr_deleted
 }
diff --git a/libgcc/libgcc2.h b/libgcc/libgcc2.h
index 3ec9bbd8164..ac7eaab4f01 100644
--- a/libgcc/libgcc2.h
+++ b/libgcc/libgcc2.h
@@ -29,6 +29,9 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #pragma GCC visibility push(default)
 #endif
 
+extern void __builtin_nested_func_ptr_created (void *, void *, void **);
+extern void __builtin_nested_func_ptr_deleted (void);
+
 extern int __gcc_bcmp (const unsigned char *, const unsigned char *, size_t);
 extern void __clear_cache (void *, void *);
 extern void __eprintf (const char *, const char *, unsigned int, const char *)
-- 
2.34.1


end of thread, other threads:[~2023-09-16 19:11 UTC | newest]

Thread overview: 15+ messages
2023-07-16 10:38 [PATCH] core: Support heap-based trampolines FX Coudert
2023-07-17  6:31 ` Richard Biener
2023-07-17  6:43   ` FX Coudert
2023-07-17  6:58     ` Iain Sandoe
2023-07-17  7:16       ` Iain Sandoe
2023-07-19  9:04         ` Martin Uecker
2023-07-19  9:29           ` Iain Sandoe
2023-07-19 10:43             ` Martin Uecker
2023-07-19 14:23               ` Iain Sandoe
2023-07-19 15:18                 ` Martin Uecker
2023-08-05 14:20   ` FX Coudert
2023-08-20  9:43     ` FX Coudert
2023-09-06 15:44     ` FX Coudert
2023-09-14 10:18       ` Richard Biener
2023-09-16 19:10         ` Iain Sandoe