From: FX Coudert <fxcoudert@gmail.com>
To: Richard Biener <richard.guenther@gmail.com>,
	Jeff Law <jeffreyalaw@gmail.com>,
	GCC Patches <gcc-patches@gcc.gnu.org>
Cc: Iain Sandoe <iains.gcc@gmail.com>,
	maxim.blinov@embecosm.com, ebotcazou@adacore.com
Subject: Re: [PATCH] core: Support heap-based trampolines
Date: Wed, 6 Sep 2023 17:44:29 +0200
Message-ID: <20E41E5A-C9FE-4B9B-AEE6-CA8D18EC6635@gmail.com>
In-Reply-To: <1E30AA03-E0C1-447C-8A06-1B516B86992D@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 861 bytes --]

Hi,

ping**2 on the revised patch, for Richard or another global reviewer. So far all review feedback is that it’s a step forward, and it’s been widely used for both aarch64-darwin and x86_64-darwin distributions for almost three years now.

OK to commit?
FX



> On 5 Aug 2023, at 16:20, FX Coudert <fxcoudert@gmail.com> wrote:
> 
> Hi Richard,
> 
> Thanks for your feedback. Here is an amended version of the patch, taking into consideration your requests and the following discussion. There is no configure option for the libgcc part, and the documentation is amended. The patch is split into three commits for core, target and libgcc.
> 
> Currently regtesting on x86_64 linux and darwin (it was fine before I split up into three commits, so I’m re-testing to make sure I didn’t screw anything up).
> 
> OK to commit?
> FX


[-- Attachment #2: 0001-core-Support-heap-based-trampolines.patch --]
[-- Type: application/octet-stream, Size: 14264 bytes --]

From bfb1e356e7e6848736218608eca953569361cf18 Mon Sep 17 00:00:00 2001
From: Francois-Xavier Coudert <fxcoudert@gcc.gnu.org>
Date: Sat, 5 Aug 2023 14:54:11 +0200
Subject: [PATCH 1/3] core: Support heap-based trampolines

Generate heap-based nested function trampolines

Add support for allocating nested function trampolines on an
executable heap rather than on the stack. This is motivated by targets
such as AArch64 Darwin, which globally prohibit executing code on the
stack.

The target-specific routines for allocating and writing trampolines are
to be provided in libgcc.

The gcc flag -ftrampoline-impl controls whether to generate code
that instantiates trampolines on the stack, or to emit calls to
__builtin_nested_func_ptr_created and
__builtin_nested_func_ptr_deleted. Note that this flag is completely
independent of libgcc: If libgcc is for any reason missing those
symbols, you will get a link failure.

This implementation imposes some implicit restrictions as compared to
stack trampolines. longjmp'ing back to a state before a trampoline was
created will cause us to skip over the corresponding
__builtin_nested_func_ptr_deleted, which will leak trampolines
starting from the beginning of the linked list of allocated
trampolines. There may be scope for instrumenting longjmp/setjmp to
trigger cleanups of trampolines.
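
As an illustration of the longjmp restriction described above, here is a
minimal sketch in plain C. It does not use the real libgcc entry points:
count_created and count_deleted are hypothetical stand-ins that only count
live trampolines, and frame_with_trampoline models a function whose cleanup
would normally release its trampoline on exit.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical stand-ins for the libgcc entry points: they only count
   live trampolines instead of allocating executable memory.  */
static int live_trampolines;

static void count_created (void) { live_trampolines++; }
static void count_deleted (void) { live_trampolines--; }

static jmp_buf env;

/* Models a function containing a nested function: the "created" call
   runs on entry and the "deleted" call runs as cleanup on normal exit.  */
static void
frame_with_trampoline (int escape_with_longjmp)
{
  count_created ();
  if (escape_with_longjmp)
    longjmp (env, 1);	/* Skips the cleanup below.  */
  count_deleted ();
}

/* Returns the number of trampolines still live after one call.  */
static int
leaked_after_call (int escape_with_longjmp)
{
  live_trampolines = 0;
  if (setjmp (env) == 0)
    frame_with_trampoline (escape_with_longjmp);
  return live_trampolines;
}
```

In this model, leaked_after_call (0) leaves no live trampoline, while
leaked_after_call (1) leaves one behind because the longjmp bypasses the
cleanup call.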

Co-Authored-By: Andrew Burgess <andrew.burgess@embecosm.com>
Co-Authored-By: Iain Sandoe <iain@sandoe.co.uk>

gcc/ChangeLog:

	* builtins.def (BUILT_IN_NESTED_PTR_CREATED): Define.
	(BUILT_IN_NESTED_PTR_DELETED): Ditto.
	* common.opt (ftrampoline-impl): Add option to control
	generation of trampoline instantiation (heap or stack).
	* coretypes.h: Define enum trampoline_impl.
	* tree-nested.cc (convert_tramp_reference_op): Don't bother calling
	__builtin_adjust_trampoline for heap trampolines.
	(finalize_nesting_tree_1): Emit calls to
	__builtin_nested_func_ptr_{created,deleted} if we're generating with
	-ftrampoline-impl=heap.
	* tree.cc (build_common_builtin_nodes): Build
	__builtin_nested_func_ptr_{created,deleted}.
	* doc/invoke.texi (-ftrampoline-impl): Document.
---
 gcc/builtins.def    |   2 +
 gcc/common.opt      |  17 ++++++-
 gcc/coretypes.h     |   6 +++
 gcc/doc/invoke.texi |  17 ++++++-
 gcc/tree-nested.cc  | 121 +++++++++++++++++++++++++++++++++++++-------
 gcc/tree.cc         |  17 +++++++
 6 files changed, 161 insertions(+), 19 deletions(-)

diff --git a/gcc/builtins.def b/gcc/builtins.def
index 5953266acba..7a7987100d1 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -1074,6 +1074,8 @@ DEF_BUILTIN_STUB (BUILT_IN_ADJUST_TRAMPOLINE, "__builtin_adjust_trampoline")
 DEF_BUILTIN_STUB (BUILT_IN_INIT_DESCRIPTOR, "__builtin_init_descriptor")
 DEF_BUILTIN_STUB (BUILT_IN_ADJUST_DESCRIPTOR, "__builtin_adjust_descriptor")
 DEF_BUILTIN_STUB (BUILT_IN_NONLOCAL_GOTO, "__builtin_nonlocal_goto")
+DEF_BUILTIN_STUB (BUILT_IN_NESTED_PTR_CREATED, "__builtin_nested_func_ptr_created")
+DEF_BUILTIN_STUB (BUILT_IN_NESTED_PTR_DELETED, "__builtin_nested_func_ptr_deleted")
 
 /* Implementing __builtin_setjmp.  */
 DEF_BUILTIN_STUB (BUILT_IN_SETJMP_SETUP, "__builtin_setjmp_setup")
diff --git a/gcc/common.opt b/gcc/common.opt
index 0888c15b88f..949307a4414 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2884,10 +2884,25 @@ Common Var(flag_tracer) Optimization
 Perform superblock formation via tail duplication.
 
 ftrampolines
-Common Var(flag_trampolines) Init(0)
+Common Var(flag_trampolines) Init(HEAP_TRAMPOLINES_INIT)
 For targets that normally need trampolines for nested functions, always
 generate them instead of using descriptors.
 
+ftrampoline-impl=
+Common Joined RejectNegative Enum(trampoline_impl) Var(flag_trampoline_impl) Init(HEAP_TRAMPOLINES_INIT ? TRAMPOLINE_IMPL_HEAP : TRAMPOLINE_IMPL_STACK)
+Whether trampolines are generated in executable memory rather than
+on an executable stack.
+
+Enum
+Name(trampoline_impl) Type(enum trampoline_impl) UnknownError(unknown trampoline implementation %qs)
+
+EnumValue
+Enum(trampoline_impl) String(stack) Value(TRAMPOLINE_IMPL_STACK)
+
+EnumValue
+Enum(trampoline_impl) String(heap) Value(TRAMPOLINE_IMPL_HEAP)
+
+
 ; Zero means that floating-point math operations cannot generate a
 ; (user-visible) trap.  This is the case, for example, in nonstop
 ; IEEE 754 arithmetic.
diff --git a/gcc/coretypes.h b/gcc/coretypes.h
index ca8837cef67..7e022a427c4 100644
--- a/gcc/coretypes.h
+++ b/gcc/coretypes.h
@@ -199,6 +199,12 @@ enum tls_model {
   TLS_MODEL_LOCAL_EXEC
 };
 
+/* Types of trampoline implementation.  */
+enum trampoline_impl {
+  TRAMPOLINE_IMPL_STACK,
+  TRAMPOLINE_IMPL_HEAP
+};
+
 /* Types of ABI for an offload compiler.  */
 enum offload_abi {
   OFFLOAD_ABI_UNSET,
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 674f956f4b8..13e13728621 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -711,7 +711,8 @@ Objective-C and Objective-C++ Dialects}.
 -fverbose-asm  -fpack-struct[=@var{n}]
 -fleading-underscore  -ftls-model=@var{model}
 -fstack-reuse=@var{reuse_level}
--ftrampolines  -ftrapv  -fwrapv
+-ftrampolines -ftrampoline-impl=@r{[}stack@r{|}heap@r{]}
+-ftrapv  -fwrapv
 -fvisibility=@r{[}default@r{|}internal@r{|}hidden@r{|}protected@r{]}
 -fstrict-volatile-bitfields  -fsync-libcalls}
 
@@ -18834,6 +18835,20 @@ For languages other than Ada, the @code{-ftrampolines} and
 trampolines are always generated on platforms that need them
 for nested functions.
 
+@opindex ftrampoline-impl
+@item -ftrampoline-impl=@r{[}stack@r{|}heap@r{]}
+By default, trampolines are generated on the stack.  However, certain platforms
+(such as the Apple M1) do not permit an executable stack.  Compiling with
+@option{-ftrampoline-impl=heap} generates calls to
+@code{__builtin_nested_func_ptr_created} and
+@code{__builtin_nested_func_ptr_deleted} in order to allocate and
+deallocate trampoline space on the executable heap.  These functions are
+implemented in libgcc, and will only be provided on specific targets:
+x86_64 Darwin, x86_64 and aarch64 Linux.  @emph{PLEASE NOTE}: Heap
+trampolines are @emph{not} guaranteed to be correctly deallocated if you
+@code{setjmp}, instantiate nested functions, and then @code{longjmp} back
+to a state prior to having allocated those nested functions.
+
 @opindex fvisibility
 @item -fvisibility=@r{[}default@r{|}internal@r{|}hidden@r{|}protected@r{]}
 Set the default ELF image symbol visibility to the specified option---all
diff --git a/gcc/tree-nested.cc b/gcc/tree-nested.cc
index ae7d1f1f6a8..84ee9962485 100644
--- a/gcc/tree-nested.cc
+++ b/gcc/tree-nested.cc
@@ -611,6 +611,14 @@ get_trampoline_type (struct nesting_info *info)
   if (trampoline_type)
     return trampoline_type;
 
+  /* When trampolines are created off-stack then the only thing we need in the
+     local frame is a single pointer.  */
+  if (flag_trampoline_impl == TRAMPOLINE_IMPL_HEAP)
+    {
+      trampoline_type = build_pointer_type (void_type_node);
+      return trampoline_type;
+    }
+
   align = TRAMPOLINE_ALIGNMENT;
   size = TRAMPOLINE_SIZE;
 
@@ -2788,17 +2796,27 @@ convert_tramp_reference_op (tree *tp, int *walk_subtrees, void *data)
 
       /* Compute the address of the field holding the trampoline.  */
       x = get_frame_field (info, target_context, x, &wi->gsi);
-      x = build_addr (x);
-      x = gsi_gimplify_val (info, x, &wi->gsi);
 
-      /* Do machine-specific ugliness.  Normally this will involve
-	 computing extra alignment, but it can really be anything.  */
-      if (descr)
-	builtin = builtin_decl_implicit (BUILT_IN_ADJUST_DESCRIPTOR);
+      /* APB: We don't need to do the adjustment calls when using off-stack
+	 trampolines, any such adjustment will be done when the off-stack
+	 trampoline is created.  */
+      if (!descr && flag_trampoline_impl == TRAMPOLINE_IMPL_HEAP)
+	x = gsi_gimplify_val (info, x, &wi->gsi);
       else
-	builtin = builtin_decl_implicit (BUILT_IN_ADJUST_TRAMPOLINE);
-      call = gimple_build_call (builtin, 1, x);
-      x = init_tmp_var_with_call (info, &wi->gsi, call);
+	{
+	  x = build_addr (x);
+
+	  x = gsi_gimplify_val (info, x, &wi->gsi);
+
+	  /* Do machine-specific ugliness.  Normally this will involve
+	     computing extra alignment, but it can really be anything.  */
+	  if (descr)
+	    builtin = builtin_decl_implicit (BUILT_IN_ADJUST_DESCRIPTOR);
+	  else
+	    builtin = builtin_decl_implicit (BUILT_IN_ADJUST_TRAMPOLINE);
+	  call = gimple_build_call (builtin, 1, x);
+	  x = init_tmp_var_with_call (info, &wi->gsi, call);
+	}
 
       /* Cast back to the proper function type.  */
       x = build1 (NOP_EXPR, TREE_TYPE (t), x);
@@ -3377,6 +3395,7 @@ build_init_call_stmt (struct nesting_info *info, tree decl, tree field,
 static void
 finalize_nesting_tree_1 (struct nesting_info *root)
 {
+  gimple_seq cleanup_list = NULL;
   gimple_seq stmt_list = NULL;
   gimple *stmt;
   tree context = root->context;
@@ -3508,9 +3527,48 @@ finalize_nesting_tree_1 (struct nesting_info *root)
 	  if (!field)
 	    continue;
 
-	  x = builtin_decl_implicit (BUILT_IN_INIT_TRAMPOLINE);
-	  stmt = build_init_call_stmt (root, i->context, field, x);
-	  gimple_seq_add_stmt (&stmt_list, stmt);
+	  if (flag_trampoline_impl == TRAMPOLINE_IMPL_HEAP)
+	    {
+	      /* We pass a whole bunch of arguments to the builtin function that
+		 creates the off-stack trampoline, these are
+		 1. The nested function chain value (that must be passed to the
+		 nested function so it can find the function arguments).
+		 2. A pointer to the nested function implementation,
+		 3. The address in the local stack frame where we should write
+		 the address of the trampoline.
+
+		 When this code was originally written I just kind of threw
+		 everything at the builtin, figuring I'd work out what was
+		 actually needed later, I think, the stack pointer could
+		 certainly be dropped, arguments #2 and #4 are based off the
+		 stack pointer anyway, so #1 doesn't seem to add much value.  */
+	      tree arg1, arg2, arg3;
+
+	      gcc_assert (DECL_STATIC_CHAIN (i->context));
+	      arg1 = build_addr (root->frame_decl);
+	      arg2 = build_addr (i->context);
+
+	      x = build3 (COMPONENT_REF, TREE_TYPE (field),
+			  root->frame_decl, field, NULL_TREE);
+	      arg3 = build_addr (x);
+
+	      x = builtin_decl_implicit (BUILT_IN_NESTED_PTR_CREATED);
+	      stmt = gimple_build_call (x, 3, arg1, arg2, arg3);
+	      gimple_seq_add_stmt (&stmt_list, stmt);
+
+	      /* This call to delete the nested function trampoline is added to
+		 the cleanup list, and called when we exit the current scope.  */
+	      x = builtin_decl_implicit (BUILT_IN_NESTED_PTR_DELETED);
+	      stmt = gimple_build_call (x, 0);
+	      gimple_seq_add_stmt (&cleanup_list, stmt);
+	    }
+	  else
+	    {
+	      /* Original code to initialise the on stack trampoline.  */
+	      x = builtin_decl_implicit (BUILT_IN_INIT_TRAMPOLINE);
+	      stmt = build_init_call_stmt (root, i->context, field, x);
+	      gimple_seq_add_stmt (&stmt_list, stmt);
+	    }
 	}
     }
 
@@ -3535,11 +3593,40 @@ finalize_nesting_tree_1 (struct nesting_info *root)
   /* If we created initialization statements, insert them.  */
   if (stmt_list)
     {
-      gbind *bind;
-      annotate_all_with_location (stmt_list, DECL_SOURCE_LOCATION (context));
-      bind = gimple_seq_first_stmt_as_a_bind (gimple_body (context));
-      gimple_seq_add_seq (&stmt_list, gimple_bind_body (bind));
-      gimple_bind_set_body (bind, stmt_list);
+      if (flag_trampoline_impl == TRAMPOLINE_IMPL_HEAP)
+	{
+	  /* Handle off-stack trampolines.  */
+	  gbind *bind;
+	  annotate_all_with_location (stmt_list, DECL_SOURCE_LOCATION (context));
+	  annotate_all_with_location (cleanup_list, DECL_SOURCE_LOCATION (context));
+	  bind = gimple_seq_first_stmt_as_a_bind (gimple_body (context));
+	  gimple_seq_add_seq (&stmt_list, gimple_bind_body (bind));
+
+	  gimple_seq xxx_list = NULL;
+
+	  if (cleanup_list != NULL)
+	    {
+	      /* Maybe we shouldn't be creating this try/finally if -fno-exceptions is
+		 in use.  If this is the case, then maybe we should, instead, be
+		 inserting the cleanup code onto every path out of this function?  Not
+		 yet figured out how we would do this.  */
+	      gtry *t = gimple_build_try (stmt_list, cleanup_list, GIMPLE_TRY_FINALLY);
+	      gimple_seq_add_stmt (&xxx_list, t);
+	    }
+	  else
+	    xxx_list = stmt_list;
+
+	  gimple_bind_set_body (bind, xxx_list);
+	}
+      else
+	{
+	  /* The traditional, on stack trampolines.  */
+	  gbind *bind;
+	  annotate_all_with_location (stmt_list, DECL_SOURCE_LOCATION (context));
+	  bind = gimple_seq_first_stmt_as_a_bind (gimple_body (context));
+	  gimple_seq_add_seq (&stmt_list, gimple_bind_body (bind));
+	  gimple_bind_set_body (bind, stmt_list);
+	}
     }
 
   /* If a chain_decl was created, then it needs to be registered with
diff --git a/gcc/tree.cc b/gcc/tree.cc
index 420857b110c..3e7beba8744 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -9870,6 +9870,23 @@ build_common_builtin_nodes (void)
 			"__builtin_nonlocal_goto",
 			ECF_NORETURN | ECF_NOTHROW);
 
+  tree ptr_ptr_type_node = build_pointer_type (ptr_type_node);
+
+  ftype = build_function_type_list (void_type_node,
+				    ptr_type_node, // void *chain
+				    ptr_type_node, // void *func
+				    ptr_ptr_type_node, // void **dst
+				    NULL_TREE);
+  local_define_builtin ("__builtin_nested_func_ptr_created", ftype,
+			BUILT_IN_NESTED_PTR_CREATED,
+			"__builtin_nested_func_ptr_created", ECF_NOTHROW);
+
+  ftype = build_function_type_list (void_type_node,
+				    NULL_TREE);
+  local_define_builtin ("__builtin_nested_func_ptr_deleted", ftype,
+			BUILT_IN_NESTED_PTR_DELETED,
+			"__builtin_nested_func_ptr_deleted", ECF_NOTHROW);
+
   ftype = build_function_type_list (void_type_node,
 				    ptr_type_node, ptr_type_node, NULL_TREE);
   local_define_builtin ("__builtin_setjmp_setup", ftype,
-- 
2.39.2 (Apple Git-143)


[-- Attachment #3: 0002-target-Support-heap-based-trampolines.patch --]
[-- Type: application/octet-stream, Size: 3571 bytes --]

From a7c7415110feb085620497852776fdad7edf9116 Mon Sep 17 00:00:00 2001
From: Francois-Xavier Coudert <fxcoudert@gcc.gnu.org>
Date: Sat, 5 Aug 2023 14:56:31 +0200
Subject: [PATCH 2/3] target: Support heap-based trampolines

Enable -ftrampoline-impl=heap by default if we are on macOS 11
or later.
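
To make the interaction between the pieces easier to follow, here is a small
model in plain C of how the default is selected. It is only a sketch: the
function names default_trampoline_impl and darwin_custom_function_test are
hypothetical, and the flag values are passed as parameters rather than read
from GCC's option variables.

```c
#include <assert.h>

enum trampoline_impl { TRAMPOLINE_IMPL_STACK, TRAMPOLINE_IMPL_HEAP };

/* Mirrors the Init() expression of -ftrampoline-impl= in common.opt.
   heap_trampolines_init stands for the HEAP_TRAMPOLINES_INIT value that
   config.gcc adds to tm_defines (1 for *-*-darwin2*, 0 otherwise).  */
static enum trampoline_impl
default_trampoline_impl (int heap_trampolines_init)
{
  return heap_trampolines_init ? TRAMPOLINE_IMPL_HEAP
			       : TRAMPOLINE_IMPL_STACK;
}

/* Mirrors the Darwin definition of X86_CUSTOM_FUNCTION_TEST: custom
   function descriptors are disabled (0) only when trampolines are
   requested and they live on the heap.  */
static int
darwin_custom_function_test (int flag_trampolines,
			     enum trampoline_impl impl)
{
  return (flag_trampolines && impl == TRAMPOLINE_IMPL_HEAP) ? 0 : 1;
}
```

With HEAP_TRAMPOLINES_INIT=1, both -ftrampolines and -ftrampoline-impl=heap
are on by default, so the descriptor test evaluates to 0 on Darwin.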

Co-Authored-By: Andrew Burgess <andrew.burgess@embecosm.com>
Co-Authored-By: Iain Sandoe <iain@sandoe.co.uk>

gcc/ChangeLog:

	* config.gcc: Default to heap trampolines on macOS 11 and above.
	* config/i386/darwin.h: Define X86_CUSTOM_FUNCTION_TEST.
	* config/i386/i386.h: Define X86_CUSTOM_FUNCTION_TEST.
	* config/i386/i386.cc: Use X86_CUSTOM_FUNCTION_TEST.
---
 gcc/config.gcc           | 11 +++++++++++
 gcc/config/i386/darwin.h |  6 ++++++
 gcc/config/i386/i386.cc  |  2 +-
 gcc/config/i386/i386.h   |  6 ++++++
 4 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 415e0e1ebc5..5d70b9b4daf 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1127,6 +1127,17 @@ case ${target} in
   ;;
 esac
 
+# Figure out if we need to enable heap trampolines by default
+case ${target} in
+*-*-darwin2*)
+  # Currently, we do this for macOS 11 and above.
+  tm_defines="$tm_defines HEAP_TRAMPOLINES_INIT=1"
+  ;;
+*)
+  tm_defines="$tm_defines HEAP_TRAMPOLINES_INIT=0"
+  ;;
+esac
+
 case ${target} in
 aarch64*-*-elf | aarch64*-*-fuchsia* | aarch64*-*-rtems*)
 	tm_file="${tm_file} elfos.h newlib-stdint.h"
diff --git a/gcc/config/i386/darwin.h b/gcc/config/i386/darwin.h
index 588bd669bdd..036eefbbb95 100644
--- a/gcc/config/i386/darwin.h
+++ b/gcc/config/i386/darwin.h
@@ -308,3 +308,9 @@ along with GCC; see the file COPYING3.  If not see
 #define CLEAR_INSN_CACHE(beg, end)				\
   extern void sys_icache_invalidate(void *start, size_t len);	\
   sys_icache_invalidate ((beg), (size_t)((end)-(beg)))
+
+/* Disable custom function descriptors for Darwin when we have off-stack
+   trampolines.  */
+#undef X86_CUSTOM_FUNCTION_TEST
+#define X86_CUSTOM_FUNCTION_TEST \
+  ((flag_trampolines && flag_trampoline_impl == TRAMPOLINE_IMPL_HEAP) ? 0 : 1)
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 8cd26eb54fa..d7fe8f75c4f 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -25705,7 +25705,7 @@ ix86_libgcc_floating_mode_supported_p
 #define TARGET_HARD_REGNO_SCRATCH_OK ix86_hard_regno_scratch_ok
 
 #undef TARGET_CUSTOM_FUNCTION_DESCRIPTORS
-#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 1
+#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS X86_CUSTOM_FUNCTION_TEST
 
 #undef TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID
 #define TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID ix86_addr_space_zero_address_valid
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index ef342fcee9b..e1495e98c42 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -755,6 +755,12 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
 /* Minimum allocation boundary for the code of a function.  */
 #define FUNCTION_BOUNDARY 8
 
+/* We will AND with this value to test if a custom function descriptor needs
+   a static chain.  The function boundary must be adjusted so that the bit
+   this represents is no longer part of the address.  0 disables the custom
+   function descriptors.  */
+#define X86_CUSTOM_FUNCTION_TEST 1
+
 /* C++ stores the virtual bit in the lowest bit of function pointers.  */
 #define TARGET_PTRMEMFUNC_VBIT_LOCATION ptrmemfunc_vbit_in_pfn
 
-- 
2.39.2 (Apple Git-143)


[-- Attachment #4: 0003-libgcc-support-heap-based-trampolines.patch --]
[-- Type: application/octet-stream, Size: 15597 bytes --]

From e875cd959ea6d674530280ead2a2323bd6c2ad3a Mon Sep 17 00:00:00 2001
From: Francois-Xavier Coudert <fxcoudert@gcc.gnu.org>
Date: Sat, 5 Aug 2023 14:31:06 +0200
Subject: [PATCH 3/3] libgcc: support heap-based trampolines

Add support for heap-based trampolines on x86_64-linux, aarch64-linux,
and x86_64-darwin. Implement the __builtin_nested_func_ptr_created and
__builtin_nested_func_ptr_deleted functions for these targets.
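
The bookkeeping behind these two functions can be sketched in portable C as
follows. This is a simplified model, not the libgcc code: it uses malloc in
place of an executable mmap'ed page, a hypothetical capacity of two slots per
page (the patch derives the capacity from getpagesize () and the trampoline
size), and the names model_page, model_created and model_deleted are made up
for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumption for illustration: a tiny fixed page capacity.  */
#define MODEL_SLOTS_PER_PAGE 2

struct model_page
{
  struct model_page *prev;	/* Previously filled page, or NULL.  */
  int free_slots;		/* Unused slots left in this page.  */
};

static struct model_page *model_curr;
static int model_pages;		/* Pages currently allocated.  */

static struct model_page *
model_new_page (struct model_page *parent)
{
  struct model_page *p = malloc (sizeof *p);
  if (p == NULL)
    abort ();
  p->prev = parent;
  p->free_slots = MODEL_SLOTS_PER_PAGE;
  model_pages++;
  return p;
}

/* Hands out the next slot and returns its index within the page.  */
static int
model_created (void)
{
  if (model_curr == NULL)
    model_curr = model_new_page (NULL);
  else if (model_curr->free_slots == 0)
    model_curr = model_new_page (model_curr);

  int slot = MODEL_SLOTS_PER_PAGE - model_curr->free_slots;
  model_curr->free_slots--;
  return slot;
}

/* Releases the most recently created slot (strict LIFO order).  An
   empty page is freed unless it is the first one, which is kept.  */
static void
model_deleted (void)
{
  if (model_curr == NULL)
    abort ();
  model_curr->free_slots++;
  if (model_curr->free_slots == MODEL_SLOTS_PER_PAGE
      && model_curr->prev != NULL)
    {
      struct model_page *prev = model_curr->prev;
      free (model_curr);
      model_pages--;
      model_curr = prev;
    }
}
```

Because the delete call takes no argument, the scheme relies on trampolines
being released in the reverse order of their creation, which the try/finally
cleanup in the core patch provides on normal function exit.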

Co-Authored-By: Andrew Burgess <andrew.burgess@embecosm.com>
Co-Authored-By: Iain Sandoe <iain@sandoe.co.uk>

libgcc/ChangeLog:

	* libgcc2.h (__builtin_nested_func_ptr_created): Declare.
	(__builtin_nested_func_ptr_deleted): Declare.
	* libgcc-std.ver.in: Add the new symbols.
	* config/aarch64/heap-trampoline.c: Implement heap-based
	trampolines for aarch64.
	* config/aarch64/t-heap-trampoline: Add rule to build
	config/aarch64/heap-trampoline.c
	* config/i386/heap-trampoline.c: Implement heap-based
	trampolines for x86_64.
	* config/i386/t-heap-trampoline: Add rule to build
	config/i386/heap-trampoline.c
	* config.host: Build the heap trampoline support on
	x86_64-*-linux*, aarch64-*-linux*, x86_64-*-darwin*.
---
 libgcc/config.host                      |   3 +
 libgcc/config/aarch64/heap-trampoline.c | 172 ++++++++++++++++++++++++
 libgcc/config/aarch64/t-heap-trampoline |  19 +++
 libgcc/config/i386/heap-trampoline.c    | 172 ++++++++++++++++++++++++
 libgcc/config/i386/t-heap-trampoline    |  19 +++
 libgcc/libgcc-std.ver.in                |   3 +
 libgcc/libgcc2.h                        |   3 +
 7 files changed, 391 insertions(+)
 create mode 100644 libgcc/config/aarch64/heap-trampoline.c
 create mode 100644 libgcc/config/aarch64/t-heap-trampoline
 create mode 100644 libgcc/config/i386/heap-trampoline.c
 create mode 100644 libgcc/config/i386/t-heap-trampoline

diff --git a/libgcc/config.host b/libgcc/config.host
index c94d69d84b7..d96b02ce87f 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -423,6 +423,7 @@ aarch64*-*-linux*)
 	tmake_file="${tmake_file} ${cpu_type}/t-lse t-slibgcc-libgcc"
 	tmake_file="${tmake_file} ${cpu_type}/t-softfp t-softfp t-crtfm"
 	tmake_file="${tmake_file} t-dfprules"
+	tmake_file="${tmake_file} ${cpu_type}/t-heap-trampoline"
 	;;
 aarch64*-*-vxworks7*)
 	extra_parts="$extra_parts crtfastmath.o"
@@ -697,6 +698,7 @@ x86_64-*-darwin*)
 	tmake_file="$tmake_file i386/t-crtpc t-crtfm i386/t-msabi"
 	tm_file="$tm_file i386/darwin-lib.h"
 	extra_parts="$extra_parts crtprec32.o crtprec64.o crtprec80.o crtfastmath.o"
+	tmake_file="${tmake_file} i386/t-heap-trampoline"
 	;;
 i[34567]86-*-elfiamcu)
 	tmake_file="$tmake_file i386/t-crtstuff t-softfp-sfdftf i386/32/t-softfp i386/32/t-iamcu i386/t-softfp t-softfp t-dfprules"
@@ -763,6 +765,7 @@ x86_64-*-linux*)
 	tmake_file="${tmake_file} i386/t-crtpc t-crtfm i386/t-crtstuff t-dfprules"
 	tm_file="${tm_file} i386/elf-lib.h"
 	md_unwind_header=i386/linux-unwind.h
+	tmake_file="${tmake_file} i386/t-heap-trampoline"
 	;;
 x86_64-*-kfreebsd*-gnu)
 	extra_parts="$extra_parts crtprec32.o crtprec64.o crtprec80.o crtfastmath.o"
diff --git a/libgcc/config/aarch64/heap-trampoline.c b/libgcc/config/aarch64/heap-trampoline.c
new file mode 100644
index 00000000000..c8b83681ed7
--- /dev/null
+++ b/libgcc/config/aarch64/heap-trampoline.c
@@ -0,0 +1,172 @@
+/* Copyright The GNU Toolchain Authors. */
+
+#include <unistd.h>
+#include <sys/mman.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+
+#if __APPLE__
+/* For pthread_jit_write_protect_np */
+#include <pthread.h>
+#endif
+
+void *allocate_trampoline_page (void);
+int get_trampolines_per_page (void);
+struct tramp_ctrl_data *allocate_tramp_ctrl (struct tramp_ctrl_data *parent);
+void *allocate_trampoline_page (void);
+
+void __builtin_nested_func_ptr_created (void *chain, void *func, void **dst);
+void __builtin_nested_func_ptr_deleted (void);
+
+#if defined(__gnu_linux__)
+static const uint32_t aarch64_trampoline_insns[] = {
+  0xd503245f, /* hint    34 */
+  0x580000b1, /* ldr     x17, .+20 */
+  0x580000d2, /* ldr     x18, .+24 */
+  0xd61f0220, /* br      x17 */
+  0xd5033f9f, /* dsb     sy */
+  0xd5033fdf /* isb */
+};
+
+#elif __APPLE__
+static const uint32_t aarch64_trampoline_insns[] = {
+  0xd503245f, /* hint    34 */
+  0x580000b1, /* ldr     x17, .+20 */
+  0x580000d0, /* ldr     x16, .+24 */
+  0xd61f0220, /* br      x17 */
+  0xd5033f9f, /* dsb     sy */
+  0xd5033fdf /* isb */
+};
+
+#else
+#error "Unsupported AArch64 platform for heap trampolines"
+#endif
+
+struct aarch64_trampoline {
+  uint32_t insns[6];
+  void *func_ptr;
+  void *chain_ptr;
+};
+
+struct tramp_ctrl_data
+{
+  struct tramp_ctrl_data *prev;
+
+  int free_trampolines;
+
+  /* This will be pointing to an executable mmap'ed page.  */
+  struct aarch64_trampoline *trampolines;
+};
+
+int
+get_trampolines_per_page (void)
+{
+  return getpagesize() / sizeof(struct aarch64_trampoline);
+}
+
+static _Thread_local struct tramp_ctrl_data *tramp_ctrl_curr = NULL;
+
+void *
+allocate_trampoline_page (void)
+{
+  void *page;
+
+#if defined(__gnu_linux__)
+  page = mmap (0, getpagesize (), PROT_WRITE | PROT_EXEC,
+	       MAP_ANON | MAP_PRIVATE, 0, 0);
+#elif __APPLE__
+  page = mmap (0, getpagesize (), PROT_WRITE | PROT_EXEC,
+	       MAP_ANON | MAP_PRIVATE | MAP_JIT, 0, 0);
+#else
+  page = MAP_FAILED;
+#endif
+
+  return page;
+}
+
+struct tramp_ctrl_data *
+allocate_tramp_ctrl (struct tramp_ctrl_data *parent)
+{
+  struct tramp_ctrl_data *p = malloc (sizeof (struct tramp_ctrl_data));
+  if (p == NULL)
+    return NULL;
+
+  p->trampolines = allocate_trampoline_page ();
+
+  if (p->trampolines == MAP_FAILED)
+    return NULL;
+
+  p->prev = parent;
+  p->free_trampolines = get_trampolines_per_page();
+
+  return p;
+}
+
+void
+__builtin_nested_func_ptr_created (void *chain, void *func, void **dst)
+{
+  if (tramp_ctrl_curr == NULL)
+    {
+      tramp_ctrl_curr = allocate_tramp_ctrl (NULL);
+      if (tramp_ctrl_curr == NULL)
+	abort ();
+    }
+
+  if (tramp_ctrl_curr->free_trampolines == 0)
+    {
+      void *tramp_ctrl = allocate_tramp_ctrl (tramp_ctrl_curr);
+      if (!tramp_ctrl)
+	abort ();
+
+      tramp_ctrl_curr = tramp_ctrl;
+    }
+
+  struct aarch64_trampoline *trampoline
+    = &tramp_ctrl_curr->trampolines[get_trampolines_per_page ()
+				    - tramp_ctrl_curr->free_trampolines];
+
+#if __APPLE__
+  /* Disable write protection for the MAP_JIT regions in this thread (see
+     https://developer.apple.com/documentation/apple-silicon/porting-just-in-time-compilers-to-apple-silicon) */
+  pthread_jit_write_protect_np (0);
+#endif
+
+  memcpy (trampoline->insns, aarch64_trampoline_insns,
+	  sizeof(aarch64_trampoline_insns));
+  trampoline->func_ptr = func;
+  trampoline->chain_ptr = chain;
+
+#if __APPLE__
+  /* Re-enable write protection.  */
+  pthread_jit_write_protect_np (1);
+#endif
+
+  tramp_ctrl_curr->free_trampolines -= 1;
+
+  __builtin___clear_cache ((void *)trampoline->insns,
+			   ((void *)trampoline->insns + sizeof(trampoline->insns)));
+
+  *dst = &trampoline->insns;
+}
+
+void
+__builtin_nested_func_ptr_deleted (void)
+{
+  if (tramp_ctrl_curr == NULL)
+    abort ();
+
+  tramp_ctrl_curr->free_trampolines += 1;
+
+  if (tramp_ctrl_curr->free_trampolines == get_trampolines_per_page ())
+    {
+      if (tramp_ctrl_curr->prev == NULL)
+	return;
+
+      munmap (tramp_ctrl_curr->trampolines, getpagesize());
+      struct tramp_ctrl_data *prev = tramp_ctrl_curr->prev;
+      free (tramp_ctrl_curr);
+      tramp_ctrl_curr = prev;
+    }
+}
diff --git a/libgcc/config/aarch64/t-heap-trampoline b/libgcc/config/aarch64/t-heap-trampoline
new file mode 100644
index 00000000000..b22480800b2
--- /dev/null
+++ b/libgcc/config/aarch64/t-heap-trampoline
@@ -0,0 +1,19 @@
+# Copyright The GNU Toolchain Authors.
+
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+LIB2ADD += $(srcdir)/config/aarch64/heap-trampoline.c
diff --git a/libgcc/config/i386/heap-trampoline.c b/libgcc/config/i386/heap-trampoline.c
new file mode 100644
index 00000000000..96e13bf828e
--- /dev/null
+++ b/libgcc/config/i386/heap-trampoline.c
@@ -0,0 +1,172 @@
+/* Copyright The GNU Toolchain Authors. */
+
+#include <unistd.h>
+#include <sys/mman.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+
+#if __APPLE__ && __ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__ >= 101400
+/* For pthread_jit_write_protect_np */
+#include <pthread.h>
+#endif
+
+void *allocate_trampoline_page (void);
+int get_trampolines_per_page (void);
+struct tramp_ctrl_data *allocate_tramp_ctrl (struct tramp_ctrl_data *parent);
+void *allocate_trampoline_page (void);
+
+void __builtin_nested_func_ptr_created (void *chain, void *func, void **dst);
+void __builtin_nested_func_ptr_deleted (void);
+
+static const uint8_t trampoline_insns[] = {
+  /* movabs $<func>,%r11  */
+  0x49, 0xbb,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+
+  /* movabs $<chain>,%r10  */
+  0x49, 0xba,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+
+  /* rex.B jmpq *%r11  */
+  0x41, 0xff, 0xe3
+};
+
+union ix86_trampoline {
+  uint8_t insns[sizeof(trampoline_insns)];
+
+  struct __attribute__((packed)) fields {
+    uint8_t insn_0[2];
+    void *func_ptr;
+    uint8_t insn_1[2];
+    void *chain_ptr;
+    uint8_t insn_2[3];
+  } fields;
+};
+
+struct tramp_ctrl_data
+{
+  struct tramp_ctrl_data *prev;
+
+  int free_trampolines;
+
+  /* This will be pointing to an executable mmap'ed page.  */
+  union ix86_trampoline *trampolines;
+};
+
+int
+get_trampolines_per_page (void)
+{
+  return getpagesize() / sizeof(union ix86_trampoline);
+}
+
+static _Thread_local struct tramp_ctrl_data *tramp_ctrl_curr = NULL;
+
+void *
+allocate_trampoline_page (void)
+{
+  void *page;
+
+#if defined(__gnu_linux__)
+  page = mmap (0, getpagesize (), PROT_WRITE | PROT_EXEC,
+	       MAP_ANON | MAP_PRIVATE, 0, 0);
+#elif __APPLE__
+# if  __ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__ >= 101400
+  page = mmap (0, getpagesize (), PROT_WRITE | PROT_EXEC,
+	       MAP_ANON | MAP_PRIVATE | MAP_JIT, 0, 0);
+# else
+  page = mmap (0, getpagesize (), PROT_WRITE | PROT_EXEC,
+	       MAP_ANON | MAP_PRIVATE, 0, 0);
+# endif
+#else
+  page = MAP_FAILED;
+#endif
+
+  return page;
+}
+
+struct tramp_ctrl_data *
+allocate_tramp_ctrl (struct tramp_ctrl_data *parent)
+{
+  struct tramp_ctrl_data *p = malloc (sizeof (struct tramp_ctrl_data));
+  if (p == NULL)
+    return NULL;
+
+  p->trampolines = allocate_trampoline_page ();
+
+  if (p->trampolines == MAP_FAILED)
+    return NULL;
+
+  p->prev = parent;
+  p->free_trampolines = get_trampolines_per_page();
+
+  return p;
+}
+
+void
+__builtin_nested_func_ptr_created (void *chain, void *func, void **dst)
+{
+  if (tramp_ctrl_curr == NULL)
+    {
+      tramp_ctrl_curr = allocate_tramp_ctrl (NULL);
+      if (tramp_ctrl_curr == NULL)
+	abort ();
+    }
+
+  if (tramp_ctrl_curr->free_trampolines == 0)
+    {
+      void *tramp_ctrl = allocate_tramp_ctrl (tramp_ctrl_curr);
+      if (!tramp_ctrl)
+	abort ();
+
+      tramp_ctrl_curr = tramp_ctrl;
+    }
+
+  union ix86_trampoline *trampoline
+    = &tramp_ctrl_curr->trampolines[get_trampolines_per_page ()
+				    - tramp_ctrl_curr->free_trampolines];
+
+#if __APPLE__ && __ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__ >= 101400
+  /* Disable write protection for the MAP_JIT regions in this thread (see
+     https://developer.apple.com/documentation/apple-silicon/porting-just-in-time-compilers-to-apple-silicon) */
+  pthread_jit_write_protect_np (0);
+#endif
+
+  memcpy (trampoline->insns, trampoline_insns,
+	  sizeof(trampoline_insns));
+  trampoline->fields.func_ptr = func;
+  trampoline->fields.chain_ptr = chain;
+
+#if __APPLE__ && __ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__ >= 101400
+  /* Re-enable write protection.  */
+  pthread_jit_write_protect_np (1);
+#endif
+
+  tramp_ctrl_curr->free_trampolines -= 1;
+
+  __builtin___clear_cache ((void *)trampoline->insns,
+			   ((void *)trampoline->insns + sizeof(trampoline->insns)));
+
+  *dst = &trampoline->insns;
+}
+
+void
+__builtin_nested_func_ptr_deleted (void)
+{
+  if (tramp_ctrl_curr == NULL)
+    abort ();
+
+  tramp_ctrl_curr->free_trampolines += 1;
+
+  if (tramp_ctrl_curr->free_trampolines == get_trampolines_per_page ())
+    {
+      if (tramp_ctrl_curr->prev == NULL)
+	return;
+
+      munmap (tramp_ctrl_curr->trampolines, getpagesize ());
+      struct tramp_ctrl_data *prev = tramp_ctrl_curr->prev;
+      free (tramp_ctrl_curr);
+      tramp_ctrl_curr = prev;
+    }
+}
diff --git a/libgcc/config/i386/t-heap-trampoline b/libgcc/config/i386/t-heap-trampoline
new file mode 100644
index 00000000000..613f635b1f6
--- /dev/null
+++ b/libgcc/config/i386/t-heap-trampoline
@@ -0,0 +1,19 @@
+# Copyright The GNU Toolchain Authors.
+
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+LIB2ADD += $(srcdir)/config/i386/heap-trampoline.c
diff --git a/libgcc/libgcc-std.ver.in b/libgcc/libgcc-std.ver.in
index c4f87a50e70..a48f4899eb6 100644
--- a/libgcc/libgcc-std.ver.in
+++ b/libgcc/libgcc-std.ver.in
@@ -1943,4 +1943,7 @@ GCC_4.8.0 {
 GCC_7.0.0 {
   __PFX__divmoddi4
   __PFX__divmodti4
+
+  __builtin_nested_func_ptr_created
+  __builtin_nested_func_ptr_deleted
 }
diff --git a/libgcc/libgcc2.h b/libgcc/libgcc2.h
index 3ec9bbd8164..ac7eaab4f01 100644
--- a/libgcc/libgcc2.h
+++ b/libgcc/libgcc2.h
@@ -29,6 +29,9 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #pragma GCC visibility push(default)
 #endif
 
+extern void __builtin_nested_func_ptr_created (void *, void *, void **);
+extern void __builtin_nested_func_ptr_deleted (void);
+
 extern int __gcc_bcmp (const unsigned char *, const unsigned char *, size_t);
 extern void __clear_cache (void *, void *);
 extern void __eprintf (const char *, const char *, unsigned int, const char *)
-- 
2.39.2 (Apple Git-143)
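
For readers following the bookkeeping in __builtin_nested_func_ptr_created and __builtin_nested_func_ptr_deleted above, here is a minimal sketch of the same LIFO page-chain logic in plain C. It is an illustration under stated assumptions, not the libgcc code: calloc stands in for the mmap'd executable page, an int array stands in for the trampolines, and the names page_ctrl, new_page, slot_created, slot_deleted, page_count and SLOTS_PER_PAGE are all hypothetical. Unlike the patch, the sketch frees the control block when the page allocation fails.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for get_trampolines_per_page ().  */
#define SLOTS_PER_PAGE 4

/* Hypothetical analogue of struct tramp_ctrl_data.  */
struct page_ctrl
{
  struct page_ctrl *prev;   /* previously filled page, or NULL */
  int free_slots;           /* unused slots left in this page */
  int *slots;               /* stands in for the trampoline page */
};

static struct page_ctrl *curr;

static struct page_ctrl *
new_page (struct page_ctrl *parent)
{
  struct page_ctrl *p = malloc (sizeof *p);
  if (p == NULL)
    return NULL;

  p->slots = calloc (SLOTS_PER_PAGE, sizeof *p->slots);
  if (p->slots == NULL)
    {
      /* Release the control block so a failed allocation leaks nothing.  */
      free (p);
      return NULL;
    }

  p->prev = parent;
  p->free_slots = SLOTS_PER_PAGE;
  return p;
}

/* Hand out the next unused slot, chaining a new page when the current
   one is full (or when no page exists yet).  */
static int *
slot_created (void)
{
  if (curr == NULL || curr->free_slots == 0)
    {
      struct page_ctrl *p = new_page (curr);
      if (p == NULL)
	abort ();
      curr = p;
    }

  int *slot = &curr->slots[SLOTS_PER_PAGE - curr->free_slots];
  curr->free_slots -= 1;
  return slot;
}

/* Give back the most recently created slot.  A page that becomes
   completely free is released, unless it is the first page.  */
static void
slot_deleted (void)
{
  assert (curr != NULL);

  curr->free_slots += 1;
  if (curr->free_slots == SLOTS_PER_PAGE && curr->prev != NULL)
    {
      struct page_ctrl *prev = curr->prev;
      free (curr->slots);
      free (curr);
      curr = prev;
    }
}

/* Number of pages currently in the chain.  */
static int
page_count (void)
{
  int n = 0;
  for (struct page_ctrl *p = curr; p != NULL; p = p->prev)
    n++;
  return n;
}
```

Calling slot_created SLOTS_PER_PAGE + 1 times leaves two pages in the chain. A single slot_deleted call then releases the second page and makes the first (full) page current again. This matches the strict create/delete nesting that the patch relies on.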

