public inbox for gcc-patches@gcc.gnu.org
* [PATCH 1/4] Generate off-stack nested function trampolines
@ 2021-11-13  9:45 Maxim Blinov
  2021-11-13  9:45 ` [PATCH 2/4] Add x86_64-linux support for off-stack trampolines Maxim Blinov
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Maxim Blinov @ 2021-11-13  9:45 UTC
  To: gcc-patches; +Cc: iain, maxim.blinov, Andrew Burgess

Add support for allocating nested function trampolines on an
executable heap rather than on the stack. This is motivated by targets
such as AArch64 Darwin, which globally prohibit executing code on the
stack.

The target-specific routines for allocating and writing trampolines are
to be provided in libgcc, and are by default _not_ compiled in unless
the target specifically requires them, or you manually provide
--enable-off-stack-trampolines when configuring gcc/libgcc.

The gcc flag -foff-stack-trampolines controls whether to generate code
that instantiates trampolines on the stack, or to emit calls to
__builtin_nested_func_ptr_created and
__builtin_nested_func_ptr_deleted. Note that this flag is completely
independent of libgcc: if libgcc is for any reason missing those
symbols, you will get a link failure.
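
As a rough illustration of the lowering the flag selects, here is a
hand-expanded sketch of a function containing a nested function. The
helpers demo_created, demo_deleted and call_through are hypothetical
stand-ins invented for this example: the real libgcc entry points write
executable code, whereas the "trampoline" here is an ordinary struct so
that the sketch stays portable. The frame layout and the try/finally
shape follow what the tree-nested.c changes describe, under that
simplification.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a trampoline.  A real one is machine code
   that loads the static chain and jumps to the nested function.  */
struct demo_tramp
{
  void *chain;
  int (*func) (void *chain, int arg);
};

#define DEMO_MAX 16
static struct demo_tramp demo_pool[DEMO_MAX];
static int demo_live;		/* trampolines currently allocated */

/* Mirrors the (chain, func, dst) argument order of
   __builtin_nested_func_ptr_created.  */
static void
demo_created (void *chain, int (*func) (void *, int), void **dst)
{
  if (demo_live == DEMO_MAX)
    abort ();
  demo_pool[demo_live].chain = chain;
  demo_pool[demo_live].func = func;
  *dst = &demo_pool[demo_live++];
}

/* Like __builtin_nested_func_ptr_deleted, takes no arguments and
   releases the most recently created trampoline.  */
static void
demo_deleted (void)
{
  assert (demo_live > 0);
  demo_live--;
}

/* Stands in for an indirect call through the trampoline address.  */
static int
call_through (void *tramp, int arg)
{
  struct demo_tramp *t = tramp;
  return t->func (t->chain, arg);
}

/* Frame of the outer function: the captured variable plus a single
   pointer slot for the trampoline.  */
struct outer_frame
{
  int bias;
  void *add_bias_tramp;
};

/* The nested function after lowering; the frame arrives explicitly.  */
static int
add_bias (void *chain, int x)
{
  struct outer_frame *frame = chain;
  return x + frame->bias;
}

/* Hand-lowered form of an outer function whose nested function
   add_bias (x) returns x + bias and is called through a pointer.  */
int
outer (int bias)
{
  struct outer_frame frame = { bias, NULL };
  int result;

  demo_created (&frame, add_bias, &frame.add_bias_tramp);
  /* Body of the GIMPLE_TRY_FINALLY.  */
  result = call_through (frame.add_bias_tramp, 40);
  /* Cleanup of the GIMPLE_TRY_FINALLY.  */
  demo_deleted ();
  return result;
}

int
demo_live_count (void)
{
  return demo_live;
}
```

With these stand-ins, outer (2) evaluates to 42 and leaves no
trampoline allocated afterwards.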

This implementation imposes some implicit restrictions compared to
stack trampolines. longjmp'ing back to a state from before a trampoline
was created skips the corresponding __builtin_nested_func_ptr_deleted
call, so trampolines are leaked, starting from the beginning of the
linked list of allocated trampolines. There may be scope for
instrumenting longjmp/setjmp to trigger cleanups of trampolines.
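
The leak can be modelled without any real trampolines. In the sketch
below, demo_created and demo_deleted are hypothetical counters standing
in for the libgcc routines, and uses_nested_function plays the role of a
function that instantiates one nested function; none of these names
exist in the patch.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical stand-in for libgcc's per-thread trampoline state.  */
static int live_trampolines;

static void
demo_created (void)
{
  live_trampolines++;
}

static void
demo_deleted (void)
{
  assert (live_trampolines > 0);
  live_trampolines--;
}

static jmp_buf env;

/* Models a function with a nested function: create on entry, delete on
   normal exit.  If ESCAPE is nonzero it leaves via longjmp instead.  */
static void
uses_nested_function (int escape)
{
  demo_created ();
  if (escape)
    longjmp (env, 1);		/* Skips the demo_deleted call below.  */
  demo_deleted ();
}

/* Return the number of trampolines still live after one call.  */
int
leaked_after_call (int escape)
{
  live_trampolines = 0;
  if (setjmp (env) == 0)
    uses_nested_function (escape);
  return live_trampolines;
}
```

A normal return leaves the counter at zero, while the longjmp path
leaves one entry behind, which is the situation the paragraph above
describes.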

Co-authored-by: Andrew Burgess <andrew.burgess@embecosm.com>

gcc/ChangeLog:

        * builtins.def (BUILT_IN_NESTED_PTR_CREATED): Define.
        (BUILT_IN_NESTED_PTR_DELETED): Ditto.
        * common.opt (foff-stack-trampolines): Add flag to control
        generation of heap-based trampoline instantiation.
        * tree-nested.c (convert_tramp_reference_op): Don't bother calling
        __builtin_adjust_trampoline for the off-stack case.
        (finalize_nesting_tree_1): Emit calls to
        __builtin_nested_...{created,deleted} if we're generating with
        -foff-stack-trampolines.
        * tree.c (build_common_builtin_nodes): Build
        __builtin_nested_...{created,deleted}.
	* doc/invoke.texi (-foff-stack-trampolines): Document.

libgcc/ChangeLog:

	* configure.ac: Add configure parameter
	--enable-off-stack-trampolines, and report an error if we're
	trying to enable off-stack trampolines for a platform that doesn't
	provide any such implementation.
	* configure: Regenerate.
	* libgcc-std.ver.in: Add the new symbols.
	* libgcc2.h (__builtin_nested_func_ptr_created): Declare.
        (__builtin_nested_func_ptr_deleted): Ditto.
---
 gcc/builtins.def         |   2 +
 gcc/common.opt           |   4 ++
 gcc/config.gcc           |   7 +++
 gcc/doc/invoke.texi      |  14 +++++
 gcc/tree-nested.c        | 121 +++++++++++++++++++++++++++++++++------
 gcc/tree.c               |  17 ++++++
 libgcc/configure         |  26 +++++++++
 libgcc/configure.ac      |  17 ++++++
 libgcc/libgcc-std.ver.in |   3 +
 libgcc/libgcc2.h         |   3 +
 10 files changed, 197 insertions(+), 17 deletions(-)

diff --git a/gcc/builtins.def b/gcc/builtins.def
index 45a09b4d42d..90a94a6dd0f 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -950,6 +950,8 @@ DEF_BUILTIN_STUB (BUILT_IN_ADJUST_TRAMPOLINE, "__builtin_adjust_trampoline")
 DEF_BUILTIN_STUB (BUILT_IN_INIT_DESCRIPTOR, "__builtin_init_descriptor")
 DEF_BUILTIN_STUB (BUILT_IN_ADJUST_DESCRIPTOR, "__builtin_adjust_descriptor")
 DEF_BUILTIN_STUB (BUILT_IN_NONLOCAL_GOTO, "__builtin_nonlocal_goto")
+DEF_BUILTIN_STUB (BUILT_IN_NESTED_PTR_CREATED, "__builtin_nested_func_ptr_created")
+DEF_BUILTIN_STUB (BUILT_IN_NESTED_PTR_DELETED, "__builtin_nested_func_ptr_deleted")
 
 /* Implementing __builtin_setjmp.  */
 DEF_BUILTIN_STUB (BUILT_IN_SETJMP_SETUP, "__builtin_setjmp_setup")
diff --git a/gcc/common.opt b/gcc/common.opt
index de9b848eda5..a97aeeb2165 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2149,6 +2149,10 @@ foffload-abi=
 Common Joined RejectNegative Enum(offload_abi)
 -foffload-abi=[lp64|ilp32]	Set the ABI to use in an offload compiler.
 
+foff-stack-trampolines
+Common RejectNegative Var(flag_off_stack_trampolines) Init(OFF_STACK_TRAMPOLINES_INIT)
+Generate trampolines in executable memory rather than executable stack.
+
 Enum
 Name(offload_abi) Type(enum offload_abi) UnknownError(unknown offload ABI %qs)
 
diff --git a/gcc/config.gcc b/gcc/config.gcc
index edd12655c4a..c479aa4cc44 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1070,6 +1070,13 @@ case ${target} in
   ;;
 esac
 
+# Figure out if we need to enable -foff-stack-trampolines by default.
+case ${target} in
+*)
+  tm_defines="$tm_defines OFF_STACK_TRAMPOLINES_INIT=0"
+  ;;
+esac
+
 case ${target} in
 aarch64*-*-elf | aarch64*-*-fuchsia* | aarch64*-*-rtems*)
 	tm_file="${tm_file} dbxelf.h elfos.h newlib-stdint.h"
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 2aba4c70b44..a5db65f8721 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -660,6 +660,7 @@ Objective-C and Objective-C++ Dialects}.
 @gccoptlist{-fcall-saved-@var{reg}  -fcall-used-@var{reg} @gol
 -ffixed-@var{reg}  -fexceptions @gol
 -fnon-call-exceptions  -fdelete-dead-exceptions  -funwind-tables @gol
+-foff-stack-trampolines @gol
 -fasynchronous-unwind-tables @gol
 -fno-gnu-unique @gol
 -finhibit-size-directive  -fcommon  -fno-ident @gol
@@ -16683,6 +16684,19 @@ instructions.  It does not allow exceptions to be thrown from
 arbitrary signal handlers such as @code{SIGALRM}.  This enables
 @option{-fexceptions}.
 
+@item -foff-stack-trampolines
+@opindex foff-stack-trampolines
+Certain platforms (such as the Apple M1) do not permit an executable
+stack. Generate calls to @code{__builtin_nested_func_ptr_created} and
+@code{__builtin_nested_func_ptr_deleted} in order to allocate and
+deallocate trampoline space on the executable heap. Please note that
+these functions are implemented in libgcc, and will not be compiled in
+unless you provide @option{--enable-off-stack-trampolines} when
+building gcc.  @emph{PLEASE NOTE}: The trampolines are @emph{not}
+guaranteed to be correctly deallocated if you @code{setjmp},
+instantiate nested functions, and then @code{longjmp} back to a state
+prior to having allocated those nested functions.
+
 @item -fdelete-dead-exceptions
 @opindex fdelete-dead-exceptions
 Consider that instructions that may throw exceptions but don't otherwise
diff --git a/gcc/tree-nested.c b/gcc/tree-nested.c
index c7f50ebd21c..a405c905e1d 100644
--- a/gcc/tree-nested.c
+++ b/gcc/tree-nested.c
@@ -611,6 +611,14 @@ get_trampoline_type (struct nesting_info *info)
   if (trampoline_type)
     return trampoline_type;
 
+  /* When trampolines are created off-stack then the only thing we need in the
+     local frame is a single pointer.  */
+  if (flag_off_stack_trampolines)
+    {
+      trampoline_type = build_pointer_type (void_type_node);
+      return trampoline_type;
+    }
+
   align = TRAMPOLINE_ALIGNMENT;
   size = TRAMPOLINE_SIZE;
 
@@ -2784,17 +2792,27 @@ convert_tramp_reference_op (tree *tp, int *walk_subtrees, void *data)
 
       /* Compute the address of the field holding the trampoline.  */
       x = get_frame_field (info, target_context, x, &wi->gsi);
-      x = build_addr (x);
-      x = gsi_gimplify_val (info, x, &wi->gsi);
 
-      /* Do machine-specific ugliness.  Normally this will involve
-	 computing extra alignment, but it can really be anything.  */
-      if (descr)
-	builtin = builtin_decl_implicit (BUILT_IN_ADJUST_DESCRIPTOR);
+      /* APB: We don't need to do the adjustment calls when using off-stack
+	 trampolines, any such adjustment will be done when the off-stack
+	 trampoline is created.  */
+      if (flag_off_stack_trampolines)
+	x = gsi_gimplify_val (info, x, &wi->gsi);
       else
-	builtin = builtin_decl_implicit (BUILT_IN_ADJUST_TRAMPOLINE);
-      call = gimple_build_call (builtin, 1, x);
-      x = init_tmp_var_with_call (info, &wi->gsi, call);
+	{
+	  x = build_addr (x);
+
+	  x = gsi_gimplify_val (info, x, &wi->gsi);
+
+	  /* Do machine-specific ugliness.  Normally this will involve
+	     computing extra alignment, but it can really be anything.  */
+	  if (descr)
+	    builtin = builtin_decl_implicit (BUILT_IN_ADJUST_DESCRIPTOR);
+	  else
+	    builtin = builtin_decl_implicit (BUILT_IN_ADJUST_TRAMPOLINE);
+	  call = gimple_build_call (builtin, 1, x);
+	  x = init_tmp_var_with_call (info, &wi->gsi, call);
+	}
 
       /* Cast back to the proper function type.  */
       x = build1 (NOP_EXPR, TREE_TYPE (t), x);
@@ -3373,6 +3391,7 @@ build_init_call_stmt (struct nesting_info *info, tree decl, tree field,
 static void
 finalize_nesting_tree_1 (struct nesting_info *root)
 {
+  gimple_seq cleanup_list = NULL;
   gimple_seq stmt_list = NULL;
   gimple *stmt;
   tree context = root->context;
@@ -3504,9 +3523,48 @@ finalize_nesting_tree_1 (struct nesting_info *root)
 	  if (!field)
 	    continue;
 
-	  x = builtin_decl_implicit (BUILT_IN_INIT_TRAMPOLINE);
-	  stmt = build_init_call_stmt (root, i->context, field, x);
-	  gimple_seq_add_stmt (&stmt_list, stmt);
+	  if (flag_off_stack_trampolines)
+	    {
+	      /* We pass a whole bunch of arguments to the builtin function that
+		 creates the off-stack trampoline, these are
+		 1. The nested function chain value (that must be passed to the
+		 nested function so it can find the function arguments).
+		 2. A pointer to the nested function implementation,
+		 3. The address in the local stack frame where we should write
+		 the address of the trampoline.
+
+		 When this code was originally written I just kind of threw
+		 everything at the builtin, figuring I'd work out what was
+		 actually needed later.  Only the three arguments listed
+		 above are passed now; the frame address (#1) is kept even
+		 though #3 is derived from the same frame.  */
+	      tree arg1, arg2, arg3;
+
+	      gcc_assert (DECL_STATIC_CHAIN (i->context));
+	      arg1 = build_addr (root->frame_decl);
+	      arg2 = build_addr (i->context);
+
+	      x = build3 (COMPONENT_REF, TREE_TYPE (field),
+			  root->frame_decl, field, NULL_TREE);
+	      arg3 = build_addr (x);
+
+	      x = builtin_decl_implicit (BUILT_IN_NESTED_PTR_CREATED);
+	      stmt = gimple_build_call (x, 3, arg1, arg2, arg3);
+	      gimple_seq_add_stmt (&stmt_list, stmt);
+
+	      /* This call to delete the nested function trampoline is added to
+		 the cleanup list, and called when we exit the current scope.  */
+	      x = builtin_decl_implicit (BUILT_IN_NESTED_PTR_DELETED);
+	      stmt = gimple_build_call (x, 0);
+	      gimple_seq_add_stmt (&cleanup_list, stmt);
+	    }
+	  else
+	    {
+	      /* Original code to initialise the on stack trampoline.  */
+	      x = builtin_decl_implicit (BUILT_IN_INIT_TRAMPOLINE);
+	      stmt = build_init_call_stmt (root, i->context, field, x);
+	      gimple_seq_add_stmt (&stmt_list, stmt);
+	    }
 	}
     }
 
@@ -3531,11 +3589,40 @@ finalize_nesting_tree_1 (struct nesting_info *root)
   /* If we created initialization statements, insert them.  */
   if (stmt_list)
     {
-      gbind *bind;
-      annotate_all_with_location (stmt_list, DECL_SOURCE_LOCATION (context));
-      bind = gimple_seq_first_stmt_as_a_bind (gimple_body (context));
-      gimple_seq_add_seq (&stmt_list, gimple_bind_body (bind));
-      gimple_bind_set_body (bind, stmt_list);
+      if (flag_off_stack_trampolines)
+	{
+	  /* Handle the new, off stack trampolines.  */
+	  gbind *bind;
+	  annotate_all_with_location (stmt_list, DECL_SOURCE_LOCATION (context));
+	  annotate_all_with_location (cleanup_list, DECL_SOURCE_LOCATION (context));
+	  bind = gimple_seq_first_stmt_as_a_bind (gimple_body (context));
+	  gimple_seq_add_seq (&stmt_list, gimple_bind_body (bind));
+
+	  gimple_seq xxx_list = NULL;
+
+	  if (cleanup_list != NULL)
+	    {
+	      /* We maybe shouldn't be creating this try/finally if -fno-exceptions is
+		 in use.  If this is the case, then maybe we should, instead, be
+		 inserting the cleanup code onto every path out of this function?  Not
+		 yet figured out how we would do this.  */
+	      gtry *t = gimple_build_try (stmt_list, cleanup_list, GIMPLE_TRY_FINALLY);
+	      gimple_seq_add_stmt (&xxx_list, t);
+	    }
+	  else
+	    xxx_list = stmt_list;
+
+	  gimple_bind_set_body (bind, xxx_list);
+	}
+      else
+	{
+	  /* The traditional, on stack trampolines.  */
+	  gbind *bind;
+	  annotate_all_with_location (stmt_list, DECL_SOURCE_LOCATION (context));
+	  bind = gimple_seq_first_stmt_as_a_bind (gimple_body (context));
+	  gimple_seq_add_seq (&stmt_list, gimple_bind_body (bind));
+	  gimple_bind_set_body (bind, stmt_list);
+	}
     }
 
   /* If a chain_decl was created, then it needs to be registered with
diff --git a/gcc/tree.c b/gcc/tree.c
index f2c829fa4c6..968963a3127 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -9648,6 +9648,23 @@ build_common_builtin_nodes (void)
 			"__builtin_nonlocal_goto",
 			ECF_NORETURN | ECF_NOTHROW);
 
+  tree ptr_ptr_type_node = build_pointer_type (ptr_type_node);
+
+  ftype = build_function_type_list (void_type_node,
+				    ptr_type_node, // void *chain
+				    ptr_type_node, // void *func
+				    ptr_ptr_type_node, // void **dst
+				    NULL_TREE);
+  local_define_builtin ("__builtin_nested_func_ptr_created", ftype,
+			BUILT_IN_NESTED_PTR_CREATED,
+			"__builtin_nested_func_ptr_created", ECF_NOTHROW);
+
+  ftype = build_function_type_list (void_type_node,
+				    NULL_TREE);
+  local_define_builtin ("__builtin_nested_func_ptr_deleted", ftype,
+			BUILT_IN_NESTED_PTR_DELETED,
+			"__builtin_nested_func_ptr_deleted", ECF_NOTHROW);
+
   ftype = build_function_type_list (void_type_node,
 				    ptr_type_node, ptr_type_node, NULL_TREE);
   local_define_builtin ("__builtin_setjmp_setup", ftype,
diff --git a/libgcc/configure b/libgcc/configure
index 4919a56f518..2f469219e07 100755
--- a/libgcc/configure
+++ b/libgcc/configure
@@ -654,6 +654,7 @@ build
 with_aix_soname
 enable_vtable_verify
 enable_gcov
+off_stack_trampolines
 enable_shared
 libgcc_topdir
 target_alias
@@ -701,6 +702,7 @@ with_target_subdir
 with_cross_host
 with_ld
 enable_shared
+enable_off_stack_trampolines
 enable_gcov
 enable_vtable_verify
 with_aix_soname
@@ -1342,6 +1344,9 @@ Optional Features:
   --disable-FEATURE       do not include FEATURE (same as --enable-FEATURE=no)
   --enable-FEATURE[=ARG]  include FEATURE [ARG=yes]
   --disable-shared        don't provide a shared libgcc
+  --enable-off-stack-trampolines
+                  Specify whether to support generating off-stack trampolines
+
   --disable-gcov          don't provide libgcov and related host tools
   --enable-vtable-verify    Enable vtable verification feature
   --enable-version-specific-runtime-libs    Specify that runtime libraries should be installed in a compiler-specific directory
@@ -2252,6 +2257,27 @@ fi
 
 
 
+# Check whether --enable-off-stack-trampolines was given.
+if test "${enable_off_stack_trampolines+set}" = set; then :
+  enableval=$enable_off_stack_trampolines;
+case "$target" in
+  *)
+    as_fn_error $? "Configure option --enable-off-stack-trampolines is not supported \
+for this platform" "$LINENO" 5
+    off_stack_trampolines=no
+    ;;
+esac
+else
+
+case "$target" in
+  *)
+    off_stack_trampolines=no
+    ;;
+esac
+fi
+
+
+
 # Check whether --enable-gcov was given.
 if test "${enable_gcov+set}" = set; then :
   enableval=$enable_gcov;
diff --git a/libgcc/configure.ac b/libgcc/configure.ac
index 13a80b2551b..97bbd4bd35c 100644
--- a/libgcc/configure.ac
+++ b/libgcc/configure.ac
@@ -68,6 +68,23 @@ AC_ARG_ENABLE(shared,
 ], [enable_shared=yes])
 AC_SUBST(enable_shared)
 
+AC_ARG_ENABLE([off-stack-trampolines],
+  [AS_HELP_STRING([--enable-off-stack-trampolines]
+                  [Specify whether to support generating off-stack trampolines])],[
+case "$target" in
+  *)
+    AC_MSG_ERROR([Configure option --enable-off-stack-trampolines is not supported \
+for this platform])
+    off_stack_trampolines=no
+    ;;
+esac],[
+case "$target" in
+  *)
+    off_stack_trampolines=no
+    ;;
+esac])
+AC_SUBST(off_stack_trampolines)
+
 AC_ARG_ENABLE(gcov,
 [  --disable-gcov          don't provide libgcov and related host tools],
 [], [enable_gcov=yes])
diff --git a/libgcc/libgcc-std.ver.in b/libgcc/libgcc-std.ver.in
index cea33267e53..f26ad3cdf5d 100644
--- a/libgcc/libgcc-std.ver.in
+++ b/libgcc/libgcc-std.ver.in
@@ -1943,4 +1943,7 @@ GCC_4.8.0 {
 GCC_7.0.0 {
   __PFX__divmoddi4
   __PFX__divmodti4
+
+  __builtin_nested_func_ptr_created
+  __builtin_nested_func_ptr_deleted
 }
diff --git a/libgcc/libgcc2.h b/libgcc/libgcc2.h
index 1819ff3ac3d..1a448c02c04 100644
--- a/libgcc/libgcc2.h
+++ b/libgcc/libgcc2.h
@@ -29,6 +29,9 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #pragma GCC visibility push(default)
 #endif
 
+extern void __builtin_nested_func_ptr_created (void *, void *, void **);
+extern void __builtin_nested_func_ptr_deleted (void);
+
 extern int __gcc_bcmp (const unsigned char *, const unsigned char *, size_t);
 extern void __clear_cache (void *, void *);
 extern void __eprintf (const char *, const char *, unsigned int, const char *)
-- 
2.30.1 (Apple Git-130)



* [PATCH 2/4] Add x86_64-linux support for off-stack trampolines
  2021-11-13  9:45 [PATCH 1/4] Generate off-stack nested function trampolines Maxim Blinov
@ 2021-11-13  9:45 ` Maxim Blinov
  2021-12-03  3:18   ` Jeff Law
  2021-11-13  9:45 ` [PATCH 3/4] Add aarch64-linux " Maxim Blinov
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: Maxim Blinov @ 2021-11-13  9:45 UTC
  To: gcc-patches; +Cc: iain, maxim.blinov, Andrew Burgess

Implement the __builtin_nested_func_ptr_{created,deleted} functions
for the x86_64-linux platform. This serves to exercise the
infrastructure added in libgcc (--enable-off-stack-trampolines) and
gcc (-foff-stack-trampolines) in supporting off-stack trampoline
generation, and is intended primarily for demonstration and debugging
purposes.
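
For reviewers who want to check the byte layout, the sketch below
rebuilds the trampoline template and the overlay union from
heap-trampoline.c and exposes the offsets of the two patched
immediates. It is a minimal sketch with two assumptions of mine: the
pointer fields are declared as uint64_t instead of void * so the sizes
hold on any host, and fill_trampoline, func_imm_offset and
chain_imm_offset are helper names invented here. Per the overlay,
func_ptr covers the immediate of the first movabs (into %r11, which the
final jmp uses) and chain_ptr covers the second (into %r10).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Template bytes as in heap-trampoline.c; the zeroed runs are the two
   64-bit immediates that get patched.  */
static const uint8_t trampoline_insns[] = {
  0x49, 0xbb, 0, 0, 0, 0, 0, 0, 0, 0,	/* movabs $imm64,%r11 */
  0x49, 0xba, 0, 0, 0, 0, 0, 0, 0, 0,	/* movabs $imm64,%r10 */
  0x41, 0xff, 0xe3			/* jmp *%r11 */
};

union ix86_trampoline
{
  uint8_t insns[sizeof (trampoline_insns)];

  struct __attribute__ ((packed)) fields
  {
    uint8_t insn_0[2];
    uint64_t func_ptr;		/* void * in the patch */
    uint8_t insn_1[2];
    uint64_t chain_ptr;		/* void * in the patch */
    uint8_t insn_2[3];
  } fields;
};

/* Copy the template and patch both immediates, as the patch's
   __builtin_nested_func_ptr_created does.  */
void
fill_trampoline (union ix86_trampoline *t, uint64_t func, uint64_t chain)
{
  memcpy (t->insns, trampoline_insns, sizeof (trampoline_insns));
  t->fields.func_ptr = func;
  t->fields.chain_ptr = chain;
}

/* Byte offsets of the two immediates within the trampoline.  */
size_t
func_imm_offset (void)
{
  return offsetof (struct fields, func_ptr);
}

size_t
chain_imm_offset (void)
{
  return offsetof (struct fields, chain_ptr);
}
```

The packed attribute is what keeps the 8-byte fields at offsets 2 and
12; without it the compiler would pad the struct and the overlay would
no longer line up with the instruction bytes.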

Co-authored-by: Andrew Burgess <andrew.burgess@embecosm.com>

libgcc/ChangeLog:

	* config/i386/heap-trampoline.c: New file: Implement off-stack
	trampolines for x86_64.
	* config/i386/t-heap-trampoline: Add rule to build
	config/i386/heap-trampoline.c.
	* config.host (x86_64-*-linux*): Handle
	--enable-off-stack-trampolines.
	* configure.ac (--enable-off-stack-trampolines): Permit setting
	for target x86_64-*-linux*.
	* configure: Regenerate.
---
 libgcc/config.host                   |   4 +
 libgcc/config/i386/heap-trampoline.c | 143 +++++++++++++++++++++++++++
 libgcc/config/i386/t-heap-trampoline |  21 ++++
 libgcc/configure                     |   3 +
 libgcc/configure.ac                  |   3 +
 5 files changed, 174 insertions(+)
 create mode 100644 libgcc/config/i386/heap-trampoline.c
 create mode 100644 libgcc/config/i386/t-heap-trampoline

diff --git a/libgcc/config.host b/libgcc/config.host
index 168535b1780..163cd4c4161 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -753,6 +753,10 @@ x86_64-*-linux*)
 	tmake_file="${tmake_file} i386/t-crtpc t-crtfm i386/t-crtstuff t-dfprules"
 	tm_file="${tm_file} i386/elf-lib.h"
 	md_unwind_header=i386/linux-unwind.h
+	if test x$off_stack_trampolines = xyes; then
+	    extra_parts="${extra_parts} heap-trampoline.o"
+	    tmake_file="${tmake_file} i386/t-heap-trampoline"
+	fi
 	;;
 x86_64-*-kfreebsd*-gnu)
 	extra_parts="$extra_parts crtprec32.o crtprec64.o crtprec80.o crtfastmath.o"
diff --git a/libgcc/config/i386/heap-trampoline.c b/libgcc/config/i386/heap-trampoline.c
new file mode 100644
index 00000000000..6c202660c35
--- /dev/null
+++ b/libgcc/config/i386/heap-trampoline.c
@@ -0,0 +1,143 @@
+#include <unistd.h>
+#include <sys/mman.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+
+void *allocate_trampoline_page (void);
+int get_trampolines_per_page (void);
+struct tramp_ctrl_data *allocate_tramp_ctrl (struct tramp_ctrl_data *parent);
+void *allocate_trampoline_page (void);
+
+void __builtin_nested_func_ptr_created (void *chain, void *func, void **dst);
+void __builtin_nested_func_ptr_deleted (void);
+
+struct tramp_ctrl_data;
+struct tramp_ctrl_data
+{
+  struct tramp_ctrl_data *prev;
+
+  int free_trampolines;
+
+  /* This will be pointing to an executable mmap'ed page.  */
+  union ix86_trampoline *trampolines;
+};
+
+static const uint8_t trampoline_insns[] = {
+  /* movabs $<func>,%r11  */
+  0x49, 0xbb,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+
+  /* movabs $<chain>,%r10  */
+  0x49, 0xba,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+
+  /* rex.WB jmpq *%r11  */
+  0x41, 0xff, 0xe3
+};
+
+union ix86_trampoline {
+  uint8_t insns[sizeof(trampoline_insns)];
+
+  struct __attribute__((packed)) fields {
+    uint8_t insn_0[2];
+    void *func_ptr;
+    uint8_t insn_1[2];
+    void *chain_ptr;
+    uint8_t insn_2[3];
+  } fields;
+};
+
+int
+get_trampolines_per_page (void)
+{
+  return getpagesize() / sizeof(union ix86_trampoline);
+}
+
+static _Thread_local struct tramp_ctrl_data *tramp_ctrl_curr = NULL;
+
+void *
+allocate_trampoline_page (void)
+{
+  void *page;
+
+  page = mmap (0, getpagesize (), PROT_WRITE | PROT_EXEC,
+	       MAP_ANON | MAP_PRIVATE, 0, 0);
+
+  return page;
+}
+
+struct tramp_ctrl_data *
+allocate_tramp_ctrl (struct tramp_ctrl_data *parent)
+{
+  struct tramp_ctrl_data *p = malloc (sizeof (struct tramp_ctrl_data));
+  if (p == NULL)
+    return NULL;
+
+  p->trampolines = allocate_trampoline_page ();
+
+  if (p->trampolines == MAP_FAILED)
+    return NULL;
+
+  p->prev = parent;
+  p->free_trampolines = get_trampolines_per_page();
+
+  return p;
+}
+
+void
+__builtin_nested_func_ptr_created (void *chain, void *func, void **dst)
+{
+  if (tramp_ctrl_curr == NULL)
+    {
+      tramp_ctrl_curr = allocate_tramp_ctrl (NULL);
+      if (tramp_ctrl_curr == NULL)
+	abort ();
+    }
+
+  if (tramp_ctrl_curr->free_trampolines == 0)
+    {
+      void *tramp_ctrl = allocate_tramp_ctrl (tramp_ctrl_curr);
+      if (!tramp_ctrl)
+	abort ();
+
+      tramp_ctrl_curr = tramp_ctrl;
+    }
+
+  union ix86_trampoline *trampoline
+    = &tramp_ctrl_curr->trampolines[get_trampolines_per_page ()
+				    - tramp_ctrl_curr->free_trampolines];
+
+  memcpy (trampoline->insns, trampoline_insns,
+	  sizeof(trampoline_insns));
+  trampoline->fields.func_ptr = func;
+  trampoline->fields.chain_ptr = chain;
+
+  tramp_ctrl_curr->free_trampolines -= 1;
+
+  __builtin___clear_cache ((void *)trampoline->insns,
+			   ((void *)trampoline->insns + sizeof(trampoline->insns)));
+
+  *dst = &trampoline->insns;
+}
+
+void
+__builtin_nested_func_ptr_deleted (void)
+{
+  if (tramp_ctrl_curr == NULL)
+    abort ();
+
+  tramp_ctrl_curr->free_trampolines += 1;
+
+  if (tramp_ctrl_curr->free_trampolines == get_trampolines_per_page ())
+    {
+      if (tramp_ctrl_curr->prev == NULL)
+	return;
+
+      munmap (tramp_ctrl_curr->trampolines, getpagesize());
+      struct tramp_ctrl_data *prev = tramp_ctrl_curr->prev;
+      free (tramp_ctrl_curr);
+      tramp_ctrl_curr = prev;
+    }
+}
diff --git a/libgcc/config/i386/t-heap-trampoline b/libgcc/config/i386/t-heap-trampoline
new file mode 100644
index 00000000000..d9dc9cfeac4
--- /dev/null
+++ b/libgcc/config/i386/t-heap-trampoline
@@ -0,0 +1,21 @@
+# Makefile fragment for x86_64 off-stack (heap) trampolines.
+# Copyright (C) 2012-2021 Free Software Foundation, Inc.
+# Contributed by ARM Ltd.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+LIB2ADD += $(srcdir)/config/i386/heap-trampoline.c
diff --git a/libgcc/configure b/libgcc/configure
index 2f469219e07..d9a74e3cc7c 100755
--- a/libgcc/configure
+++ b/libgcc/configure
@@ -2261,6 +2261,9 @@ fi
 if test "${enable_off_stack_trampolines+set}" = set; then :
   enableval=$enable_off_stack_trampolines;
 case "$target" in
+  x86_64-*-linux* )
+    off_stack_trampolines=$enableval
+    ;;
   *)
     as_fn_error $? "Configure option --enable-off-stack-trampolines is not supported \
 for this platform" "$LINENO" 5
diff --git a/libgcc/configure.ac b/libgcc/configure.ac
index 97bbd4bd35c..4bec0a54493 100644
--- a/libgcc/configure.ac
+++ b/libgcc/configure.ac
@@ -72,6 +72,9 @@ AC_ARG_ENABLE([off-stack-trampolines],
   [AS_HELP_STRING([--enable-off-stack-trampolines]
                   [Specify whether to support generating off-stack trampolines])],[
 case "$target" in
+  x86_64-*-linux* )
+    off_stack_trampolines=$enableval
+    ;;
   *)
     AC_MSG_ERROR([Configure option --enable-off-stack-trampolines is not supported \
 for this platform])
-- 
2.30.1 (Apple Git-130)



* [PATCH 3/4] Add aarch64-linux support for off-stack trampolines
  2021-11-13  9:45 [PATCH 1/4] Generate off-stack nested function trampolines Maxim Blinov
  2021-11-13  9:45 ` [PATCH 2/4] Add x86_64-linux support for off-stack trampolines Maxim Blinov
@ 2021-11-13  9:45 ` Maxim Blinov
  2021-12-03  3:20   ` Jeff Law
  2021-11-13  9:45 ` [PATCH 4/4] Add aarch64-darwin " Maxim Blinov
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: Maxim Blinov @ 2021-11-13  9:45 UTC
  To: gcc-patches; +Cc: iain, maxim.blinov, Andrew Burgess

Implement the __builtin_nested_func_ptr_{created,deleted} functions
for the aarch64-linux platform. This serves to exercise the
infrastructure added in libgcc (--enable-off-stack-trampolines) and
gcc (-foff-stack-trampolines) in supporting off-stack trampoline
generation, and is intended primarily for demonstration and debugging
purposes.
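
The two PC-relative loads in the AArch64 template have to land exactly
on the func_ptr and chain_ptr slots that follow the instructions. The
sketch below decodes the LDR (literal) words from the patch and computes
where each one reads from. The helper names ldr_literal_rt,
ldr_literal_offset and ldr_target are mine, the pointer fields are
declared as uint64_t rather than void * so the offsets hold on any host,
and the decoder only handles the non-negative offsets used here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct aarch64_trampoline
{
  uint32_t insns[6];
  uint64_t func_ptr;		/* void * in the patch */
  uint64_t chain_ptr;		/* void * in the patch */
};

/* Instruction words as in heap-trampoline.c.  */
static const uint32_t aarch64_trampoline_insns[] = {
  0xd503245f,			/* hint    34 */
  0x580000b1,			/* ldr     x17, .+20 */
  0x580000d2,			/* ldr     x18, .+24 */
  0xd61f0220,			/* br      x17 */
  0xd5033f9f,			/* dsb     sy */
  0xd5033fdf			/* isb */
};

/* Destination register of an LDR (literal): bits [4:0].  */
int
ldr_literal_rt (uint32_t insn)
{
  return insn & 0x1f;
}

/* Byte offset of an LDR (literal): imm19 in bits [23:5], scaled by 4.
   Sign extension is omitted; the template only uses forward offsets.  */
size_t
ldr_literal_offset (uint32_t insn)
{
  return (size_t) ((insn >> 5) & 0x7ffff) * 4;
}

/* Offset within the trampoline that instruction INDEX loads from.  */
size_t
ldr_target (unsigned int index)
{
  return index * sizeof (uint32_t)
	 + ldr_literal_offset (aarch64_trampoline_insns[index]);
}
```

With six 4-byte instructions the first data slot sits at offset 24 and
the second at 32, so the load into x17 (the branch target) reads
func_ptr and the load into x18 reads chain_ptr.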

Co-authored-by: Andrew Burgess <andrew.burgess@embecosm.com>

libgcc/ChangeLog:

	* config/aarch64/heap-trampoline.c: New file: Implement off-stack
	trampolines for aarch64.
	* config/aarch64/t-heap-trampoline: Add rule to build
	config/aarch64/heap-trampoline.c.
	* config.host (aarch64-*-linux*): Handle
	--enable-off-stack-trampolines.
	* configure.ac (--enable-off-stack-trampolines): Permit setting
	for target aarch64-*-linux*.
	* configure: Regenerate.
---
 libgcc/config.host                      |   4 +
 libgcc/config/aarch64/heap-trampoline.c | 133 ++++++++++++++++++++++++
 libgcc/config/aarch64/t-heap-trampoline |  21 ++++
 libgcc/configure                        |   3 +
 libgcc/configure.ac                     |   3 +
 5 files changed, 164 insertions(+)
 create mode 100644 libgcc/config/aarch64/heap-trampoline.c
 create mode 100644 libgcc/config/aarch64/t-heap-trampoline

diff --git a/libgcc/config.host b/libgcc/config.host
index 163cd4c4161..912477db7d9 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -388,6 +388,10 @@ aarch64*-*-linux*)
 	tmake_file="${tmake_file} ${cpu_type}/t-aarch64"
 	tmake_file="${tmake_file} ${cpu_type}/t-lse t-slibgcc-libgcc"
 	tmake_file="${tmake_file} ${cpu_type}/t-softfp t-softfp t-crtfm"
+	if test x$off_stack_trampolines = xyes; then
+	    extra_parts="$extra_parts heap-trampoline.o"
+	    tmake_file="${tmake_file} ${cpu_type}/t-heap-trampoline"
+	fi
 	;;
 aarch64*-*-vxworks7*)
 	extra_parts="$extra_parts crtfastmath.o"
diff --git a/libgcc/config/aarch64/heap-trampoline.c b/libgcc/config/aarch64/heap-trampoline.c
new file mode 100644
index 00000000000..721a2bed400
--- /dev/null
+++ b/libgcc/config/aarch64/heap-trampoline.c
@@ -0,0 +1,133 @@
+#include <unistd.h>
+#include <sys/mman.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+
+void *allocate_trampoline_page (void);
+int get_trampolines_per_page (void);
+struct tramp_ctrl_data *allocate_tramp_ctrl (struct tramp_ctrl_data *parent);
+void *allocate_trampoline_page (void);
+
+void __builtin_nested_func_ptr_created (void *chain, void *func, void **dst);
+void __builtin_nested_func_ptr_deleted (void);
+
+struct tramp_ctrl_data;
+struct tramp_ctrl_data
+{
+  struct tramp_ctrl_data *prev;
+
+  int free_trampolines;
+
+  /* This will be pointing to an executable mmap'ed page.  */
+  struct aarch64_trampoline *trampolines;
+};
+
+struct aarch64_trampoline {
+  uint32_t insns[6];
+  void *func_ptr;
+  void *chain_ptr;
+};
+
+int
+get_trampolines_per_page (void)
+{
+  return getpagesize() / sizeof(struct aarch64_trampoline);
+}
+
+static _Thread_local struct tramp_ctrl_data *tramp_ctrl_curr = NULL;
+
+void *
+allocate_trampoline_page (void)
+{
+  void *page;
+
+  page = mmap (0, getpagesize (), PROT_WRITE | PROT_EXEC,
+	       MAP_ANON | MAP_PRIVATE, 0, 0);
+
+  return page;
+}
+
+struct tramp_ctrl_data *
+allocate_tramp_ctrl (struct tramp_ctrl_data *parent)
+{
+  struct tramp_ctrl_data *p = malloc (sizeof (struct tramp_ctrl_data));
+  if (p == NULL)
+    return NULL;
+
+  p->trampolines = allocate_trampoline_page ();
+
+  if (p->trampolines == MAP_FAILED)
+    return NULL;
+
+  p->prev = parent;
+  p->free_trampolines = get_trampolines_per_page();
+
+  return p;
+}
+
+static const uint32_t aarch64_trampoline_insns[] = {
+  0xd503245f, /* hint    34 */
+  0x580000b1, /* ldr     x17, .+20 */
+  0x580000d2, /* ldr     x18, .+24 */
+  0xd61f0220, /* br      x17 */
+  0xd5033f9f, /* dsb     sy */
+  0xd5033fdf /* isb */
+};
+
+void
+__builtin_nested_func_ptr_created (void *chain, void *func, void **dst)
+{
+  if (tramp_ctrl_curr == NULL)
+    {
+      tramp_ctrl_curr = allocate_tramp_ctrl (NULL);
+      if (tramp_ctrl_curr == NULL)
+	abort ();
+    }
+
+  if (tramp_ctrl_curr->free_trampolines == 0)
+    {
+      void *tramp_ctrl = allocate_tramp_ctrl (tramp_ctrl_curr);
+      if (!tramp_ctrl)
+	abort ();
+
+      tramp_ctrl_curr = tramp_ctrl;
+    }
+
+  struct aarch64_trampoline *trampoline
+    = &tramp_ctrl_curr->trampolines[get_trampolines_per_page ()
+				    - tramp_ctrl_curr->free_trampolines];
+
+  memcpy (trampoline->insns, aarch64_trampoline_insns,
+	  sizeof(aarch64_trampoline_insns));
+  trampoline->func_ptr = func;
+  trampoline->chain_ptr = chain;
+
+  tramp_ctrl_curr->free_trampolines -= 1;
+
+  __builtin___clear_cache ((void *)trampoline->insns,
+			   ((void *)trampoline->insns + sizeof(trampoline->insns)));
+
+  *dst = &trampoline->insns;
+}
+
+void
+__builtin_nested_func_ptr_deleted (void)
+{
+  if (tramp_ctrl_curr == NULL)
+    abort ();
+
+  tramp_ctrl_curr->free_trampolines += 1;
+
+  if (tramp_ctrl_curr->free_trampolines == get_trampolines_per_page ())
+    {
+      if (tramp_ctrl_curr->prev == NULL)
+	return;
+
+      munmap (tramp_ctrl_curr->trampolines, getpagesize());
+      struct tramp_ctrl_data *prev = tramp_ctrl_curr->prev;
+      free (tramp_ctrl_curr);
+      tramp_ctrl_curr = prev;
+    }
+}
diff --git a/libgcc/config/aarch64/t-heap-trampoline b/libgcc/config/aarch64/t-heap-trampoline
new file mode 100644
index 00000000000..26b7756265e
--- /dev/null
+++ b/libgcc/config/aarch64/t-heap-trampoline
@@ -0,0 +1,21 @@
+# Machine description for AArch64 architecture.
+# Copyright (C) 2012-2021 Free Software Foundation, Inc.
+# Contributed by ARM Ltd.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+LIB2ADD += $(srcdir)/config/aarch64/heap-trampoline.c
diff --git a/libgcc/configure b/libgcc/configure
index d9a74e3cc7c..5abcea8bed3 100755
--- a/libgcc/configure
+++ b/libgcc/configure
@@ -2264,6 +2264,9 @@ case "$target" in
   x86_64-*-linux* )
     off_stack_trampolines=$enableval
     ;;
+  aarch64*-*-linux* )
+    off_stack_trampolines=$enableval
+    ;;
   *)
     as_fn_error $? "Configure option --enable-off-stack-trampolines is not supported \
 for this platform" "$LINENO" 5
diff --git a/libgcc/configure.ac b/libgcc/configure.ac
index 4bec0a54493..c6eaceec957 100644
--- a/libgcc/configure.ac
+++ b/libgcc/configure.ac
@@ -75,6 +75,9 @@ case "$target" in
   x86_64-*-linux* )
     off_stack_trampolines=$enableval
     ;;
+  aarch64*-*-linux* )
+    off_stack_trampolines=$enableval
+    ;;
   *)
     AC_MSG_ERROR([Configure option --enable-off-stack-trampolines is not supported \
 for this platform])
-- 
2.30.1 (Apple Git-130)


* [PATCH 4/4] Add aarch64-darwin support for off-stack trampolines
  2021-11-13  9:45 [PATCH 1/4] Generate off-stack nested function trampolines Maxim Blinov
  2021-11-13  9:45 ` [PATCH 2/4] Add x86_64-linux support for off-stack trampolines Maxim Blinov
  2021-11-13  9:45 ` [PATCH 3/4] Add aarch64-linux " Maxim Blinov
@ 2021-11-13  9:45 ` Maxim Blinov
  2021-11-22 14:49   ` [PATCH 0/4] " Maxim Blinov
  2021-11-16  0:19 ` [PATCH 1/4] Generate off-stack nested function trampolines Joseph Myers
  2021-12-03  3:17 ` Jeff Law
  4 siblings, 1 reply; 12+ messages in thread
From: Maxim Blinov @ 2021-11-13  9:45 UTC (permalink / raw)
  To: gcc-patches; +Cc: iain, maxim.blinov, Andrew Burgess

Note: This patch is not yet ready for trunk as it's dependent on some
patches that are not yet upstream; however, it serves as motivation for
the previous patch(es), which are independent.

----

Implement the __builtin_nested_func_ptr_{created,deleted} functions
for the aarch64-darwin platform. For this platform
--enable-off-stack-trampolines is enabled by default, and
-foff-stack-trampolines is enabled by default if the host macOS
operating system version is 11.x or 12.x (the versions that the
config.gcc check below currently matches).

Co-authored-by: Andrew Burgess <andrew.burgess@embecosm.com>

libgcc/ChangeLog:

        * config/aarch64/heap-trampoline.c (allocate_trampoline_page):
	Request for MAP_JIT in the case of __APPLE__.
	Provide __APPLE__ variant of aarch64_trampoline_insns that uses
	x16 as the chain pointer.
	(__builtin_nested_func_ptr_created): Call
	pthread_jit_write_protect_np() to toggle read/write permission on
	page.
        * config.host (aarch64*-*darwin* | arm64*-*darwin*): Handle
        --enable-off-stack-trampolines.
        * configure.ac (--enable-off-stack-trampolines): Permit setting
	for target aarch64*-*darwin* | arm64*-*darwin*, and set default to
	enabled.
        * configure: Regenerate.
---
 gcc/config.gcc                          |  7 +++++
 libgcc/config.host                      |  4 +++
 libgcc/config/aarch64/heap-trampoline.c | 36 +++++++++++++++++++++++++
 libgcc/configure                        |  6 +++++
 libgcc/configure.ac                     |  6 +++++
 5 files changed, 59 insertions(+)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 031be563c5d..c13f7629d44 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1072,6 +1072,13 @@ esac
 
 # Figure out if we need to enable -foff-stack-trampolines by default.
 case ${target} in
+aarch64*-*darwin* | arm64*-*darwin*)
+  if test ${macos_maj} = 11 || test ${macos_maj} = 12; then
+    tm_defines="$tm_defines OFF_STACK_TRAMPOLINES_INIT=1"
+  else
+    tm_defines="$tm_defines OFF_STACK_TRAMPOLINES_INIT=0"
+  fi
+  ;;
 *)
   tm_defines="$tm_defines OFF_STACK_TRAMPOLINES_INIT=0"
   ;;
diff --git a/libgcc/config.host b/libgcc/config.host
index d1a491d27e7..3c536b0928a 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -414,6 +414,10 @@ aarch64*-*darwin* | arm64*-*darwin* )
 	tmake_file="${tmake_file} t-crtfm"
 	# No soft float for now because our long double is DF not TF.
 	md_unwind_header=aarch64/aarch64-unwind.h
+	if test x$off_stack_trampolines = xyes; then
+	    extra_parts="$extra_parts heap-trampoline.o"
+	    tmake_file="${tmake_file} ${cpu_type}/t-heap-trampoline"
+	fi
 	;;
 aarch64*-*-freebsd*)
 	extra_parts="$extra_parts crtfastmath.o"
diff --git a/libgcc/config/aarch64/heap-trampoline.c b/libgcc/config/aarch64/heap-trampoline.c
index 721a2bed400..6994602beaf 100644
--- a/libgcc/config/aarch64/heap-trampoline.c
+++ b/libgcc/config/aarch64/heap-trampoline.c
@@ -5,6 +5,9 @@
 #include <stdio.h>
 #include <string.h>
 
+/* For pthread_jit_write_protect_np */
+#include <pthread.h>
+
 void *allocate_trampoline_page (void);
 int get_trampolines_per_page (void);
 struct tramp_ctrl_data *allocate_tramp_ctrl (struct tramp_ctrl_data *parent);
@@ -43,8 +46,15 @@ allocate_trampoline_page (void)
 {
   void *page;
 
+#if defined(__gnu_linux__)
   page = mmap (0, getpagesize (), PROT_WRITE | PROT_EXEC,
 	       MAP_ANON | MAP_PRIVATE, 0, 0);
+#elif defined(__APPLE__)
+  page = mmap (0, getpagesize (), PROT_WRITE | PROT_EXEC,
+	       MAP_ANON | MAP_PRIVATE | MAP_JIT, 0, 0);
+#else
+  page = MAP_FAILED;
+#endif
 
   return page;
 }
@@ -67,6 +77,7 @@ allocate_tramp_ctrl (struct tramp_ctrl_data *parent)
   return p;
 }
 
+#if defined(__gnu_linux__)
 static const uint32_t aarch64_trampoline_insns[] = {
   0xd503245f, /* hint    34 */
   0x580000b1, /* ldr     x17, .+20 */
@@ -76,6 +87,20 @@ static const uint32_t aarch64_trampoline_insns[] = {
   0xd5033fdf /* isb */
 };
 
+#elif defined(__APPLE__)
+static const uint32_t aarch64_trampoline_insns[] = {
+  0xd503245f, /* hint    34 */
+  0x580000b1, /* ldr     x17, .+20 */
+  0x580000d0, /* ldr     x16, .+24 */
+  0xd61f0220, /* br      x17 */
+  0xd5033f9f, /* dsb     sy */
+  0xd5033fdf /* isb */
+};
+
+#else
+#error "Unsupported AArch64 platform for heap trampolines"
+#endif
+
 void
 __builtin_nested_func_ptr_created (void *chain, void *func, void **dst)
 {
@@ -99,11 +124,22 @@ __builtin_nested_func_ptr_created (void *chain, void *func, void **dst)
     = &tramp_ctrl_curr->trampolines[get_trampolines_per_page ()
 				    - tramp_ctrl_curr->free_trampolines];
 
+#if defined(__APPLE__)
+  /* Disable write protection for the MAP_JIT regions in this thread (see
+     https://developer.apple.com/documentation/apple-silicon/porting-just-in-time-compilers-to-apple-silicon) */
+  pthread_jit_write_protect_np (0);
+#endif
+
   memcpy (trampoline->insns, aarch64_trampoline_insns,
 	  sizeof(aarch64_trampoline_insns));
   trampoline->func_ptr = func;
   trampoline->chain_ptr = chain;
 
+#if defined(__APPLE__)
+  /* Re-enable write protection.  */
+  pthread_jit_write_protect_np (1);
+#endif
+
   tramp_ctrl_curr->free_trampolines -= 1;
 
   __builtin___clear_cache ((void *)trampoline->insns,
diff --git a/libgcc/configure b/libgcc/configure
index 5abcea8bed3..35cdf8f8c05 100755
--- a/libgcc/configure
+++ b/libgcc/configure
@@ -2267,6 +2267,9 @@ case "$target" in
   aarch64*-*-linux* )
     off_stack_trampolines=$enableval
     ;;
+  aarch64*-*darwin* | arm64*-*darwin* )
+    off_stack_trampolines=$enableval
+    ;;
   *)
     as_fn_error $? "Configure option --enable-off-stack-trampolines is not supported \
 for this platform" "$LINENO" 5
@@ -2276,6 +2279,9 @@ esac
 else
 
 case "$target" in
+  aarch64*-*darwin* | arm64*-*darwin* )
+    off_stack_trampolines=yes
+    ;;
   *)
     off_stack_trampolines=no
     ;;
diff --git a/libgcc/configure.ac b/libgcc/configure.ac
index c6eaceec957..3b129f1a4b8 100644
--- a/libgcc/configure.ac
+++ b/libgcc/configure.ac
@@ -78,6 +78,9 @@ case "$target" in
   aarch64*-*-linux* )
     off_stack_trampolines=$enableval
     ;;
+  aarch64*-*darwin* | arm64*-*darwin* )
+    off_stack_trampolines=$enableval
+    ;;
   *)
     AC_MSG_ERROR([Configure option --enable-off-stack-trampolines is not supported \
 for this platform])
@@ -85,6 +88,9 @@ for this platform])
     ;;
 esac],[
 case "$target" in
+  aarch64*-*darwin* | arm64*-*darwin* )
+    off_stack_trampolines=yes
+    ;;
   *)
     off_stack_trampolines=no
     ;;
-- 
2.30.1 (Apple Git-130)


* Re: [PATCH 1/4] Generate off-stack nested function trampolines
  2021-11-13  9:45 [PATCH 1/4] Generate off-stack nested function trampolines Maxim Blinov
                   ` (2 preceding siblings ...)
  2021-11-13  9:45 ` [PATCH 4/4] Add aarch64-darwin " Maxim Blinov
@ 2021-11-16  0:19 ` Joseph Myers
  2021-12-03  3:17 ` Jeff Law
  4 siblings, 0 replies; 12+ messages in thread
From: Joseph Myers @ 2021-11-16  0:19 UTC (permalink / raw)
  To: Maxim Blinov; +Cc: gcc-patches, iain, Andrew Burgess

On Sat, 13 Nov 2021, Maxim Blinov wrote:

> the target specifically requires it, or you manually provide
> --enable-off-stack-trampolines when configuring gcc/libgcc.

If you're adding a new configure option, it needs documenting in 
install.texi.

-- 
Joseph S. Myers
joseph@codesourcery.com

* Re: [PATCH 0/4] Add aarch64-darwin support for off-stack trampolines
  2021-11-13  9:45 ` [PATCH 4/4] Add aarch64-darwin " Maxim Blinov
@ 2021-11-22 14:49   ` Maxim Blinov
  2021-12-03  3:12     ` Jeff Law
  0 siblings, 1 reply; 12+ messages in thread
From: Maxim Blinov @ 2021-11-22 14:49 UTC (permalink / raw)
  To: GCC Patches; +Cc: Iain Sandoe, Andrew Burgess

Hi all, apologies for forgetting to add the cover letter.

The motivation of this work is to provide (limited) support for GCC
nested function trampolines on targets that do not have an executable
stack. This code has been (roughly) tested by creating several
thousand nested functions (i.e. enough to force allocation of a new
page), making sure all the nested functions execute correctly, and
then returning back up and ensuring that the pages are freed
when there are no more active trampolines in them.
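
The page bookkeeping that such a test exercises can be modelled without
any executable memory. The following is a minimal sketch, not the libgcc
code: SLOTS_PER_PAGE, struct page_ctrl, slot_created, slot_deleted, nest
and live_pages are hypothetical names, malloc stands in for mmap, and the
slot count is chosen small so that page turnover is easy to follow.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed capacity; the real value depends on the page size.  */
#define SLOTS_PER_PAGE 4

/* Hypothetical stand-in for the per-page control block.  */
struct page_ctrl
{
  struct page_ctrl *prev;
  int free_slots;
};

static struct page_ctrl *curr;
static int live_pages;

static struct page_ctrl *
new_page (struct page_ctrl *prev)
{
  struct page_ctrl *p = malloc (sizeof *p);
  if (p == NULL)
    abort ();
  p->prev = prev;
  p->free_slots = SLOTS_PER_PAGE;
  live_pages++;
  return p;
}

/* Models the "created" hook: take a slot, adding a page when full.  */
static void
slot_created (void)
{
  if (curr == NULL || curr->free_slots == 0)
    curr = new_page (curr);
  curr->free_slots--;
}

/* Models the "deleted" hook: return a slot and release a page that
   has become empty, except for the very first page.  */
static void
slot_deleted (void)
{
  assert (curr != NULL);
  curr->free_slots++;
  if (curr->free_slots == SLOTS_PER_PAGE && curr->prev != NULL)
    {
      struct page_ctrl *prev = curr->prev;
      free (curr);
      live_pages--;
      curr = prev;
    }
}

/* Take one slot per recursion level and release it on the way back.
   Returns the number of pages live at the deepest level.  */
static int
nest (int depth)
{
  int peak;
  slot_created ();
  peak = depth > 1 ? nest (depth - 1) : live_pages;
  slot_deleted ();
  return peak;
}
```

With four slots per page, nest (10) reaches three live pages at the
deepest level, and one page (the first, which is never released) remains
once every level has returned.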

I was provided the initial design and prototype implementation by
Andrew Burgess, and have since refactored the code to support
allocation/deallocation of trampoline pages, and added AArch64
Linux/Darwin support.

One of the limitations of the implementation in its current state is
the inability to track longjmps. There has been some discussion about
instrumenting calls to setjmp/longjmp so that the state of trampolines
is correctly tracked and freed when necessary; however, that hasn't
been worked on yet.
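
The leak can be illustrated with a small model. This is a sketch under
assumptions rather than the patch's behaviour verbatim: live_trampolines,
tramp_created, tramp_deleted, inner and run are hypothetical names, and a
plain counter stands in for the allocator state.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical counter standing in for the allocator state.  */
static int live_trampolines;
static jmp_buf env;

static void
tramp_created (void)
{
  live_trampolines++;
}

static void
tramp_deleted (void)
{
  assert (live_trampolines > 0);
  live_trampolines--;
}

/* A frame that owns one trampoline.  If it is left via longjmp, the
   matching tramp_deleted call is never reached.  */
static void
inner (int escape)
{
  tramp_created ();
  if (escape)
    longjmp (env, 1);
  tramp_deleted ();
}

/* Returns the number of trampolines still live after the call.  */
static int
run (int escape)
{
  live_trampolines = 0;
  if (setjmp (env) == 0)
    inner (escape);
  return live_trampolines;
}
```

In this model run (0) returns 0, while run (1) returns 1: the slot taken
before the longjmp is never handed back.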

On Sat, 13 Nov 2021 at 09:45, Maxim Blinov <maxim.blinov@embecosm.com> wrote:
>
> Note: This patch is not yet ready for trunk as it's dependent on some
> patches that are not yet upstream; however, it serves as motivation for
> the previous patch(es), which are independent.
>
> ----
>
> Implement the __builtin_nested_func_ptr_{created,deleted} functions
> for the aarch64-darwin platform. For this platform
> --enable-off-stack-trampolines is enabled by default, and
> -foff-stack-trampolines is enabled by default if the host MacOS
> operating system version is 11.x or greater.
>
> Co-authored-by: Andrew Burgess <andrew.burgess@embecosm.com>
>
> libgcc/ChangeLog:
>
>         * config/aarch64/heap-trampoline.c (allocate_trampoline_page):
>         Request for MAP_JIT in the case of __APPLE__.
>         Provide __APPLE__ variant of aarch64_trampoline_insns that uses
>         x16 as the chain pointer.
>         (__builtin_nested_func_ptr_created): Call
>         pthread_jit_write_protect_np() to toggle read/write permission on
>         page.
>         * config.host (aarch64*-*darwin* | arm64*-*darwin*): Handle
>         --enable-off-stack-trampolines.
>         * configure.ac (--enable-off-stack-trampolines): Permit setting
>         for target aarch64*-*darwin* | arm64*-*darwin*, and set default to
>         enabled.
>         * configure: Regenerate.
> ---
>  gcc/config.gcc                          |  7 +++++
>  libgcc/config.host                      |  4 +++
>  libgcc/config/aarch64/heap-trampoline.c | 36 +++++++++++++++++++++++++
>  libgcc/configure                        |  6 +++++
>  libgcc/configure.ac                     |  6 +++++
>  5 files changed, 59 insertions(+)
>
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index 031be563c5d..c13f7629d44 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -1072,6 +1072,13 @@ esac
>
>  # Figure out if we need to enable -foff-stack-trampolines by default.
>  case ${target} in
> +aarch64*-*darwin* | arm64*-*darwin*)
> +  if test ${macos_maj} = 11 || test ${macos_maj} = 12; then
> +    tm_defines="$tm_defines OFF_STACK_TRAMPOLINES_INIT=1"
> +  else
> +    tm_defines="$tm_defines OFF_STACK_TRAMPOLINES_INIT=0"
> +  fi
> +  ;;
>  *)
>    tm_defines="$tm_defines OFF_STACK_TRAMPOLINES_INIT=0"
>    ;;
> diff --git a/libgcc/config.host b/libgcc/config.host
> index d1a491d27e7..3c536b0928a 100644
> --- a/libgcc/config.host
> +++ b/libgcc/config.host
> @@ -414,6 +414,10 @@ aarch64*-*darwin* | arm64*-*darwin* )
>         tmake_file="${tmake_file} t-crtfm"
>         # No soft float for now because our long double is DF not TF.
>         md_unwind_header=aarch64/aarch64-unwind.h
> +       if test x$off_stack_trampolines = xyes; then
> +           extra_parts="$extra_parts heap-trampoline.o"
> +           tmake_file="${tmake_file} ${cpu_type}/t-heap-trampoline"
> +       fi
>         ;;
>  aarch64*-*-freebsd*)
>         extra_parts="$extra_parts crtfastmath.o"
> diff --git a/libgcc/config/aarch64/heap-trampoline.c b/libgcc/config/aarch64/heap-trampoline.c
> index 721a2bed400..6994602beaf 100644
> --- a/libgcc/config/aarch64/heap-trampoline.c
> +++ b/libgcc/config/aarch64/heap-trampoline.c
> @@ -5,6 +5,9 @@
>  #include <stdio.h>
>  #include <string.h>
>
> +/* For pthread_jit_write_protect_np */
> +#include <pthread.h>
> +
>  void *allocate_trampoline_page (void);
>  int get_trampolines_per_page (void);
>  struct tramp_ctrl_data *allocate_tramp_ctrl (struct tramp_ctrl_data *parent);
> @@ -43,8 +46,15 @@ allocate_trampoline_page (void)
>  {
>    void *page;
>
> +#if defined(__gnu_linux__)
>    page = mmap (0, getpagesize (), PROT_WRITE | PROT_EXEC,
>                MAP_ANON | MAP_PRIVATE, 0, 0);
> +#elif defined(__APPLE__)
> +  page = mmap (0, getpagesize (), PROT_WRITE | PROT_EXEC,
> +              MAP_ANON | MAP_PRIVATE | MAP_JIT, 0, 0);
> +#else
> +  page = MAP_FAILED;
> +#endif
>
>    return page;
>  }
> @@ -67,6 +77,7 @@ allocate_tramp_ctrl (struct tramp_ctrl_data *parent)
>    return p;
>  }
>
> +#if defined(__gnu_linux__)
>  static const uint32_t aarch64_trampoline_insns[] = {
>    0xd503245f, /* hint    34 */
>    0x580000b1, /* ldr     x17, .+20 */
> @@ -76,6 +87,20 @@ static const uint32_t aarch64_trampoline_insns[] = {
>    0xd5033fdf /* isb */
>  };
>
> +#elif defined(__APPLE__)
> +static const uint32_t aarch64_trampoline_insns[] = {
> +  0xd503245f, /* hint    34 */
> +  0x580000b1, /* ldr     x17, .+20 */
> +  0x580000d0, /* ldr     x16, .+24 */
> +  0xd61f0220, /* br      x17 */
> +  0xd5033f9f, /* dsb     sy */
> +  0xd5033fdf /* isb */
> +};
> +
> +#else
> +#error "Unsupported AArch64 platform for heap trampolines"
> +#endif
> +
>  void
>  __builtin_nested_func_ptr_created (void *chain, void *func, void **dst)
>  {
> @@ -99,11 +124,22 @@ __builtin_nested_func_ptr_created (void *chain, void *func, void **dst)
>      = &tramp_ctrl_curr->trampolines[get_trampolines_per_page ()
>                                     - tramp_ctrl_curr->free_trampolines];
>
> +#if defined(__APPLE__)
> +  /* Disable write protection for the MAP_JIT regions in this thread (see
> +     https://developer.apple.com/documentation/apple-silicon/porting-just-in-time-compilers-to-apple-silicon) */
> +  pthread_jit_write_protect_np (0);
> +#endif
> +
>    memcpy (trampoline->insns, aarch64_trampoline_insns,
>           sizeof(aarch64_trampoline_insns));
>    trampoline->func_ptr = func;
>    trampoline->chain_ptr = chain;
>
> +#if defined(__APPLE__)
> +  /* Re-enable write protection.  */
> +  pthread_jit_write_protect_np (1);
> +#endif
> +
>    tramp_ctrl_curr->free_trampolines -= 1;
>
>    __builtin___clear_cache ((void *)trampoline->insns,
> diff --git a/libgcc/configure b/libgcc/configure
> index 5abcea8bed3..35cdf8f8c05 100755
> --- a/libgcc/configure
> +++ b/libgcc/configure
> @@ -2267,6 +2267,9 @@ case "$target" in
>    aarch64*-*-linux* )
>      off_stack_trampolines=$enableval
>      ;;
> +  aarch64*-*darwin* | arm64*-*darwin* )
> +    off_stack_trampolines=$enableval
> +    ;;
>    *)
>      as_fn_error $? "Configure option --enable-off-stack-trampolines is not supported \
>  for this platform" "$LINENO" 5
> @@ -2276,6 +2279,9 @@ esac
>  else
>
>  case "$target" in
> +  aarch64*-*darwin* | arm64*-*darwin* )
> +    off_stack_trampolines=yes
> +    ;;
>    *)
>      off_stack_trampolines=no
>      ;;
> diff --git a/libgcc/configure.ac b/libgcc/configure.ac
> index c6eaceec957..3b129f1a4b8 100644
> --- a/libgcc/configure.ac
> +++ b/libgcc/configure.ac
> @@ -78,6 +78,9 @@ case "$target" in
>    aarch64*-*-linux* )
>      off_stack_trampolines=$enableval
>      ;;
> +  aarch64*-*darwin* | arm64*-*darwin* )
> +    off_stack_trampolines=$enableval
> +    ;;
>    *)
>      AC_MSG_ERROR([Configure option --enable-off-stack-trampolines is not supported \
>  for this platform])
> @@ -85,6 +88,9 @@ for this platform])
>      ;;
>  esac],[
>  case "$target" in
> +  aarch64*-*darwin* | arm64*-*darwin* )
> +    off_stack_trampolines=yes
> +    ;;
>    *)
>      off_stack_trampolines=no
>      ;;
> --
> 2.30.1 (Apple Git-130)
>

* Re: [PATCH 0/4] Add aarch64-darwin support for off-stack trampolines
  2021-11-22 14:49   ` [PATCH 0/4] " Maxim Blinov
@ 2021-12-03  3:12     ` Jeff Law
  2021-12-03  7:53       ` Iain Sandoe
  0 siblings, 1 reply; 12+ messages in thread
From: Jeff Law @ 2021-12-03  3:12 UTC (permalink / raw)
  To: Maxim Blinov, GCC Patches; +Cc: Iain Sandoe, Andrew Burgess



On 11/22/2021 7:49 AM, Maxim Blinov wrote:
> Hi all, apologies for forgetting to add the cover letter.
No worries.  I'd already assumed this was to support aarch64 trampolines 
on darwin by having them live elsewhere as managed entities.

>
> The motivation of this work is to provide (limited) support for GCC
> nested function trampolines on targets that do not have an executable
> stack. This code has been (roughly) tested by creating several
> thousand nested functions (i.e. enough to force allocation of a new
> page), making sure all the nested functions execute correctly, and
> consequently returning back up and ensuring that the pages are freed
> when there are no more active trampolines in them.
Right.  I'm looking at this wondering if we should do something similar 
for our new architecture.  Avoiding executable stacks is a good thing :-)
> One of the limitations of the implementation in its current state is
> the inability to track longjmps. There has been some discussion about
> instrumenting calls to setjmp/longjmp so that the state of trampolines
> is correctly tracked and freed when necessary, however that hasn't
> been worked on yet.
So in the longjmp case, we just leak trampolines, right?  I'd think that 
should be quite uncommon.  It'd be nice to fix, but the benefits of 
non-executable stacks may ultimately be enough to overcome the limitation.

The other question is why not do a scheme similar to what Ada does with 
function descriptors?  Is that not feasible for some reason?  I realize 
that hasn't been plumbed into the C/C++ compilers, but it may be another 
viable option.

Jeff

* Re: [PATCH 1/4] Generate off-stack nested function trampolines
  2021-11-13  9:45 [PATCH 1/4] Generate off-stack nested function trampolines Maxim Blinov
                   ` (3 preceding siblings ...)
  2021-11-16  0:19 ` [PATCH 1/4] Generate off-stack nested function trampolines Joseph Myers
@ 2021-12-03  3:17 ` Jeff Law
  4 siblings, 0 replies; 12+ messages in thread
From: Jeff Law @ 2021-12-03  3:17 UTC (permalink / raw)
  To: Maxim Blinov, gcc-patches; +Cc: iain, Andrew Burgess



On 11/13/2021 2:45 AM, Maxim Blinov wrote:
> Add support for allocating nested function trampolines on an
> executable heap rather than on the stack. This is motivated by targets
> such as AArch64 Darwin, which globally prohibit executing code on the
> stack.
>
> The target-specific routines for allocating and writing trampolines is
> to be provided in libgcc, and is by-default _not_ compiled in unless
> the target specifically requires it, or you manually provide
> --enable-off-stack-trampolines when configuring gcc/libgcc.
>
> The gcc flag -foff-stack-trampolines controls whether to generate code
> that instantiates trampolines on the stack, or to emit calls to
> __builtin_nested_func_ptr_created and
> __builtin_nested_func_ptr_deleted. Note that this flag is completely
> independent of libgcc: If libgcc is for any reason missing those
> symbols, you will get a link failure.
>
> This implementation imposes some implicit restrictions as compared to
> stack trampolines. longjmp'ing back to a state before a trampoline was
> created will cause us to skip over the corresponding
> __builtin_nested_func_ptr_deleted, which will leak trampolines
> starting from the beginning of the linked list of allocated
> trampolines. There may be scope for instrumenting longjmp/setjmp to
> trigger cleanups of trampolines.
>
> Co-authored-by: Andrew Burgess <andrew.burgess@embecosm.com>
>
> gcc/ChangeLog:
>
>          * builtins.def (BUILT_IN_NESTED_PTR_CREATED): Define.
>          (BUILT_IN_NESTED_PTR_DELETED): Ditto.
>          * common.opt (foff-stack-trampolines): Add flag to control
>          generation of heap-based trampoline instantiation.
>          * tree-nested.c (convert_tramp_reference_op): Don't bother calling
>          __builtin_adjust_trampoline for the off-stack case.
>          (finalize_nesting_tree_1): Emit calls to
>          __builtin_nested_...{created,deleted} if we're generating with
>          -foff-stack-trampolines.
>          * tree.c (build_common_builtin_nodes): Build
>          __builtin_nested_...{created,deleted}.
> 	* doc/invoke.texi (-foff-stack-trampolines): Document.
>
> libgcc/ChangeLog:
>
> 	* configure.ac: Add configure parameter
>          --enable-off-stack-trampolines, and do error checking if we've
>          trying to enable off-stack trampolines for a platform that doesn't
>          provide any such implementation.
> 	* configure: Regenerate.
> 	* libgcc-std.ver.in: Ditto.
> 	* libgcc2.h (__builtin_nested_func_ptr_created): Declare.
>          (__builtin_nested_func_ptr_deleted): Ditto.
I'd tend to lean away from having this be a compile-time flag.  For 
aarch64-darwin, we'd just enable it and not allow it to be selectable.  
Similarly for other targets that have implementations.  Ultimately this 
is an ABI decision, so a target needs to make a selection and then stick 
with it IMHO.

Aside from that, I don't see anything in here too terrible.

jeff

* Re: [PATCH 2/4] Add x86_64-linux support for off-stack trampolines
  2021-11-13  9:45 ` [PATCH 2/4] Add x86_64-linux support for off-stack trampolines Maxim Blinov
@ 2021-12-03  3:18   ` Jeff Law
  0 siblings, 0 replies; 12+ messages in thread
From: Jeff Law @ 2021-12-03  3:18 UTC (permalink / raw)
  To: Maxim Blinov, gcc-patches; +Cc: iain, Andrew Burgess



On 11/13/2021 2:45 AM, Maxim Blinov wrote:
> Implement the __builtin_nested_func_ptr_{created,deleted} functions
> for the x86_64-linux platform. This serves to exercise the
> infrastructure added in libgcc (--enable-off-stack-trampolines) and
> gcc (-foff-stack-trampolines) in supporting off-stack trampoline
> generation, and is intended primarily for demonstration and debugging
> purposes.
>
> Co-authored-by: Andrew Burgess <andrew.burgess@embecosm.com>
>
> libgcc/ChangeLog:
>
> 	* config/i386/heap-trampoline.c: New file: Implement off-stack
> 	trampolines for x86_64.
> 	* config/i386/t-heap-trampoline: Add rule to build
> 	config/i386/heap-trampoline.c
> 	* config.host (x86_64-*-linux*): Handle
> 	--enable-off-stack-trampolines.
> 	* configure.ac (--enable-off-stack-trampolines): Permit setting
> 	for target x86_64-*-linux*.
> 	* configure: Regenerate.
I'd probably drop this.   I realize it's useful for testing purposes, 
but I'm not sure we want it in the tree.

jeff


* Re: [PATCH 3/4] Add aarch64-linux support for off-stack trampolines
  2021-11-13  9:45 ` [PATCH 3/4] Add aarch64-linux " Maxim Blinov
@ 2021-12-03  3:20   ` Jeff Law
  0 siblings, 0 replies; 12+ messages in thread
From: Jeff Law @ 2021-12-03  3:20 UTC (permalink / raw)
  To: Maxim Blinov, gcc-patches; +Cc: iain, Andrew Burgess



On 11/13/2021 2:45 AM, Maxim Blinov wrote:
> Implement the __builtin_nested_func_ptr_{created,deleted} functions
> for the aarch64-linux platform. This serves to exercise the
> infrastructure added in libgcc (--enable-off-stack-trampolines) and
> gcc (-foff-stack-trampolines) in supporting off-stack trampoline
> generation, and is intended primarily for demonstration and debugging
> purposes.
>
> Co-authored-by: Andrew Burgess <andrew.burgess@embecosm.com>
>
> libgcc/ChangeLog:
>
>          * config/aarch64/heap-trampoline.c: New file: Implement off-stack
> 	trampolines for aarch64.
>          * config/aarch64/t-heap-trampoline: Add rule to build
>          config/aarch64/heap-trampoline.c
>          * config.host (aarch64-*-linux*): Handle
>          --enable-off-stack-trampolines.
>          * configure.ac (--enable-off-stack-trampolines): Permit setting
>          for target aarch64-*-linux*.
>          * configure: Regenerate.
I'd leave this to the aarch64 maintainers.  I don't see anything 
particularly bad.  I'd probably drop all the configure-time 
selectability stuff and just have aarch64 darwin default to this 
implementation if the basic concept goes forward.

jeff


* Re: [PATCH 0/4] Add aarch64-darwin support for off-stack trampolines
  2021-12-03  3:12     ` Jeff Law
@ 2021-12-03  7:53       ` Iain Sandoe
  2021-12-03 17:08         ` Jeff Law
  0 siblings, 1 reply; 12+ messages in thread
From: Iain Sandoe @ 2021-12-03  7:53 UTC (permalink / raw)
  To: Jeff Law; +Cc: Maxim Blinov, GCC Patches, Andrew Burgess



> On 3 Dec 2021, at 03:12, Jeff Law <jeffreyalaw@gmail.com> wrote:
> 
> 
> 
> On 11/22/2021 7:49 AM, Maxim Blinov wrote:
>> Hi all, apologies for forgetting to add the cover letter.
> No worries.  I'd already assumed this was to support aarch64 trampolines on darwin by having them live elsewhere as managed entities.
> 
>> 
>> The motivation of this work is to provide (limited) support for GCC
>> nested function trampolines on targets that do not have an executable
>> stack. This code has been (roughly) tested by creating several
>> thousand nested functions (i.e. enough to force allocation of a new
>> page), making sure all the nested functions execute correctly, and
>> consequently returning back up and ensuring that the pages are freed
>> when there are no more active trampolines in them.
> Right.  I'm looking at this wondering if we should do something similar for our new architecture.  Avoiding executable stacks is a good thing :-)
>> One of the limitations of the implementation in its current state is
>> the inability to track longjmps. There has been some discussion about
>> instrumenting calls to setjmp/longjmp so that the state of trampolines
>> is correctly tracked and freed when necessary, however that hasn't
>> been worked on yet.
> So in the longjmp case, we just leak trampolines, right?  I'd think that should be quite uncommon.  It'd be nice to fix, but the benefits of non-executable stacks may ultimately be enough to overcome the limitation.
> 
> The other question is why not do a scheme similar to what Ada does with function descriptors?  Is that not feasible for some reason?  I realize that hasn't been plumbed into the C/C++ compilers, but it may be another viable option.

The problem is that it breaks ABI ;)

[in a function ptr] we need an address bit to test to determine if we are handling a case which has the descriptor, or if we have a regular indirect call.

Unfortunately, although aarch64 aligns functions to 4 bytes, the two lower bits are reserved by Arm and therefore we’d have to force function alignment to 8 bytes and that’s an ABI break (that cannot reasonably be expected to happen) for aarch64-darwin (or any other arch that has a release in the wild, I’d imagine).
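
The address-bit test can be sketched in C as follows. This is an
illustration under assumptions, not the implementation on any branch:
DESC_TAG, struct descriptor, tag_plain, tag_desc, call_indirect, add_one
and add_chain are hypothetical names, bit 2 is used as the tag, and the
aligned attribute (a GNU extension) forces the 8-byte alignment that a
real ABI would have to guarantee for every function.

```c
#include <assert.h>
#include <stdint.h>

typedef int (*plain_fn) (int);
typedef int (*nested_fn) (void *chain, int);

/* Hypothetical descriptor: code address plus static chain.  */
struct descriptor
{
  nested_fn code;
  void *chain;
} __attribute__ ((aligned (8)));

/* Bit 2 is only free if every function is 8-byte aligned.  */
#define DESC_TAG ((uintptr_t) 4)

static uintptr_t
tag_plain (plain_fn f)
{
  /* An untagged pointer must never carry the tag bit by accident.  */
  assert (((uintptr_t) f & DESC_TAG) == 0);
  return (uintptr_t) f;
}

static uintptr_t
tag_desc (struct descriptor *d)
{
  return (uintptr_t) d | DESC_TAG;
}

/* Every indirect call site has to perform this test.  */
static int
call_indirect (uintptr_t p, int arg)
{
  if (p & DESC_TAG)
    {
      struct descriptor *d = (struct descriptor *) (p & ~DESC_TAG);
      return d->code (d->chain, arg);
    }
  return ((plain_fn) p) (arg);
}

/* An ordinary function, forced to 8-byte alignment.  */
__attribute__ ((aligned (8))) static int
add_one (int x)
{
  return x + 1;
}

/* Stands in for a nested function that reads its enclosing frame.  */
static int
add_chain (void *chain, int x)
{
  return x + *(int *) chain;
}
```

A caller would pass tag_plain (add_one) for an ordinary function and
tag_desc (&d) for a descriptor d = { add_chain, &bias }; the extra
alignment needed to keep the tag bit clear is the ABI cost described
above.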

(FWIW, This is what I’ve currently  implemented on my development branch [not for C++, since that has no nested functions].  I implemented the change for Fortran and re-used Martin Uecker’s proposed C impl. - but I used one of the reserved address bits [as a work-around to get people going with the port])

Here’s the thread discussing the situation when Martin proposed the change for C.

https://gcc.gnu.org/legacy-ml/gcc-patches/2018-12/msg01532.html

Iain


* Re: [PATCH 0/4] Add aarch64-darwin support for off-stack trampolines
  2021-12-03  7:53       ` Iain Sandoe
@ 2021-12-03 17:08         ` Jeff Law
  0 siblings, 0 replies; 12+ messages in thread
From: Jeff Law @ 2021-12-03 17:08 UTC (permalink / raw)
  To: Iain Sandoe; +Cc: Maxim Blinov, GCC Patches, Andrew Burgess



On 12/3/2021 12:53 AM, Iain Sandoe wrote:
>
>> On 3 Dec 2021, at 03:12, Jeff Law <jeffreyalaw@gmail.com> wrote:
>>
>>
>>
>> On 11/22/2021 7:49 AM, Maxim Blinov wrote:
>>> Hi all, apologies for forgetting to add the cover letter.
>> No worries.  I'd already assumed this was to support aarch64 trampolines on darwin by having them live elsewhere as managed entities.
>>
>>> The motivation of this work is to provide (limited) support for GCC
>>> nested function trampolines on targets that do not have an executable
>>> stack. This code has been (roughly) tested by creating several
>>> thousand nested functions (i.e. enough to force allocation of a new
>>> page), making sure all the nested functions execute correctly, and
>>> consequently returning back up and ensuring that the pages are freed
>>> when there are no more active trampolines in them.
>> Right.  I'm looking at this wondering if we should do something similar for our new architecture.  Avoiding executable stacks is a good thing :-)
>>> One of the limitations of the implementation in its current state is
>>> the inability to track longjmps. There has been some discussion about
>>> instrumenting calls to setjmp/longjmp so that the state of trampolines
>>> is correctly tracked and freed when necessary, however that hasn't
>>> been worked on yet.
>> So in the longjmp case, we just leak trampolines, right?  I'd think that should be quite uncommon.  It'd be nice to fix, but the benefits of non-executable stacks may ultimately be enough to overcome the limitation.
>>
>> The other question is why not do a scheme similar to what Ada does with function descriptors?  Is that not feasible for some reason?  I realize that hasn't been plumbed into the C/C++ compilers, but it may be another viable option.
> The problem is that it breaks ABI ;)
It was worth asking :-)


>
> [in a function ptr] we need an address bit to test to determine if we are handling a case which has the descriptor, or if we have a regular indirect call.
>
> Unfortunately, although aarch64 aligns functions to 4 bytes, the two lower bits are reserved by Arm and therefore we’d have to force function alignment to 8 bytes and that’s an ABI break (that cannot reasonably be expected to happen) for aarch64-darwin (or any other arch that has a release in the wild, I’d imagine).
>
> (FWIW, this is what I’ve currently implemented on my development branch [not for C++, since that has no nested functions].  I implemented the change for Fortran and re-used Martin Uecker’s proposed C impl. - but I used one of the reserved address bits [as a work-around to get people going with the port])
Good to know.  I keep thinking I should revamp how our port handles 
nested functions.  Right now we're using the tried and true trampolines, 
but we've got low bits available, so we could go with the function 
descriptor approach.  Or we could go with the trampolines-in-mmap'd-space 
approach.  I just don't want to use old-fashioned trampolines :-)
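
For reference, the address-bit test that the descriptor approach needs at an indirect call can be sketched as below. This is an illustration under assumptions, not what GCC emits: the tag is taken to be bit 0 (the real bit is target-specific, and per the discussion above AArch64 would need a higher bit plus 8-byte function alignment), `struct descriptor` and all function names are invented, and the alignment attribute is a GNU extension used to guarantee that the plain function's address has the tag bit clear.

```c
#include <assert.h>
#include <stdint.h>

#define DESCRIPTOR_BIT ((uintptr_t)1)   /* assumed tag bit; target-specific in practice */

/* A descriptor pairs the nested function's code with its static chain. */
struct descriptor {
    int (*code)(void *chain, int arg);
    void *chain;
};

/* Tag a descriptor's address so a call site can tell it from a code address. */
uintptr_t make_descriptor_ptr(struct descriptor *d)
{
    assert(((uintptr_t)d & DESCRIPTOR_BIT) == 0);  /* struct alignment keeps the bit free */
    return (uintptr_t)d | DESCRIPTOR_BIT;
}

/* What every indirect call site would have to do under this scheme. */
int call_indirect(uintptr_t fp, int arg)
{
    if (fp & DESCRIPTOR_BIT) {
        /* Tagged: strip the bit, load code and static chain, then call. */
        struct descriptor *d = (struct descriptor *)(fp & ~DESCRIPTOR_BIT);
        return d->code(d->chain, arg);
    }
    /* Untagged: a regular indirect call. */
    return ((int (*)(int))fp)(arg);
}

/* Ordinary function, forced to an alignment that leaves the tag bit clear. */
__attribute__((aligned(8))) int plain_add_one(int x)
{
    return x + 1;
}

/* Stands in for a nested function: reads its parent's variable via the chain. */
int add_captured(void *chain, int x)
{
    return x + *(int *)chain;
}

/* Demo driver: regular indirect call through an untagged pointer. */
int demo_plain_call(int x)
{
    return call_indirect((uintptr_t)plain_add_one, x);
}

/* Demo driver: call through a tagged descriptor pointer. */
int demo_descriptor_call(int captured, int x)
{
    struct descriptor d = { add_captured, &captured };
    return call_indirect(make_descriptor_ptr(&d), x);
}
```

The scheme only works if no ordinary function can ever land on an address with the tag bit set, which is why it is tied to the function alignment the ABI guarantees.
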


>
> Here’s the thread discussing the situation when Martin proposed the change for C.
>
> https://gcc.gnu.org/legacy-ml/gcc-patches/2018-12/msg01532.html
Yea, I vaguely remember the thread.  ABI stability is a pain :(

jeff



Thread overview: 12+ messages
2021-11-13  9:45 [PATCH 1/4] Generate off-stack nested function trampolines Maxim Blinov
2021-11-13  9:45 ` [PATCH 2/4] Add x86_64-linux support for off-stack trampolines Maxim Blinov
2021-12-03  3:18   ` Jeff Law
2021-11-13  9:45 ` [PATCH 3/4] Add aarch64-linux " Maxim Blinov
2021-12-03  3:20   ` Jeff Law
2021-11-13  9:45 ` [PATCH 4/4] Add aarch64-darwin " Maxim Blinov
2021-11-22 14:49   ` [PATCH 0/4] " Maxim Blinov
2021-12-03  3:12     ` Jeff Law
2021-12-03  7:53       ` Iain Sandoe
2021-12-03 17:08         ` Jeff Law
2021-11-16  0:19 ` [PATCH 1/4] Generate off-stack nested function trampolines Joseph Myers
2021-12-03  3:17 ` Jeff Law
