public inbox for gcc-patches@gcc.gnu.org
* [PATCH] Add __builtin_stack_top
@ 2015-08-04 12:31 H.J. Lu
  2015-08-04 15:42 ` Mike Stump
  0 siblings, 1 reply; 24+ messages in thread
From: H.J. Lu @ 2015-08-04 12:31 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: GCC Patches, Uros Bizjak

[-- Attachment #1: Type: text/plain, Size: 3173 bytes --]

On Wed, Jul 22, 2015 at 8:44 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Wed, Jul 22, 2015 at 6:59 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Wed, Jul 22, 2015 at 6:55 AM, Segher Boessenkool
>> <segher@kernel.crashing.org> wrote:
>>> On Wed, Jul 22, 2015 at 05:10:04AM -0700, H.J. Lu wrote:
>>>> I got feedback suggesting __builtin_stack_top instead of
>>>> __builtin_ia32_stack_top.  But I don't know if
>>>>
>>>> +      /* After the prologue, stack top is at -WORD(AP) in the current
>>>> +        frame.  */
>>>> +      emit_insn (gen_rtx_SET (target,
>>>> +                             plus_constant (Pmode, arg_pointer_rtx,
>>>> +                                            -UNITS_PER_WORD)));
>>>>
>>>> is true for all backends.  If it works on all backends, I can move
>>>> it to builtins.c.
>>>
>>> It doesn't afaik.  But can't you define INITIAL_FRAME_ADDRESS_RTX?
>>>
>>>
>>> Segher
>>
>> Does INITIAL_FRAME_ADDRESS_RTX point to the stack top?  It certainly
>> can't be defined for x86.  I will write a middle-end patch and leave it
>> to each backend to enable it.
>
> Here is a patch.  Any comments or feedback?

Where does this feature belong?  Middle-end or x86 backend?
Here is the updated patch to implement it in middle-end.  Any
comments?

Thanks.

> Thanks.
>
> --
> H.J.
> ---
> When __builtin_frame_address is used to retrieve the address of the
> function stack frame, the frame pointer is always kept, which wastes one
> register and 2 instructions.  For x86-32, one fewer register has a
> significant negative impact on performance.  This patch adds a new
> builtin function, __builtin_stack_top.  It returns the stack address
> on entry to the function.
>
> This patch only enables __builtin_stack_top for x86 backend.  Using
> __builtin_stack_top with other backends will lead to
>
> sorry, unimplemented: ‘__builtin_stack_top’ not supported on this target
>
> TARGET_STACK_TOP_RTX must be defined to enable __builtin_stack_top.
> default_stack_top_rtx may be extended to support more backends,
> including those with INITIAL_FRAME_ADDRESS_RTX.
>
> gcc/
>
> PR target/66960
> * builtin-types.def (BT_FN_PTR_VOID): New function type.
> * builtins.c (expand_builtin): Handle BUILT_IN_STACK_TOP.
> (is_simple_builtin): Likewise.
> * ipa-pure-const.c (special_builtin_state): Likewise.
> * builtins.def: Add BUILT_IN_STACK_TOP.
> * function.h (function): Add stack_top_taken.
> * target.def (stack_top_rtx): New target hook.
> * targhooks.c (default_stack_top_rtx): New.
> * targhooks.h (default_stack_top_rtx): Likewise.
> * config/i386/i386.c (ix86_expand_prologue): Sorry if DRAP is
> used and the stack address has been taken.
> (TARGET_STACK_TOP_RTX): New.
> * doc/extend.texi: Document __builtin_stack_top.
> * doc/tm.texi.in (TARGET_STACK_TOP_RTX): New.
> * doc/tm.texi: Regenerated.
>
> gcc/testsuite/
>
> PR target/66960
> * gcc.target/i386/pr66960-1.c: New test.
> * gcc.target/i386/pr66960-2.c: Likewise.
> * gcc.target/i386/pr66960-3.c: Likewise.
> * gcc.target/i386/pr66960-4.c: Likewise.
> * gcc.target/i386/pr66960-5.c: Likewise.



-- 
H.J.

[-- Attachment #2: 0001-Add-__builtin_stack_top.patch --]
[-- Type: text/x-patch, Size: 16688 bytes --]

From 267982f7c76cc6eece0dd7896555d27291f587ef Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Tue, 21 Jul 2015 14:32:09 -0700
Subject: [PATCH] Add __builtin_stack_top
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

When __builtin_frame_address is used to retrieve the address of the
function stack frame, the frame pointer register is required, which
wastes one register and 2 instructions.  For x86-32, one fewer register
has a significant negative impact on performance.  This patch adds a
new builtin function, __builtin_stack_top.  It returns the stack address
on entry to the function.

This patch only enables __builtin_stack_top for x86 backend.  Using
__builtin_stack_top with other backends will lead to

sorry, unimplemented: ‘__builtin_stack_top’ not supported on this target

TARGET_STACK_TOP_RTX must be defined to enable __builtin_stack_top.
default_stack_top_rtx may be extended to support more backends,
including those with INITIAL_FRAME_ADDRESS_RTX.

gcc/

	PR target/66960
	* builtin-types.def (BT_FN_PTR_VOID): New function type.
	* builtins.c (expand_builtin): Handle BUILT_IN_STACK_TOP.
	(is_simple_builtin): Likewise.
	* ipa-pure-const.c (special_builtin_state): Likewise.
	* builtins.def: Add BUILT_IN_STACK_TOP.
	* function.h (function): Add stack_top_taken.
	* target.def (stack_top_rtx): New target hook.
	* targhooks.c (default_stack_top_rtx): New.
	* targhooks.h (default_stack_top_rtx): Likewise.
	* config/i386/i386.c (ix86_expand_prologue): Sorry if DRAP is
	used and the stack address has been taken.
	(TARGET_STACK_TOP_RTX): New.
	* doc/extend.texi: Document __builtin_stack_top.
	* doc/tm.texi.in (TARGET_STACK_TOP_RTX): New.
	* doc/tm.texi: Regenerated.

gcc/testsuite/

	PR target/66960
	* gcc.target/i386/pr66960-1.c: New test.
	* gcc.target/i386/pr66960-2.c: Likewise.
	* gcc.target/i386/pr66960-3.c: Likewise.
	* gcc.target/i386/pr66960-4.c: Likewise.
	* gcc.target/i386/pr66960-5.c: Likewise.
---
 gcc/builtin-types.def                     |  1 +
 gcc/builtins.c                            | 11 +++++++++++
 gcc/builtins.def                          |  1 +
 gcc/config/i386/i386.c                    |  8 ++++++++
 gcc/doc/extend.texi                       |  7 +++++++
 gcc/doc/tm.texi                           |  5 +++++
 gcc/doc/tm.texi.in                        |  2 ++
 gcc/function.h                            |  3 +++
 gcc/ipa-pure-const.c                      |  1 +
 gcc/target.def                            |  7 +++++++
 gcc/targhooks.c                           |  9 +++++++++
 gcc/targhooks.h                           |  3 +++
 gcc/testsuite/gcc.target/i386/pr66960-1.c | 33 +++++++++++++++++++++++++++++++
 gcc/testsuite/gcc.target/i386/pr66960-2.c | 33 +++++++++++++++++++++++++++++++
 gcc/testsuite/gcc.target/i386/pr66960-3.c | 17 ++++++++++++++++
 gcc/testsuite/gcc.target/i386/pr66960-4.c | 21 ++++++++++++++++++++
 gcc/testsuite/gcc.target/i386/pr66960-5.c | 21 ++++++++++++++++++++
 17 files changed, 183 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr66960-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr66960-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr66960-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr66960-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr66960-5.c

diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def
index 0e34531..2b6b5ab 100644
--- a/gcc/builtin-types.def
+++ b/gcc/builtin-types.def
@@ -177,6 +177,7 @@ DEF_FUNCTION_TYPE_1 (BT_FN_COMPLEX_LONGDOUBLE_LONGDOUBLE,
 		     BT_COMPLEX_LONGDOUBLE, BT_LONGDOUBLE)
 DEF_FUNCTION_TYPE_1 (BT_FN_PTR_UINT, BT_PTR, BT_UINT)
 DEF_FUNCTION_TYPE_1 (BT_FN_PTR_SIZE, BT_PTR, BT_SIZE)
+DEF_FUNCTION_TYPE_1 (BT_FN_PTR_VOID, BT_PTR, BT_VOID)
 DEF_FUNCTION_TYPE_1 (BT_FN_INT_INT, BT_INT, BT_INT)
 DEF_FUNCTION_TYPE_1 (BT_FN_INT_UINT, BT_INT, BT_UINT)
 DEF_FUNCTION_TYPE_1 (BT_FN_INT_LONG, BT_INT, BT_LONG)
diff --git a/gcc/builtins.c b/gcc/builtins.c
index eb7b7b2..b257baf 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -6210,6 +6210,16 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
     case BUILT_IN_CONSTANT_P:
       return const0_rtx;
 
+    case BUILT_IN_STACK_TOP:
+      if (targetm.calls.stack_top_rtx)
+	{
+	  cfun->stack_top_taken = true;
+	  return targetm.calls.stack_top_rtx ();
+	}
+      else
+	sorry ("%<__builtin_stack_top%> not supported on this target");
+      break;
+
     case BUILT_IN_FRAME_ADDRESS:
     case BUILT_IN_RETURN_ADDRESS:
       return expand_builtin_frame_address (fndecl, exp);
@@ -12399,6 +12409,7 @@ is_simple_builtin (tree decl)
       case BUILT_IN_RETURN:
       case BUILT_IN_AGGREGATE_INCOMING_ADDRESS:
       case BUILT_IN_FRAME_ADDRESS:
+      case BUILT_IN_STACK_TOP:
       case BUILT_IN_VA_END:
       case BUILT_IN_STACK_SAVE:
       case BUILT_IN_STACK_RESTORE:
diff --git a/gcc/builtins.def b/gcc/builtins.def
index 80e4a9c..62f0523 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -778,6 +778,7 @@ DEF_EXT_LIB_BUILTIN    (BUILT_IN_FFSL, "ffsl", BT_FN_INT_LONG, ATTR_CONST_NOTHRO
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_FFSLL, "ffsll", BT_FN_INT_LONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_EXT_LIB_BUILTIN        (BUILT_IN_FORK, "fork", BT_FN_PID, ATTR_NOTHROW_LIST)
 DEF_GCC_BUILTIN        (BUILT_IN_FRAME_ADDRESS, "frame_address", BT_FN_PTR_UINT, ATTR_NULL)
+DEF_GCC_BUILTIN        (BUILT_IN_STACK_TOP, "stack_top", BT_FN_PTR_VOID, ATTR_NULL)
 /* [trans-mem]: Adjust BUILT_IN_TM_FREE if BUILT_IN_FREE is changed.  */
 DEF_LIB_BUILTIN        (BUILT_IN_FREE, "free", BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN        (BUILT_IN_FROB_RETURN_ADDR, "frob_return_addr", BT_FN_PTR_PTR, ATTR_NULL)
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 1b0cade..0c77363 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -11575,6 +11575,12 @@ ix86_expand_prologue (void)
     {
       int align_bytes = crtl->stack_alignment_needed / BITS_PER_UNIT;
 
+      /* Can't use DRAP if the stack address has been taken.  */
+      if (cfun->stack_top_taken)
+	sorry ("%<__builtin_stack_top%> not supported with stack"
+	       " realignment.  This may be worked around by adding"
+	       " -maccumulate-outgoing-args.");
+
       /* Only need to push parameter pointer reg if it is caller saved.  */
       if (!call_used_regs[REGNO (crtl->drap_reg)])
 	{
@@ -52513,6 +52519,8 @@ ix86_operands_ok_for_move_multiple (rtx *operands, bool load,
 #define TARGET_UPDATE_STACK_BOUNDARY ix86_update_stack_boundary
 #undef TARGET_GET_DRAP_RTX
 #define TARGET_GET_DRAP_RTX ix86_get_drap_rtx
+#undef TARGET_STACK_TOP_RTX
+#define TARGET_STACK_TOP_RTX default_stack_top_rtx
 #undef TARGET_STRICT_ARGUMENT_NAMING
 #define TARGET_STRICT_ARGUMENT_NAMING hook_bool_CUMULATIVE_ARGS_true
 #undef TARGET_STATIC_CHAIN
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 2a47943..834b7f4 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -8841,6 +8841,13 @@ option is in effect.  Such calls should only be made in debugging
 situations.
 @end deftypefn
 
+@deftypefn {Built-in Function} {void *} __builtin_stack_top (void)
+This function is similar to calling @code{__builtin_frame_address}
+with a value of @code{0}, but it returns the stack address when the
+function is called.  Unlike @code{__builtin_frame_address}, the frame
+pointer register isn't required.
+@end deftypefn
+
 @node Vector Extensions
 @section Using Vector Instructions through Built-in Functions
 
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index f95646c..e2cd480 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -11483,6 +11483,11 @@ argument list due to stack realignment.  Return @code{NULL} if no DRAP
 is needed.
 @end deftypefn
 
+@deftypefn {Target Hook} rtx TARGET_STACK_TOP_RTX (void)
+This hook should return an rtx for the stack address when the function
+is called.
+@end deftypefn
+
 @deftypefn {Target Hook} bool TARGET_ALLOCATE_STACK_SLOTS_FOR_ARGS (void)
 When optimization is disabled, this hook indicates whether or not
 arguments should be allocated to stack slots.  Normally, GCC allocates
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 2383fb9..9167069 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -8181,6 +8181,8 @@ and the associated definitions of those functions.
 
 @hook TARGET_GET_DRAP_RTX
 
+@hook TARGET_STACK_TOP_RTX
+
 @hook TARGET_ALLOCATE_STACK_SLOTS_FOR_ARGS
 
 @hook TARGET_CONST_ANCHOR
diff --git a/gcc/function.h b/gcc/function.h
index e92c17c..dd1c38a 100644
--- a/gcc/function.h
+++ b/gcc/function.h
@@ -378,6 +378,9 @@ struct GTY(()) function {
 
   /* Set when the tail call has been identified.  */
   unsigned int tail_call_marked : 1;
+
+  /* Set when the address of the stack top has been taken.  */
+  unsigned int stack_top_taken : 1;
 };
 
 /* Add the decl D to the local_decls list of FUN.  */
diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
index 8fd8c36..2405082 100644
--- a/gcc/ipa-pure-const.c
+++ b/gcc/ipa-pure-const.c
@@ -480,6 +480,7 @@ special_builtin_state (enum pure_const_state_e *state, bool *looping,
 	case BUILT_IN_CXA_END_CLEANUP:
 	case BUILT_IN_EH_COPY_VALUES:
 	case BUILT_IN_FRAME_ADDRESS:
+	case BUILT_IN_STACK_TOP:
 	case BUILT_IN_APPLY:
 	case BUILT_IN_APPLY_ARGS:
 	  *looping = false;
diff --git a/gcc/target.def b/gcc/target.def
index 4edc209..7a30f39 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -4525,6 +4525,13 @@ argument list due to stack realignment.  Return @code{NULL} if no DRAP\n\
 is needed.",
  rtx, (void), NULL)
 
+/* Get the stack address when the function is called.  */
+DEFHOOK
+(stack_top_rtx,
+ "This hook should return an rtx for the stack address when the function\n\
+is called.",
+ rtx, (void), NULL)
+
 /* Return true if all function parameters should be spilled to the
    stack.  */
 DEFHOOK
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index 3eca47e..f188272 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -1926,4 +1926,13 @@ can_use_doloop_if_innermost (const widest_int &, const widest_int &,
   return loop_depth == 1;
 }
 
+/* Get the stack address when the function is called.  After the
+   prologue, stack top is at -WORD(AP) in the current frame.  */
+
+rtx
+default_stack_top_rtx (void)
+{
+  return plus_constant (Pmode, arg_pointer_rtx, -UNITS_PER_WORD);
+}
+
 #include "gt-targhooks.h"
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index 5ae991d..094a589 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -240,4 +240,7 @@ extern void default_setup_incoming_vararg_bounds (cumulative_args_t ca ATTRIBUTE
 						  tree type ATTRIBUTE_UNUSED,
 						  int *pretend_arg_size ATTRIBUTE_UNUSED,
 						  int second_time ATTRIBUTE_UNUSED);
+
+extern rtx default_stack_top_rtx (void);
+
 #endif /* GCC_TARGHOOKS_H */
diff --git a/gcc/testsuite/gcc.target/i386/pr66960-1.c b/gcc/testsuite/gcc.target/i386/pr66960-1.c
new file mode 100644
index 0000000..477d90d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr66960-1.c
@@ -0,0 +1,33 @@
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-options "-O2 -fomit-frame-pointer" { target { lp64 } } } */
+/* { dg-options "-O2 -fomit-frame-pointer -maddress-mode=short" { target { x32 } } } */
+/* { dg-options "-O2 -fomit-frame-pointer -miamcu -fno-pic" { target { ia32 } } } */
+
+extern char **environ;
+extern void exit (int status);
+extern int main (long argc, char **argv, char **envp);
+
+void
+_start (void)
+{
+  void *argc_p = __builtin_stack_top ();
+  char **argv = (char **) (argc_p + sizeof (void *));
+  long argc = *(long *) argc_p;
+  int status;
+
+  environ = argv + argc + 1;
+
+  status = main (argc, argv, environ);
+
+  exit (status);
+}
+
+/* { dg-final { scan-assembler "movq\[ \t\]8\\(%rsp\\), %rdi" { target lp64 } } } */
+/* { dg-final { scan-assembler "leaq\[ \t\]16\\(%rsp\\), %rsi" { target lp64 } } } */
+/* { dg-final { scan-assembler "leaq\[ \t\]24\\(%rsp,%rdi,8\\), %rdx" { target lp64 } } } */
+/* { dg-final { scan-assembler "movl\[ \t\]8\\(%esp\\), %edi" { target x32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]12\\(%rsp\\), %esi" { target x32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]4\\(%rsi,%rdi,4\\), %edx" { target x32 } } } */
+/* { dg-final { scan-assembler "movl\[ \t\]\\(%esp\\), %eax" { target ia32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]4\\(%esp\\), %edx" { target ia32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]8\\(%esp,%eax,4\\), %ecx" { target ia32 } } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr66960-2.c b/gcc/testsuite/gcc.target/i386/pr66960-2.c
new file mode 100644
index 0000000..b9dbde2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr66960-2.c
@@ -0,0 +1,33 @@
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-options "-O2 -fno-omit-frame-pointer" { target { lp64 } } } */
+/* { dg-options "-O2 -fno-omit-frame-pointer -maddress-mode=short" { target { x32 } } } */
+/* { dg-options "-O2 -fno-omit-frame-pointer -miamcu" { target { ia32 } } } */
+
+extern char **environ;
+extern void exit (int status);
+extern int main (long argc, char **argv, char **envp);
+
+void
+_start (void)
+{
+  void *argc_p = __builtin_stack_top ();
+  char **argv = (char **) (argc_p + sizeof (void *));
+  long argc = *(long *) argc_p;
+  int status;
+
+  environ = argv + argc + 1;
+
+  status = main (argc, argv, environ);
+
+  exit (status);
+}
+
+/* { dg-final { scan-assembler "movq\[ \t\]8\\(%rbp\\), %rdi" { target lp64 } } } */
+/* { dg-final { scan-assembler "leaq\[ \t\]16\\(%rbp\\), %rsi" { target lp64 } } } */
+/* { dg-final { scan-assembler "leaq\[ \t\]24\\(%rbp,%rdi,8\\), %rdx" { target lp64 } } } */
+/* { dg-final { scan-assembler "movl\[ \t\]8\\(%ebp\\), %edi" { target x32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]12\\(%rbp\\), %esi" { target x32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]4\\(%rsi,%rdi,4\\), %edx" { target x32 } } } */
+/* { dg-final { scan-assembler "movl\[ \t\]4\\(%ebp\\), %eax" { target ia32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]8\\(%ebp\\), %edx" { target ia32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]12\\(%ebp,%eax,4\\), %ecx" { target ia32 } } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr66960-3.c b/gcc/testsuite/gcc.target/i386/pr66960-3.c
new file mode 100644
index 0000000..48cf25e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr66960-3.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-options "-O2 -mno-accumulate-outgoing-args" { target { lp64 } } } */
+/* { dg-options "-O2 -mno-accumulate-outgoing-args -maddress-mode=short" { target { x32 } } } */
+/* { dg-options "-O2 -mno-accumulate-outgoing-args -miamcu" { target { ia32 } } } */
+
+extern void abort (void);
+extern int check_int (int *i, int align);
+typedef int aligned __attribute__((aligned(64)));
+
+void *
+foo (void)
+{
+  aligned j;
+  if (check_int (&j, __alignof__(j)) != j)
+    abort ();
+  return __builtin_stack_top ();
+} /* { dg-message "sorry, unimplemented: .__builtin_stack_top. not supported" } */
diff --git a/gcc/testsuite/gcc.target/i386/pr66960-4.c b/gcc/testsuite/gcc.target/i386/pr66960-4.c
new file mode 100644
index 0000000..44c0b26
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr66960-4.c
@@ -0,0 +1,21 @@
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-options "-O2 -maccumulate-outgoing-args" { target { lp64 } } } */
+/* { dg-options "-O2 -maccumulate-outgoing-args -maddress-mode=short" { target { x32 } } } */
+/* { dg-options "-O2 -maccumulate-outgoing-args -miamcu" { target { ia32 } } } */
+
+extern void abort (void);
+extern int check_int (int *i, int align);
+typedef int aligned __attribute__((aligned(64)));
+
+void *
+foo (void)
+{
+  aligned j;
+  if (check_int (&j, __alignof__(j)) != j)
+    abort ();
+  return __builtin_stack_top ();
+}
+
+/* { dg-final { scan-assembler "leaq\[ \t\]8\\(%rbp\\), %rax" { target lp64 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]8\\(%rbp\\), %eax" { target x32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]4\\(%ebp\\), %eax" { target ia32 } } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr66960-5.c b/gcc/testsuite/gcc.target/i386/pr66960-5.c
new file mode 100644
index 0000000..d449437
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr66960-5.c
@@ -0,0 +1,21 @@
+/* { dg-do link } */
+/* { dg-options "-O" } */
+
+extern void link_error (void);
+
+__attribute__ ((noinline, noclone))
+void
+foo (void)
+{
+  void **p = __builtin_stack_top ();
+  void *ra = __builtin_return_address (0);
+  if (*p != ra)
+    link_error ();
+}
+
+int
+main (void)
+{
+  foo ();
+  return 0;
+}
-- 
2.4.3



* Re: [PATCH] Add __builtin_stack_top
  2015-08-04 12:31 [PATCH] Add __builtin_stack_top H.J. Lu
@ 2015-08-04 15:42 ` Mike Stump
  2015-08-04 15:44   ` H.J. Lu
  0 siblings, 1 reply; 24+ messages in thread
From: Mike Stump @ 2015-08-04 15:42 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Segher Boessenkool, GCC Patches, Uros Bizjak

On Aug 4, 2015, at 5:30 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> Where does this feature belong?

I prefer the middle end.


* Re: [PATCH] Add __builtin_stack_top
  2015-08-04 15:42 ` Mike Stump
@ 2015-08-04 15:44   ` H.J. Lu
  2015-08-04 17:18     ` Mike Stump
  0 siblings, 1 reply; 24+ messages in thread
From: H.J. Lu @ 2015-08-04 15:44 UTC (permalink / raw)
  To: Mike Stump; +Cc: Segher Boessenkool, GCC Patches, Uros Bizjak

On Tue, Aug 4, 2015 at 8:40 AM, Mike Stump <mikestump@comcast.net> wrote:
> On Aug 4, 2015, at 5:30 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> Where does this feature belong?
>
> I prefer the middle end.

Any comments on my middle-end patch?

Thanks.

-- 
H.J.


* Re: [PATCH] Add __builtin_stack_top
  2015-08-04 15:44   ` H.J. Lu
@ 2015-08-04 17:18     ` Mike Stump
  2015-08-04 17:28       ` H.J. Lu
  0 siblings, 1 reply; 24+ messages in thread
From: Mike Stump @ 2015-08-04 17:18 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Segher Boessenkool, GCC Patches, Uros Bizjak

On Aug 4, 2015, at 8:44 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Tue, Aug 4, 2015 at 8:40 AM, Mike Stump <mikestump@comcast.net> wrote:
>> On Aug 4, 2015, at 5:30 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>> Where does this feature belong?
>> 
>> I prefer the middle end.
> 
> Any comments on my middle-end patch?

So, if the answer is the same as frame_address (0), why not have the fallback just expand to that?  Then, one can use this builtin everywhere that frame address is used today.  People that want a faster, tighter port can then implement the hook and achieve higher performance.
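
As a toy model of that fallback, in plain C rather than real GCC internals (every name below is made up for illustration, and strings stand in for rtx values):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a target hook; a string plays the role
   of the rtx a real hook would return.  */
typedef const char *(*stack_top_hook) (void);

/* Hypothetical generic expansion, modelling __builtin_frame_address (0).  */
static const char *
generic_frame_address_0 (void)
{
  return "frame_address(0)";
}

/* Hypothetical port-specific hook that avoids the frame pointer.  */
static const char *
fast_stack_top (void)
{
  return "arg_pointer - word";
}

/* Use the hook when the port provides one, otherwise fall back to
   the generic expansion instead of reporting an error.  */
static const char *
expand_stack_top (stack_top_hook hook)
{
  return hook ? hook () : generic_frame_address_0 ();
}

/* Small self-check covering both paths.  */
static void
check_expand_stack_top (void)
{
  assert (strcmp (expand_stack_top (NULL), "frame_address(0)") == 0);
  assert (strcmp (expand_stack_top (fast_stack_top),
                  "arg_pointer - word") == 0);
}
```

Passing a null hook models a port that has not implemented it; such a port would get the frame_address (0) expansion rather than a "sorry".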


* Re: [PATCH] Add __builtin_stack_top
  2015-08-04 17:18     ` Mike Stump
@ 2015-08-04 17:28       ` H.J. Lu
  2015-08-04 17:43         ` Segher Boessenkool
  0 siblings, 1 reply; 24+ messages in thread
From: H.J. Lu @ 2015-08-04 17:28 UTC (permalink / raw)
  To: Mike Stump; +Cc: Segher Boessenkool, GCC Patches, Uros Bizjak

On Tue, Aug 4, 2015 at 10:16 AM, Mike Stump <mikestump@comcast.net> wrote:
> On Aug 4, 2015, at 8:44 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Tue, Aug 4, 2015 at 8:40 AM, Mike Stump <mikestump@comcast.net> wrote:
>>> On Aug 4, 2015, at 5:30 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>> Where does this feature belong?
>>>
>>> I prefer the middle end.
>>
>> Any comments on my middle-end patch?
>
> So, if the answer is the same as frame_address (0), why not have the fallback just expand to that?  Then, one can use this builtin everywhere that frame address is used today.  People that want a faster, tighter port can then implement the hook and achieve higher performance.

The motivation of __builtin_stack_top is that frame_address requires a
frame pointer register, which isn't desirable for x86.  __builtin_stack_top
doesn't require a frame pointer register.
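
To make the relationship concrete with builtins that exist today: on x86 with a frame pointer, the value __builtin_stack_top is meant to return is the address of the return-address slot, one word above __builtin_frame_address (0).  A small sketch, assuming GCC or a compatible compiler on non-Windows i386 or LP64 x86-64 (the function names are made up; other targets just report -1):

```c
#include <assert.h>

/* Returns 1 if the word just above the frame address holds the
   return address, 0 if it does not, and -1 on targets where this
   sketch makes no assumption about the frame layout.  */
__attribute__ ((noinline))
int
return_slot_follows_frame (void)
{
#if (defined (__i386__) || (defined (__x86_64__) && !defined (__ILP32__))) \
    && !defined (_WIN32)
  /* Using __builtin_frame_address forces a frame pointer here.  */
  void **frame = (void **) __builtin_frame_address (0);
  void *ra = __builtin_return_address (0);
  return frame[1] == ra;
#else
  return -1;
#endif
}

/* Self-check: the layout assumption must not be contradicted.  */
void
check_return_slot (void)
{
  assert (return_slot_follows_frame () != 0);
}
```

This is the same check pr66960-5.c does with the new builtin, except that here the compiler has to keep the frame pointer to answer it.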

-- 
H.J.


* Re: [PATCH] Add __builtin_stack_top
  2015-08-04 17:28       ` H.J. Lu
@ 2015-08-04 17:43         ` Segher Boessenkool
  2015-08-04 18:50           ` H.J. Lu
  0 siblings, 1 reply; 24+ messages in thread
From: Segher Boessenkool @ 2015-08-04 17:43 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Mike Stump, GCC Patches, Uros Bizjak

On Tue, Aug 04, 2015 at 10:28:00AM -0700, H.J. Lu wrote:
> >> Any comments on my middle-end patch?
> >
> > So, if the answer is the same as frame_address (0), why not have the fallback just expand to that?  Then, one can use this builtin everywhere that frame address is used today.  People that want a faster, tighter port can then implement the hook and achieve higher performance.
> 
> The motivation of __builtin_stack_top is that frame_address requires a
> frame pointer register, which isn't desirable for x86.  __builtin_stack_top
> doesn't require a frame pointer register.

If the target just returns frame_pointer_rtx from INITIAL_FRAME_ADDRESS_RTX,
you don't get crtl->accesses_prior_frames set either, and as far as I can
see everything works fine?  For __builtin_frame_address(0).

You might have a reason why you want the entry stack address instead of the
frame address, but you didn't really explain I think?  Or I missed it.


Segher


* Re: [PATCH] Add __builtin_stack_top
  2015-08-04 17:43         ` Segher Boessenkool
@ 2015-08-04 18:50           ` H.J. Lu
  2015-08-04 18:51             ` H.J. Lu
  2015-08-04 19:29             ` Segher Boessenkool
  0 siblings, 2 replies; 24+ messages in thread
From: H.J. Lu @ 2015-08-04 18:50 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Mike Stump, GCC Patches, Uros Bizjak

On Tue, Aug 4, 2015 at 10:43 AM, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
> On Tue, Aug 04, 2015 at 10:28:00AM -0700, H.J. Lu wrote:
>> >> Any comments on my middle-end patch?
>> >
>> > So, if the answer is the same as frame_address (0), why not have the fallback just expand to that?  Then, one can use this builtin everywhere that frame address is used today.  People that want a faster, tighter port can then implement the hook and achieve higher performance.
>>
>> The motivation of __builtin_stack_top is that frame_address requires a
>> frame pointer register, which isn't desirable for x86.  __builtin_stack_top
>> doesn't require a frame pointer register.
>
> If the target just returns frame_pointer_rtx from INITIAL_FRAME_ADDRESS_RTX,
> you don't get crtl->accesses_prior_frames set either, and as far as I can
> see everything works fine?  For __builtin_frame_address(0).
>
> You might have a reason why you want the entry stack address instead of the
> frame address, but you didn't really explain I think?  Or I missed it.
>

expand_builtin_return_addr sets

crtl->accesses_prior_frames = 1;

for __builtin_frame_address, which requires a frame pointer register.
__builtin_stack_top doesn't set crtl->accesses_prior_frames and frame
pointer register isn't required.

-- 
H.J.


* Re: [PATCH] Add __builtin_stack_top
  2015-08-04 18:50           ` H.J. Lu
@ 2015-08-04 18:51             ` H.J. Lu
  2015-08-04 19:29             ` Segher Boessenkool
  1 sibling, 0 replies; 24+ messages in thread
From: H.J. Lu @ 2015-08-04 18:51 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Mike Stump, GCC Patches, Uros Bizjak

On Tue, Aug 4, 2015 at 11:50 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Tue, Aug 4, 2015 at 10:43 AM, Segher Boessenkool
> <segher@kernel.crashing.org> wrote:
>> On Tue, Aug 04, 2015 at 10:28:00AM -0700, H.J. Lu wrote:
>>> >> Any comments on my middle-end patch?
>>> >
>>> > So, if the answer is the same as frame_address (0), why not have the fallback just expand to that?  Then, one can use this builtin everywhere that frame address is used today.  People that want a faster, tighter port can then implement the hook and achieve higher performance.
>>>
>>> The motivation of __builtin_stack_top is that frame_address requires a
>>> frame pointer register, which isn't desirable for x86.  __builtin_stack_top
>>> doesn't require a frame pointer register.
>>
>> If the target just returns frame_pointer_rtx from INITIAL_FRAME_ADDRESS_RTX,
>> you don't get crtl->accesses_prior_frames set either, and as far as I can
>> see everything works fine?  For __builtin_frame_address(0).
>>
>> You might have a reason why you want the entry stack address instead of the
>> frame address, but you didn't really explain I think?  Or I missed it.
>>
>
> expand_builtin_return_addr sets
>
> crtl->accesses_prior_frames = 1;
>
> for __builtin_frame_address, which requires a frame pointer register.
> __builtin_stack_top doesn't set crtl->accesses_prior_frames and frame
> pointer register isn't required.
>

BTW, x86 doesn't define INITIAL_FRAME_ADDRESS_RTX.

-- 
H.J.


* Re: [PATCH] Add __builtin_stack_top
  2015-08-04 18:50           ` H.J. Lu
  2015-08-04 18:51             ` H.J. Lu
@ 2015-08-04 19:29             ` Segher Boessenkool
  2015-08-04 20:00               ` H.J. Lu
  1 sibling, 1 reply; 24+ messages in thread
From: Segher Boessenkool @ 2015-08-04 19:29 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Mike Stump, GCC Patches, Uros Bizjak

On Tue, Aug 04, 2015 at 11:50:00AM -0700, H.J. Lu wrote:
> >> The motivation of __builtin_stack_top is that frame_address requires a
> >> frame pointer register, which isn't desirable for x86.  __builtin_stack_top
> >> doesn't require a frame pointer register.
> >
> > If the target just returns frame_pointer_rtx from INITIAL_FRAME_ADDRESS_RTX,
> > you don't get crtl->accesses_prior_frames set either, and as far as I can
> > see everything works fine?  For __builtin_frame_address(0).
> >
> > You might have a reason why you want the entry stack address instead of the
> > frame address, but you didn't really explain I think?  Or I missed it.
> >
> 
> expand_builtin_return_addr sets
> 
> crtl->accesses_prior_frames = 1;
> 
> for __builtin_frame_address, which requires a frame pointer register.
> __builtin_stack_top doesn't set crtl->accesses_prior_frames and frame
> pointer register isn't required.

Not if you have INITIAL_FRAME_ADDRESS_RTX.  I don't see why the generic code
cannot just use frame_pointer_rtx (instead of hard_frame_pointer_rtx) for
a count of 0; but making it target-specific is certainly more conservative.

You say i386 doesn't have that target macro defined currently.  Yes I know;
so change that?  Or change the generic code, but that is much more testing.


Segher


* Re: [PATCH] Add __builtin_stack_top
  2015-08-04 19:29             ` Segher Boessenkool
@ 2015-08-04 20:00               ` H.J. Lu
  2015-08-04 20:45                 ` Segher Boessenkool
  0 siblings, 1 reply; 24+ messages in thread
From: H.J. Lu @ 2015-08-04 20:00 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Mike Stump, GCC Patches, Uros Bizjak

On Tue, Aug 4, 2015 at 12:29 PM, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
> On Tue, Aug 04, 2015 at 11:50:00AM -0700, H.J. Lu wrote:
>> >> The motivation of __builtin_stack_top is that frame_address requires a
>> >> frame pointer register, which isn't desirable for x86.  __builtin_stack_top
>> >> doesn't require a frame pointer register.
>> >
>> > If the target just returns frame_pointer_rtx from INITIAL_FRAME_ADDRESS_RTX,
>> > you don't get crtl->accesses_prior_frames set either, and as far as I can
>> > see everything works fine?  For __builtin_frame_address(0).
>> >
>> > You might have a reason why you want the entry stack address instead of the
>> > frame address, but you didn't really explain I think?  Or I missed it.
>> >
>>
>> expand_builtin_return_addr sets
>>
>> crtl->accesses_prior_frames = 1;
>>
>> for __builtin_frame_address, which requires a frame pointer register.
>> __builtin_stack_top doesn't set crtl->accesses_prior_frames and frame
>> pointer register isn't required.
>
> Not if you have INITIAL_FRAME_ADDRESS_RTX.  I don't see why the generic code
> cannot just use frame_pointer_rtx (instead of hard_frame_pointer_rtx) for
> a count of 0; but making it target-specific is certainly more conservative.
>
> You say i386 doesn't have that target macro defined currently.  Yes I know;
> so change that?  Or change the generic code, but that is much more testing.

There is another issue with x86, maybe other targets.  You
can't get the real stack top when stack is realigned and
-maccumulate-outgoing-args isn't used since ix86_expand_prologue
will create and return another stack frame for
__builtin_frame_address and __builtin_return_address.
It will be wrong for __builtin_stack_top, which should
return the real stack address.

-- 
H.J.

* Re: [PATCH] Add __builtin_stack_top
  2015-08-04 20:00               ` H.J. Lu
@ 2015-08-04 20:45                 ` Segher Boessenkool
  2015-08-04 20:50                   ` H.J. Lu
  0 siblings, 1 reply; 24+ messages in thread
From: Segher Boessenkool @ 2015-08-04 20:45 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Mike Stump, GCC Patches, Uros Bizjak

On Tue, Aug 04, 2015 at 01:00:32PM -0700, H.J. Lu wrote:
> There is another issue with x86, maybe other targets.  You
> can't get the real stack top when stack is realigned and
> -maccumulate-outgoing-args isn't used since ix86_expand_prologue
> will create and return another stack frame for
> __builtin_frame_address and __builtin_return_address.
> It will be wrong for __builtin_stack_top, which should
> return the real stack address.

That's why I asked:

> >> > You might have a reason why you want the entry stack address instead of the
> >> > frame address, but you didn't really explain I think?  Or I missed it.

What would a C program do with this, that it cannot do with the frame
address, that would be useful and cannot be much better done in straight
assembler?  Do you actually want to expose the argument pointer, maybe?


Segher

* Re: [PATCH] Add __builtin_stack_top
  2015-08-04 20:45                 ` Segher Boessenkool
@ 2015-08-04 20:50                   ` H.J. Lu
  2015-08-19 12:29                     ` H.J. Lu
  0 siblings, 1 reply; 24+ messages in thread
From: H.J. Lu @ 2015-08-04 20:50 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Mike Stump, GCC Patches, Uros Bizjak

On Tue, Aug 4, 2015 at 1:45 PM, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
> On Tue, Aug 04, 2015 at 01:00:32PM -0700, H.J. Lu wrote:
>> There is another issue with x86, maybe other targets.  You
>> can't get the real stack top when stack is realigned and
>> -maccumulate-outgoing-args isn't used since ix86_expand_prologue
>> will create and return another stack frame for
>> __builtin_frame_address and __builtin_return_address.
>> It will be wrong for __builtin_stack_top, which should
>> return the real stack address.
>
> That's why I asked:
>
>> >> > You might have a reason why you want the entry stack address instead of the
>> >> > frame address, but you didn't really explain I think?  Or I missed it.
>
> What would a C program do with this, that it cannot do with the frame
> address, that would be useful and cannot be much better done in straight
> assembler?  Do you actually want to expose the argument pointer, maybe?
>

Yes, we want to use the argument pointer as shown in testcases
included in my patch.
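
For reference, the pointer arithmetic in the _start testcases can be
sketched in plain C without the proposed builtin.  The helper below is
hypothetical (the names entry_args and decode_entry_stack are not in the
patch) and assumes the usual Linux process-entry layout: argc, then
argv[0..argc-1], a NULL, then the environment pointers.  In the testcase
the starting pointer would be __builtin_argument_pointer () minus one
word; here it is simply a parameter.

```c
#include <string.h>

/* Hypothetical result type; not part of the patch.  */
struct entry_args
{
  long argc;
  char **argv;
  char **envp;
};

/* ARGC_P is assumed to point at the argc slot of the entry stack.
   The testcase reads it as *(long *) argc_p; memcpy is used here only
   to keep the sketch independent of the type of the backing storage.  */
static struct entry_args
decode_entry_stack (void *argc_p)
{
  struct entry_args r;

  memcpy (&r.argc, argc_p, sizeof r.argc);
  /* argv starts one pointer-sized slot above argc.  */
  r.argv = (char **) ((char *) argc_p + sizeof (void *));
  /* The environment follows argv and its terminating NULL.  */
  r.envp = r.argv + r.argc + 1;
  return r;
}
```

This mirrors the three computations the scan-assembler patterns check
for: one load for argc, one address computation for argv, and one
scaled address computation for the environment.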


-- 
H.J.

* Re: [PATCH] Add __builtin_stack_top
  2015-08-04 20:50                   ` H.J. Lu
@ 2015-08-19 12:29                     ` H.J. Lu
  2015-08-19 12:57                       ` Segher Boessenkool
  0 siblings, 1 reply; 24+ messages in thread
From: H.J. Lu @ 2015-08-19 12:29 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Mike Stump, GCC Patches, Uros Bizjak

On Tue, Aug 4, 2015 at 1:50 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Tue, Aug 4, 2015 at 1:45 PM, Segher Boessenkool
> <segher@kernel.crashing.org> wrote:
>> On Tue, Aug 04, 2015 at 01:00:32PM -0700, H.J. Lu wrote:
>>> There is another issue with x86, maybe other targets.  You
>>> can't get the real stack top when stack is realigned and
>>> -maccumulate-outgoing-args isn't used since ix86_expand_prologue
>>> will create and return another stack frame for
>>> __builtin_frame_address and __builtin_return_address.
>>> It will be wrong for __builtin_stack_top, which should
>>> return the real stack address.
>>
>> That's why I asked:
>>
>>> >> > You might have a reason why you want the entry stack address instead of the
>>> >> > frame address, but you didn't really explain I think?  Or I missed it.
>>
>> What would a C program do with this, that it cannot do with the frame
>> address, that would be useful and cannot be much better done in straight
>> assembler?  Do you actually want to expose the argument pointer, maybe?
>>
>
> Yes, we want to use the argument pointer as shown in testcases
> included in my patch.
>

Where do we stand on this?  We need the hard stack address at
function entry for x86 without using frame pointer.   I added
__builtin_stack_top since __builtin_frame_address can't give
us what we want.  Should __builtin_stack_top be added to
middle-end or x86 backend?

Thanks.

-- 
H.J.

* Re: [PATCH] Add __builtin_stack_top
  2015-08-19 12:29                     ` H.J. Lu
@ 2015-08-19 12:57                       ` Segher Boessenkool
  2015-08-19 13:03                         ` H.J. Lu
  0 siblings, 1 reply; 24+ messages in thread
From: Segher Boessenkool @ 2015-08-19 12:57 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Mike Stump, GCC Patches, Uros Bizjak

On Wed, Aug 19, 2015 at 05:23:41AM -0700, H.J. Lu wrote:
> >>> >> > You might have a reason why you want the entry stack address instead of the
> >>> >> > frame address, but you didn't really explain I think?  Or I missed it.
> >>
> >> What would a C program do with this, that it cannot do with the frame
> >> address, that would be useful and cannot be much better done in straight
> >> assembler?  Do you actually want to expose the argument pointer, maybe?
> >
> > Yes, we want to use the argument pointer as shown in testcases
> > included in my patch.
> 
> Where do we stand on this?  We need the hard stack address at
> function entry for x86 without using frame pointer.   I added
> __builtin_stack_top since __builtin_frame_address can't give
> us what we want.  Should __builtin_stack_top be added to
> middle-end or x86 backend?

Sorry for not following up; I thought my suggestion was obvious.

Can you do a __builtin_argument_pointer instead?  That should work
for all targets, afaics?


Segher

* Re: [PATCH] Add __builtin_stack_top
  2015-08-19 12:57                       ` Segher Boessenkool
@ 2015-08-19 13:03                         ` H.J. Lu
  2015-08-19 15:31                           ` H.J. Lu
  0 siblings, 1 reply; 24+ messages in thread
From: H.J. Lu @ 2015-08-19 13:03 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Mike Stump, GCC Patches, Uros Bizjak

On Wed, Aug 19, 2015 at 5:51 AM, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
> On Wed, Aug 19, 2015 at 05:23:41AM -0700, H.J. Lu wrote:
>> >>> >> > You might have a reason why you want the entry stack address instead of the
>> >>> >> > frame address, but you didn't really explain I think?  Or I missed it.
>> >>
>> >> What would a C program do with this, that it cannot do with the frame
>> >> address, that would be useful and cannot be much better done in straight
>> >> assembler?  Do you actually want to expose the argument pointer, maybe?
>> >
>> > Yes, we want to use the argument pointer as shown in testcases
>> > included in my patch.
>>
>> Where do we stand on this?  We need the hard stack address at
>> function entry for x86 without using frame pointer.   I added
>> __builtin_stack_top since __builtin_frame_address can't give
>> us what we want.  Should __builtin_stack_top be added to
>> middle-end or x86 backend?
>
> Sorry for not following up; I thought my suggestion was obvious.
>
> Can you do a __builtin_argument_pointer instead?  That should work
> for all targets, afaics?

To me, stack top is easier to understand and argument pointer isn't
very clear.  Does argument pointer exist when there is no argument?

But I can live with it.  I will update my patch.

Thanks.

-- 
H.J.

* Re: [PATCH] Add __builtin_stack_top
  2015-08-19 13:03                         ` H.J. Lu
@ 2015-08-19 15:31                           ` H.J. Lu
  2015-08-19 17:08                             ` Segher Boessenkool
  0 siblings, 1 reply; 24+ messages in thread
From: H.J. Lu @ 2015-08-19 15:31 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Mike Stump, GCC Patches, Uros Bizjak

[-- Attachment #1: Type: text/plain, Size: 1713 bytes --]

On Wed, Aug 19, 2015 at 6:00 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Wed, Aug 19, 2015 at 5:51 AM, Segher Boessenkool
> <segher@kernel.crashing.org> wrote:
>> On Wed, Aug 19, 2015 at 05:23:41AM -0700, H.J. Lu wrote:
>>> >>> >> > You might have a reason why you want the entry stack address instead of the
>>> >>> >> > frame address, but you didn't really explain I think?  Or I missed it.
>>> >>
>>> >> What would a C program do with this, that it cannot do with the frame
>>> >> address, that would be useful and cannot be much better done in straight
>>> >> assembler?  Do you actually want to expose the argument pointer, maybe?
>>> >
>>> > Yes, we want to use the argument pointer as shown in testcases
>>> > included in my patch.
>>>
>>> Where do we stand on this?  We need the hard stack address at
>>> function entry for x86 without using frame pointer.   I added
>>> __builtin_stack_top since __builtin_frame_address can't give
>>> us what we want.  Should __builtin_stack_top be added to
>>> middle-end or x86 backend?
>>
>> Sorry for not following up; I thought my suggestion was obvious.
>>
>> Can you do a __builtin_argument_pointer instead?  That should work
>> for all targets, afaics?
>
> To me, stack top is easier to understand and argument pointer isn't
> very clear.  Does argument pointer exist when there is no argument?
>
> But I can live with it.  I will update my patch.
>

Here is a patch to add __builtin_argument_pointer.  I only have

 -- Built-in Function: void * __builtin_argument_pointer (void)
     This function returns the argument pointer.

as documentation.  Can you suggest a better description so that it can
be implemented also by other compilers?

Thanks.

-- 
H.J.

[-- Attachment #2: 0001-Add-__builtin_argument_pointer.patch --]
[-- Type: text/x-patch, Size: 12980 bytes --]

From 9af08fdda587e1876e09840499000e35cc841e96 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Tue, 21 Jul 2015 14:32:09 -0700
Subject: [PATCH] Add __builtin_argument_pointer

When __builtin_frame_address is used to retrieve the address of the
function stack frame, the frame pointer register is required, which
wastes one register and 2 instructions.  For x86-32, one less register
means significant negative impact on performance.  This patch adds a
new builtin function, __builtin_argument_pointer.  It returns the
argument pointer, which, on x86, can be used to compute the stack
address when the function is called by subtracting the size of integer
register.

gcc/

	PR target/66960
	* builtin-types.def (BT_FN_PTR_VOID): New function type.
	* builtins.c (expand_builtin): Handle BUILT_IN_ARGUMENT_POINTER.
	(is_simple_builtin): Likewise.
	* ipa-pure-const.c (special_builtin_state): Likewise.
	* builtins.def: Add BUILT_IN_ARGUMENT_POINTER.
	* function.h (function): Add argument_pointer_taken.
	* config/i386/i386.c (ix86_expand_prologue): Sorry if DRAP is
	used and the argument pointer has been taken.
	* doc/extend.texi: Document __builtin_argument_pointer.

gcc/testsuite/

	PR target/66960
	* gcc.target/i386/pr66960-1.c: New test.
	* gcc.target/i386/pr66960-2.c: Likewise.
	* gcc.target/i386/pr66960-3.c: Likewise.
	* gcc.target/i386/pr66960-4.c: Likewise.
	* gcc.target/i386/pr66960-5.c: Likewise.
---
 gcc/builtin-types.def                     |  1 +
 gcc/builtins.c                            |  5 +++++
 gcc/builtins.def                          |  1 +
 gcc/config/i386/i386.c                    |  6 ++++++
 gcc/doc/extend.texi                       |  4 ++++
 gcc/function.h                            |  3 +++
 gcc/ipa-pure-const.c                      |  1 +
 gcc/testsuite/gcc.target/i386/pr66960-1.c | 34 +++++++++++++++++++++++++++++++
 gcc/testsuite/gcc.target/i386/pr66960-2.c | 34 +++++++++++++++++++++++++++++++
 gcc/testsuite/gcc.target/i386/pr66960-3.c | 18 ++++++++++++++++
 gcc/testsuite/gcc.target/i386/pr66960-4.c | 22 ++++++++++++++++++++
 gcc/testsuite/gcc.target/i386/pr66960-5.c | 22 ++++++++++++++++++++
 12 files changed, 151 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr66960-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr66960-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr66960-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr66960-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr66960-5.c

diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def
index 0e34531..2b6b5ab 100644
--- a/gcc/builtin-types.def
+++ b/gcc/builtin-types.def
@@ -177,6 +177,7 @@ DEF_FUNCTION_TYPE_1 (BT_FN_COMPLEX_LONGDOUBLE_LONGDOUBLE,
 		     BT_COMPLEX_LONGDOUBLE, BT_LONGDOUBLE)
 DEF_FUNCTION_TYPE_1 (BT_FN_PTR_UINT, BT_PTR, BT_UINT)
 DEF_FUNCTION_TYPE_1 (BT_FN_PTR_SIZE, BT_PTR, BT_SIZE)
+DEF_FUNCTION_TYPE_1 (BT_FN_PTR_VOID, BT_PTR, BT_VOID)
 DEF_FUNCTION_TYPE_1 (BT_FN_INT_INT, BT_INT, BT_INT)
 DEF_FUNCTION_TYPE_1 (BT_FN_INT_UINT, BT_INT, BT_UINT)
 DEF_FUNCTION_TYPE_1 (BT_FN_INT_LONG, BT_INT, BT_LONG)
diff --git a/gcc/builtins.c b/gcc/builtins.c
index 31969ca..b1cfa44 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -6206,6 +6206,10 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
     case BUILT_IN_CONSTANT_P:
       return const0_rtx;
 
+    case BUILT_IN_ARGUMENT_POINTER:
+      cfun->argument_pointer_taken = true;
+      return arg_pointer_rtx;
+
     case BUILT_IN_FRAME_ADDRESS:
     case BUILT_IN_RETURN_ADDRESS:
       return expand_builtin_frame_address (fndecl, exp);
@@ -12395,6 +12399,7 @@ is_simple_builtin (tree decl)
       case BUILT_IN_RETURN:
       case BUILT_IN_AGGREGATE_INCOMING_ADDRESS:
       case BUILT_IN_FRAME_ADDRESS:
+      case BUILT_IN_ARGUMENT_POINTER:
       case BUILT_IN_VA_END:
       case BUILT_IN_STACK_SAVE:
       case BUILT_IN_STACK_RESTORE:
diff --git a/gcc/builtins.def b/gcc/builtins.def
index f7ac4a8..3bc9615 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -778,6 +778,7 @@ DEF_EXT_LIB_BUILTIN    (BUILT_IN_FFSL, "ffsl", BT_FN_INT_LONG, ATTR_CONST_NOTHRO
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_FFSLL, "ffsll", BT_FN_INT_LONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_EXT_LIB_BUILTIN        (BUILT_IN_FORK, "fork", BT_FN_PID, ATTR_NOTHROW_LIST)
 DEF_GCC_BUILTIN        (BUILT_IN_FRAME_ADDRESS, "frame_address", BT_FN_PTR_UINT, ATTR_NULL)
+DEF_GCC_BUILTIN        (BUILT_IN_ARGUMENT_POINTER, "argument_pointer", BT_FN_PTR_VOID, ATTR_NULL)
 /* [trans-mem]: Adjust BUILT_IN_TM_FREE if BUILT_IN_FREE is changed.  */
 DEF_LIB_BUILTIN        (BUILT_IN_FREE, "free", BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN        (BUILT_IN_FROB_RETURN_ADDR, "frob_return_addr", BT_FN_PTR_PTR, ATTR_NULL)
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 05fa5e1..1594fb9 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -11577,6 +11577,12 @@ ix86_expand_prologue (void)
     {
       int align_bytes = crtl->stack_alignment_needed / BITS_PER_UNIT;
 
+      /* Can't use DRAP if the stack address has been taken.  */
+      if (cfun->argument_pointer_taken)
+	sorry ("%<__builtin_argument_pointer%> not supported with stack"
+	       " realignment.  This may be worked around by adding"
+	       " -maccumulate-outgoing-args.");
+
       /* Only need to push parameter pointer reg if it is caller saved.  */
       if (!call_used_regs[REGNO (crtl->drap_reg)])
 	{
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index dba8b43..620f038 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -8841,6 +8841,10 @@ option is in effect.  Such calls should only be made in debugging
 situations.
 @end deftypefn
 
+@deftypefn {Built-in Function} {void *} __builtin_argument_pointer (void)
+This function returns the argument pointer.
+@end deftypefn
+
 @node Vector Extensions
 @section Using Vector Instructions through Built-in Functions
 
diff --git a/gcc/function.h b/gcc/function.h
index e92c17c..41bdaed 100644
--- a/gcc/function.h
+++ b/gcc/function.h
@@ -378,6 +378,9 @@ struct GTY(()) function {
 
   /* Set when the tail call has been identified.  */
   unsigned int tail_call_marked : 1;
+
+  /* Set when argument pointer has been taken.  */
+  unsigned int argument_pointer_taken : 1;
 };
 
 /* Add the decl D to the local_decls list of FUN.  */
diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
index 8fd8c36..31c289d 100644
--- a/gcc/ipa-pure-const.c
+++ b/gcc/ipa-pure-const.c
@@ -480,6 +480,7 @@ special_builtin_state (enum pure_const_state_e *state, bool *looping,
 	case BUILT_IN_CXA_END_CLEANUP:
 	case BUILT_IN_EH_COPY_VALUES:
 	case BUILT_IN_FRAME_ADDRESS:
+	case BUILT_IN_ARGUMENT_POINTER:
 	case BUILT_IN_APPLY:
 	case BUILT_IN_APPLY_ARGS:
 	  *looping = false;
diff --git a/gcc/testsuite/gcc.target/i386/pr66960-1.c b/gcc/testsuite/gcc.target/i386/pr66960-1.c
new file mode 100644
index 0000000..d8caa4f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr66960-1.c
@@ -0,0 +1,34 @@
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-options "-O2 -fomit-frame-pointer" { target { lp64 } } } */
+/* { dg-options "-O2 -fomit-frame-pointer -maddress-mode=short" { target { x32 } } } */
+/* { dg-options "-O2 -fomit-frame-pointer -miamcu -fno-pic" { target { ia32 } } } */
+
+extern char **environ;
+extern void exit (int status);
+extern int main (long argc, char **argv, char **envp);
+
+void
+_start (void)
+{
+  void *argc_p = (__builtin_argument_pointer ()
+		  - sizeof (int __attribute__ ((mode (__word__)))));
+  char **argv = (char **) (argc_p + sizeof (void *));
+  long argc = *(long *) argc_p;
+  int status;
+
+  environ = argv + argc + 1;
+
+  status = main (argc, argv, environ);
+
+  exit (status);
+}
+
+/* { dg-final { scan-assembler "movq\[ \t\]8\\(%rsp\\), %rdi" { target lp64 } } } */
+/* { dg-final { scan-assembler "leaq\[ \t\]16\\(%rsp\\), %rsi" { target lp64 } } } */
+/* { dg-final { scan-assembler "leaq\[ \t\]24\\(%rsp,%rdi,8\\), %rdx" { target lp64 } } } */
+/* { dg-final { scan-assembler "movl\[ \t\]8\\(%esp\\), %edi" { target x32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]12\\(%rsp\\), %esi" { target x32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]4\\(%rsi,%rdi,4\\), %edx" { target x32 } } } */
+/* { dg-final { scan-assembler "movl\[ \t\]\\(%esp\\), %eax" { target ia32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]4\\(%esp\\), %edx" { target ia32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]8\\(%esp,%eax,4\\), %ecx" { target ia32 } } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr66960-2.c b/gcc/testsuite/gcc.target/i386/pr66960-2.c
new file mode 100644
index 0000000..f4d2ef6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr66960-2.c
@@ -0,0 +1,34 @@
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-options "-O2 -fno-omit-frame-pointer" { target { lp64 } } } */
+/* { dg-options "-O2 -fno-omit-frame-pointer -maddress-mode=short" { target { x32 } } } */
+/* { dg-options "-O2 -fno-omit-frame-pointer -miamcu" { target { ia32 } } } */
+
+extern char **environ;
+extern void exit (int status);
+extern int main (long argc, char **argv, char **envp);
+
+void
+_start (void)
+{
+  void *argc_p = (__builtin_argument_pointer ()
+		  - sizeof (int __attribute__ ((mode (__word__)))));
+  char **argv = (char **) (argc_p + sizeof (void *));
+  long argc = *(long *) argc_p;
+  int status;
+
+  environ = argv + argc + 1;
+
+  status = main (argc, argv, environ);
+
+  exit (status);
+}
+
+/* { dg-final { scan-assembler "movq\[ \t\]8\\(%rbp\\), %rdi" { target lp64 } } } */
+/* { dg-final { scan-assembler "leaq\[ \t\]16\\(%rbp\\), %rsi" { target lp64 } } } */
+/* { dg-final { scan-assembler "leaq\[ \t\]24\\(%rbp,%rdi,8\\), %rdx" { target lp64 } } } */
+/* { dg-final { scan-assembler "movl\[ \t\]8\\(%ebp\\), %edi" { target x32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]12\\(%rbp\\), %esi" { target x32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]4\\(%rsi,%rdi,4\\), %edx" { target x32 } } } */
+/* { dg-final { scan-assembler "movl\[ \t\]4\\(%ebp\\), %eax" { target ia32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]8\\(%ebp\\), %edx" { target ia32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]12\\(%ebp,%eax,4\\), %ecx" { target ia32 } } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr66960-3.c b/gcc/testsuite/gcc.target/i386/pr66960-3.c
new file mode 100644
index 0000000..7eb2608
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr66960-3.c
@@ -0,0 +1,18 @@
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-options "-O2 -mno-accumulate-outgoing-args" { target { lp64 } } } */
+/* { dg-options "-O2 -mno-accumulate-outgoing-args -maddress-mode=short" { target { x32 } } } */
+/* { dg-options "-O2 -mno-accumulate-outgoing-args -miamcu" { target { ia32 } } } */
+
+extern void abort (void);
+extern int check_int (int *i, int align);
+typedef int aligned __attribute__((aligned(64)));
+
+void *
+foo (void)
+{
+  aligned j;
+  if (check_int (&j, __alignof__(j)) != j)
+    abort ();
+  return (__builtin_argument_pointer ()
+	  - sizeof (int __attribute__ ((mode (__word__)))));
+} /* { dg-message "sorry, unimplemented: .__builtin_argument_pointer. not supported" } */
diff --git a/gcc/testsuite/gcc.target/i386/pr66960-4.c b/gcc/testsuite/gcc.target/i386/pr66960-4.c
new file mode 100644
index 0000000..361e91c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr66960-4.c
@@ -0,0 +1,22 @@
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-options "-O2 -maccumulate-outgoing-args" { target { lp64 } } } */
+/* { dg-options "-O2 -maccumulate-outgoing-args -maddress-mode=short" { target { x32 } } } */
+/* { dg-options "-O2 -maccumulate-outgoing-args -miamcu" { target { ia32 } } } */
+
+extern void abort (void);
+extern int check_int (int *i, int align);
+typedef int aligned __attribute__((aligned(64)));
+
+void *
+foo (void)
+{
+  aligned j;
+  if (check_int (&j, __alignof__(j)) != j)
+    abort ();
+  return (__builtin_argument_pointer ()
+	  - sizeof (int __attribute__ ((mode (__word__)))));
+}
+
+/* { dg-final { scan-assembler "leaq\[ \t\]8\\(%rbp\\), %rax" { target lp64 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]8\\(%rbp\\), %eax" { target x32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]4\\(%ebp\\), %eax" { target ia32 } } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr66960-5.c b/gcc/testsuite/gcc.target/i386/pr66960-5.c
new file mode 100644
index 0000000..f70a258
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr66960-5.c
@@ -0,0 +1,22 @@
+/* { dg-do link } */
+/* { dg-options "-O" } */
+
+extern void link_error (void);
+
+__attribute__ ((noinline, noclone))
+void
+foo (void)
+{
+  void **p = (__builtin_argument_pointer ()
+	      - sizeof (int __attribute__ ((mode (__word__)))));
+  void *ra = __builtin_return_address (0);
+  if (*p != ra)
+    link_error ();
+}
+
+int
+main (void)
+{
+  foo ();
+  return 0;
+}
-- 
2.4.3


* Re: [PATCH] Add __builtin_stack_top
  2015-08-19 15:31                           ` H.J. Lu
@ 2015-08-19 17:08                             ` Segher Boessenkool
  2015-08-19 17:11                               ` H.J. Lu
  0 siblings, 1 reply; 24+ messages in thread
From: Segher Boessenkool @ 2015-08-19 17:08 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Mike Stump, GCC Patches, Uros Bizjak

On Wed, Aug 19, 2015 at 08:25:49AM -0700, H.J. Lu wrote:
> Here is a patch to add __builtin_argument_pointer.  I only have

Sorry to be a pain but...  all the other builtins use _address
instead of _pointer, it's probably best to follow that.

>  -- Built-in Function: void * __builtin_argument_pointer (void)
>      This function returns the argument pointer.
> 
> as documentation.  Can you suggest a better description so that it can
> be implemented also by other compilers?

Maybe something like (heavily cut'n'pasted):


@deftypefn {Built-in Function} {void *} __builtin_argument_address (void)
This function is similar to @code{__builtin_frame_address} with an
argument of 0, but it returns the address of the incoming arguments to
the current function rather than the address of its frame.

The exact definition of this address depends upon the processor and the
calling convention.  Usually some arguments are passed in registers and
the rest on the stack, and this builtin returns the address of the first
argument that is on the stack.


> +      /* Can't use DRAP if the stack address has been taken.  */
> +      if (cfun->argument_pointer_taken)
> +	sorry ("%<__builtin_argument_pointer%> not supported with stack"
> +	       " realignment.  This may be worked around by adding"
> +	       " -maccumulate-outgoing-args.");

This doesn't work with DRAP?  Pity :-(

The patch looks plausible, but I of course can not approve it.

Thanks,


Segher

* Re: [PATCH] Add __builtin_stack_top
  2015-08-19 17:08                             ` Segher Boessenkool
@ 2015-08-19 17:11                               ` H.J. Lu
  2015-08-19 17:53                                 ` Segher Boessenkool
  0 siblings, 1 reply; 24+ messages in thread
From: H.J. Lu @ 2015-08-19 17:11 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Mike Stump, GCC Patches, Uros Bizjak

On Wed, Aug 19, 2015 at 9:58 AM, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
> On Wed, Aug 19, 2015 at 08:25:49AM -0700, H.J. Lu wrote:
>> Here is a patch to add __builtin_argument_pointer.  I only have
>
> Sorry to be a pain but...  all the other builtins use _address
> instead of _pointer, it's probably best to follow that.
>
>>  -- Built-in Function: void * __builtin_argument_pointer (void)
>>      This function returns the argument pointer.
>>
>> as documentation.  Can you suggest a better description so that it can
>> be implemented also by other compilers?
>
> Maybe something like (heavily cut'n'pasted):
>
>
> @deftypefn {Built-in Function} {void *} __builtin_argument_address (void)
> This function is similar to @code{__builtin_frame_address} with an
> argument of 0, but it returns the address of the incoming arguments to
> the current function rather than the address of its frame.

This doesn't make sense when there is no argument or arguments
are passed in registers.  To me, argument pointer is a virtual concept
and an implementation detail internal to GCC.  I am not sure if another
compiler can implement it based on this description.

> The exact definition of this address depends upon the processor and the
> calling convention.  Usually some arguments are passed in registers and
> the rest on the stack, and this builtin returns the address of the first
> argument that is on the stack.
>
>
>> +      /* Can't use DRAP if the stack address has been taken.  */
>> +      if (cfun->argument_pointer_taken)
>> +     sorry ("%<__builtin_argument_pointer%> not supported with stack"
>> +            " realignment.  This may be worked around by adding"
>> +            " -maccumulate-outgoing-args.");
>
> This doesn't work with DRAP?  Pity :-(

With DRAP,  we do

      /* Replicate the return address on the stack so that return
         address can be reached via (argp - 1) slot.  This is needed
         to implement macro RETURN_ADDR_RTX and intrinsic function
         expand_builtin_return_addr etc.  */
      t = plus_constant (Pmode, crtl->drap_reg, -UNITS_PER_WORD);
      t = gen_frame_mem (word_mode, t);
      insn = emit_insn (gen_push (t));
      RTX_FRAME_RELATED_P (insn) = 1;

      /* For the purposes of frame and register save area addressing,
         we've started over with a new frame.  */
      m->fs.sp_offset = INCOMING_FRAME_SP_OFFSET;
      m->fs.realigned = true;

which doesn't work for __builtin_argument_pointer.

> The patch looks plausible, but I of course can not approve it.
>

Thanks.


-- 
H.J.

* Re: [PATCH] Add __builtin_stack_top
  2015-08-19 17:11                               ` H.J. Lu
@ 2015-08-19 17:53                                 ` Segher Boessenkool
  2015-08-19 19:13                                   ` H.J. Lu
  0 siblings, 1 reply; 24+ messages in thread
From: Segher Boessenkool @ 2015-08-19 17:53 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Mike Stump, GCC Patches, Uros Bizjak

On Wed, Aug 19, 2015 at 10:08:01AM -0700, H.J. Lu wrote:
> > Maybe something like (heavily cut'n'pasted):
> >
> >
> > @deftypefn {Built-in Function} {void *} __builtin_argument_address (void)
> > This function is similar to @code{__builtin_frame_address} with an
> > argument of 0, but it returns the address of the incoming arguments to
> > the current function rather than the address of its frame.
> 
> This doesn't make sense when there is no argument or arguments
> are passed in registers.

Sure, but see the weasel-words below ("The exact...")

> To me, argument pointer is a virtual concept
> and an implementation detail internal to GCC.  I am not sure if another
> compiler can implement it based on this description.

The same is true for frame_address, on many machines.

> > The exact definition of this address depends upon the processor and the
> > calling convention.  Usually some arguments are passed in registers and
> > the rest on the stack, and this builtin returns the address of the first
> > argument that is on the stack.


Segher

* Re: [PATCH] Add __builtin_stack_top
  2015-08-19 17:53                                 ` Segher Boessenkool
@ 2015-08-19 19:13                                   ` H.J. Lu
  2015-08-19 22:06                                     ` H.J. Lu
  0 siblings, 1 reply; 24+ messages in thread
From: H.J. Lu @ 2015-08-19 19:13 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Mike Stump, GCC Patches, Uros Bizjak

On Wed, Aug 19, 2015 at 10:48 AM, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
> On Wed, Aug 19, 2015 at 10:08:01AM -0700, H.J. Lu wrote:
>> > Maybe something like (heavily cut'n'pasted):
>> >
>> >
>> > @deftypefn {Built-in Function} {void *} __builtin_argument_address (void)
>> > This function is similar to @code{__builtin_frame_address} with an
>> > argument of 0, but it returns the address of the incoming arguments to
>> > the current function rather than the address of its frame.
>>
>> This doesn't make sense when there is no argument or arguments
>> are passed in registers.
>
> Sure, but see the weasel-words below ("The exact...")
>
>> To me, argument pointer is a virtual concept
>> and an implementation detail internal to GCC.  I am not sure if another
>> compiler can implement it based on this description.
>
> The same is true for frame_address, on many machines.

Stack frame is well understood unlike argument pointer which is
pretty vague.

>> > The exact definition of this address depends upon the processor and the
>> > calling convention.  Usually some arguments are passed in registers and
>> > the rest on the stack, and this builtin returns the address of the first
>> > argument that is on the stack.
>
>
> Segher



-- 
H.J.

* Re: [PATCH] Add __builtin_stack_top
  2015-08-19 19:13                                   ` H.J. Lu
@ 2015-08-19 22:06                                     ` H.J. Lu
  2015-08-19 22:18                                       ` Segher Boessenkool
  0 siblings, 1 reply; 24+ messages in thread
From: H.J. Lu @ 2015-08-19 22:06 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Mike Stump, GCC Patches, Uros Bizjak

On Wed, Aug 19, 2015 at 10:53 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Wed, Aug 19, 2015 at 10:48 AM, Segher Boessenkool
> <segher@kernel.crashing.org> wrote:
>> On Wed, Aug 19, 2015 at 10:08:01AM -0700, H.J. Lu wrote:
>>> > Maybe something like (heavily cut'n'pasted):
>>> >
>>> >
>>> > @deftypefn {Built-in Function} {void *} __builtin_argument_address (void)
>>> > This function is similar to @code{__builtin_frame_address} with an
>>> > argument of 0, but it returns the address of the incoming arguments to
>>> > the current function rather than the address of its frame.
>>>
>>> This doesn't make sense when there is no argument or arguments
>>> are passed in registers.
>>
>> Sure, but see the weasel-words below ("The exact...")
>>
>>> To me, argument pointer is a virtual concept
>>> and an implementation detail internal to GCC.  I am not sure if another
>>> compiler can implement it based on this description.
>>
>> The same is true for frame_address, on many machines.
>
> A stack frame is well understood, unlike the argument pointer, which is
> pretty vague.
>

How about this

@deftypefn {Built-in Function} {void *} __builtin_argument_pointer (void)
This function is similar to @code{__builtin_frame_address} with an
argument of 0, but it returns the address of the incoming arguments to
the current function rather than the address of its frame.  Unlike
@code{__builtin_frame_address}, the frame pointer register isn't
required.

The exact definition of this address depends upon the processor and the
calling convention.  Usually some arguments are passed in registers and
the rest on the stack, and this builtin returns the address of the
first argument which would be passed on the stack.
@end deftypefn

-- 
H.J.


* Re: [PATCH] Add __builtin_stack_top
  2015-08-19 22:06                                     ` H.J. Lu
@ 2015-08-19 22:18                                       ` Segher Boessenkool
  2015-08-19 22:35                                         ` H.J. Lu
  0 siblings, 1 reply; 24+ messages in thread
From: Segher Boessenkool @ 2015-08-19 22:18 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Mike Stump, GCC Patches, Uros Bizjak

On Wed, Aug 19, 2015 at 02:53:47PM -0700, H.J. Lu wrote:
> How about this
> 
> @deftypefn {Built-in Function} {void *} __builtin_argument_pointer (void)
> This function is similar to @code{__builtin_frame_address} with an
> argument of 0, but it returns the address of the incoming arguments to
> the current function rather than the address of its frame.  Unlike
> @code{__builtin_frame_address}, the frame pointer register isn't
> required.

That last line isn't true, if your port uses INITIAL_FRAME_POINTER_RTX.
Maybe it shouldn't be true otherwise either (but currently a hard frame
pointer is forced, indeed).  Have we gone full circle now? ;-)

> The exact definition of this address depends upon the processor and the
> calling convention.  Usually some arguments are passed in registers and
> the rest on the stack, and this builtin returns the address of the
> first argument which would be passed on the stack.
> @end deftypefn


Segher


* Re: [PATCH] Add __builtin_stack_top
  2015-08-19 22:18                                       ` Segher Boessenkool
@ 2015-08-19 22:35                                         ` H.J. Lu
  2015-08-19 22:55                                           ` Segher Boessenkool
  0 siblings, 1 reply; 24+ messages in thread
From: H.J. Lu @ 2015-08-19 22:35 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Mike Stump, GCC Patches, Uros Bizjak

On Wed, Aug 19, 2015 at 3:10 PM, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
> On Wed, Aug 19, 2015 at 02:53:47PM -0700, H.J. Lu wrote:
>> How about this
>>
>> @deftypefn {Built-in Function} {void *} __builtin_argument_pointer (void)
>> This function is similar to @code{__builtin_frame_address} with an
>> argument of 0, but it returns the address of the incoming arguments to
>> the current function rather than the address of its frame.  Unlike
>> @code{__builtin_frame_address}, the frame pointer register isn't
>> required.
>
> That last line isn't true, if your port uses INITIAL_FRAME_POINTER_RTX.
> Maybe it shouldn't be true otherwise either (but currently a hard frame
> pointer is forced, indeed).  Have we gone full circle now? ;-)

Let's drop it:


@deftypefn {Built-in Function} {void *} __builtin_argument_pointer (void)
This function is similar to @code{__builtin_frame_address} with an
argument of 0, but it returns the address of the incoming arguments to
the current function rather than the address of its frame.

The exact definition of this address depends upon the processor and the
calling convention.  Usually some arguments are passed in registers and
the rest on the stack, and this builtin returns the address of the
first argument which would be passed on the stack.
@end deftypefn
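
For comparison, here is a minimal sketch of the existing builtin that the
text above refers to, @code{__builtin_frame_address} with an argument of 0.
It is an illustration only and not part of the patch.  The helper names
own_frame_address and have_frame_address are hypothetical, and the proposed
__builtin_argument_pointer is deliberately not called, since it is only
under discussion here.  It assumes GCC or a compatible compiler.

```c
#include <stdint.h>

/* Hypothetical helper, not from the patch.  noinline keeps the call
   from being folded into the caller, so the value returned is the
   frame address of this function itself.  */
__attribute__ ((noinline)) uintptr_t
own_frame_address (void)
{
  /* Existing GCC builtin; level 0 selects the current function.
     The result is returned as an integer because the frame is dead
     once this function returns.  */
  return (uintptr_t) __builtin_frame_address (0);
}

/* Hypothetical helper: returns 1 if a frame address was obtained,
   0 otherwise.  */
int
have_frame_address (void)
{
  return own_frame_address () != 0;
}
```

The value is only meaningful as an address while the function is active;
what it points at depends on the processor and the calling convention,
as the second paragraph of the draft says for the argument pointer.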


-- 
H.J.


* Re: [PATCH] Add __builtin_stack_top
  2015-08-19 22:35                                         ` H.J. Lu
@ 2015-08-19 22:55                                           ` Segher Boessenkool
  0 siblings, 0 replies; 24+ messages in thread
From: Segher Boessenkool @ 2015-08-19 22:55 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Mike Stump, GCC Patches, Uros Bizjak

On Wed, Aug 19, 2015 at 03:18:46PM -0700, H.J. Lu wrote:
> @deftypefn {Built-in Function} {void *} __builtin_argument_pointer (void)
> This function is similar to @code{__builtin_frame_address} with an
> argument of 0, but it returns the address of the incoming arguments to
> the current function rather than the address of its frame.
> 
> The exact definition of this address depends upon the processor and the
> calling convention.  Usually some arguments are passed in registers and
> the rest on the stack, and this builtin returns the address of the
> first argument which would be passed on the stack.
> @end deftypefn

That is fine by me.  Thanks!


Segher



Thread overview: 24+ messages
2015-08-04 12:31 [PATCH] Add __builtin_stack_top H.J. Lu
2015-08-04 15:42 ` Mike Stump
2015-08-04 15:44   ` H.J. Lu
2015-08-04 17:18     ` Mike Stump
2015-08-04 17:28       ` H.J. Lu
2015-08-04 17:43         ` Segher Boessenkool
2015-08-04 18:50           ` H.J. Lu
2015-08-04 18:51             ` H.J. Lu
2015-08-04 19:29             ` Segher Boessenkool
2015-08-04 20:00               ` H.J. Lu
2015-08-04 20:45                 ` Segher Boessenkool
2015-08-04 20:50                   ` H.J. Lu
2015-08-19 12:29                     ` H.J. Lu
2015-08-19 12:57                       ` Segher Boessenkool
2015-08-19 13:03                         ` H.J. Lu
2015-08-19 15:31                           ` H.J. Lu
2015-08-19 17:08                             ` Segher Boessenkool
2015-08-19 17:11                               ` H.J. Lu
2015-08-19 17:53                                 ` Segher Boessenkool
2015-08-19 19:13                                   ` H.J. Lu
2015-08-19 22:06                                     ` H.J. Lu
2015-08-19 22:18                                       ` Segher Boessenkool
2015-08-19 22:35                                         ` H.J. Lu
2015-08-19 22:55                                           ` Segher Boessenkool
