* RFC: [PATCH] Add __builtin_ia32_stack_top
@ 2015-07-21 22:56 H.J. Lu
2015-07-22 12:21 ` H.J. Lu
2015-07-30 19:44 ` [PATCH] Add __builtin_stack_top to x86 backend H.J. Lu
0 siblings, 2 replies; 7+ messages in thread
From: H.J. Lu @ 2015-07-21 22:56 UTC (permalink / raw)
To: gcc-patches; +Cc: Uros Bizjak
When __builtin_frame_address is used to retrieve the address of the
function stack frame, the frame pointer is always kept, which wastes one
register and 2 instructions. For x86-32, having one fewer register has a
significant negative impact on performance. This patch adds a new
builtin function, __builtin_ia32_stack_top, to the x86 backend. It
returns the stack address as it was when the current function was called.
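For illustration only (this is not part of the patch): a minimal sketch of how
the same address can be reached today through __builtin_frame_address (0), at
the cost of keeping the frame pointer. The helper name
entry_slot_holds_return_address and the ENTRY_WORD macro are made up for this
example. The layout assumption (the return address sits one stack word above
the saved frame pointer) is only checked on x86; on other targets the helper
simply returns 1.

```c
/* Size of one stack word; defined only where the layout assumption below
   is believed to hold.  ENTRY_WORD is a name invented for this sketch.  */
#if defined (__x86_64__)
# define ENTRY_WORD 8   /* One stack word on x86-64 (also under x32).  */
#elif defined (__i386__)
# define ENTRY_WORD 4   /* One stack word on x86-32.  */
#endif

/* Return 1 if the word at the entry-time stack address holds this
   function's return address.  Using __builtin_frame_address forces the
   frame pointer to be kept, which is the cost the patch wants to avoid.  */
__attribute__ ((noinline))
int
entry_slot_holds_return_address (void)
{
#ifdef ENTRY_WORD
  char *frame = (char *) __builtin_frame_address (0);
  /* Assumption: saved frame pointer at FRAME, return address one word up.  */
  void **entry_sp = (void **) (frame + ENTRY_WORD);
  return *entry_sp == __builtin_return_address (0);
#else
  return 1;   /* Layout not checked on other targets.  */
#endif
}
```

The proposed builtin would yield the same address without forcing the frame
pointer to be kept.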
Any comments or feedback?
Thanks.
H.J.
---
gcc/
PR target/66960
* config/i386/i386.c (ix86_expand_prologue): Sorry if DRAP is
used and the stack address has been taken.
(ix86_builtins): Add IX86_BUILTIN_STACK_TOP.
(ix86_init_mmx_sse_builtins): Add __builtin_ia32_stack_top.
(ix86_expand_builtin): Handle IX86_BUILTIN_STACK_TOP.
* config/i386/i386.h (machine_function): Add stack_top_taken.
* doc/extend.texi: Document __builtin_ia32_stack_top.
gcc/testsuite/
PR target/66960
* gcc.target/i386/pr66960-1.c: New test.
* gcc.target/i386/pr66960-2.c: Likewise.
* gcc.target/i386/pr66960-3.c: Likewise.
* gcc.target/i386/pr66960-4.c: Likewise.
---
gcc/config/i386/i386.c | 28 ++++++++++++++++++++++++++
gcc/config/i386/i386.h | 4 ++++
gcc/doc/extend.texi | 3 +++
gcc/testsuite/gcc.target/i386/pr66960-1.c | 33 +++++++++++++++++++++++++++++++
gcc/testsuite/gcc.target/i386/pr66960-2.c | 33 +++++++++++++++++++++++++++++++
gcc/testsuite/gcc.target/i386/pr66960-3.c | 17 ++++++++++++++++
gcc/testsuite/gcc.target/i386/pr66960-4.c | 21 ++++++++++++++++++++
7 files changed, 139 insertions(+)
create mode 100644 gcc/testsuite/gcc.target/i386/pr66960-1.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr66960-2.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr66960-3.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr66960-4.c
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 8b9baf8..a252473 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -11613,6 +11613,12 @@ ix86_expand_prologue (void)
{
int align_bytes = crtl->stack_alignment_needed / BITS_PER_UNIT;
+ /* Can't use DRAP if the stack address has been taken. */
+ if (cfun->machine->stack_top_taken)
+ sorry ("%<__builtin_ia32_stack_top%> not supported with stack"
+ " realignment. This may be worked around by adding"
+ " -maccumulate-outgoing-args.");
+
/* Only need to push parameter pointer reg if it is caller saved. */
if (!call_used_regs[REGNO (crtl->drap_reg)])
{
@@ -30779,6 +30785,9 @@ enum ix86_builtins
IX86_BUILTIN_READ_FLAGS,
IX86_BUILTIN_WRITE_FLAGS,
+ /* Get the stack address when the function is called. */
+ IX86_BUILTIN_STACK_TOP,
+
IX86_BUILTIN_MAX
};
@@ -34391,6 +34400,10 @@ ix86_init_mmx_sse_builtins (void)
def_builtin (OPTION_MASK_ISA_MWAITX, "__builtin_ia32_mwaitx",
VOID_FTYPE_UNSIGNED_UNSIGNED_UNSIGNED, IX86_BUILTIN_MWAITX);
+ /* Get the stack address when the function is called. */
+ def_builtin (0, "__builtin_ia32_stack_top",
+ PVOID_FTYPE_VOID, IX86_BUILTIN_STACK_TOP);
+
/* Add FMA4 multi-arg argument instructions */
for (i = 0, d = bdesc_multi_arg; i < ARRAY_SIZE (bdesc_multi_arg); i++, d++)
{
@@ -40325,6 +40338,21 @@ addcarryx:
emit_insn (gen_xabort (op0));
return 0;
+ case IX86_BUILTIN_STACK_TOP:
+ cfun->machine->stack_top_taken = true;
+
+ if (!target
+ || GET_MODE (target) != Pmode
+ || !register_operand (target, Pmode))
+ target = gen_reg_rtx (Pmode);
+
+ /* After the prologue, stack top is at -WORD(AP) in the current
+ frame. */
+ emit_insn (gen_rtx_SET (target,
+ plus_constant (Pmode, arg_pointer_rtx,
+ -UNITS_PER_WORD)));
+ return target;
+
default:
break;
}
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index ab668fe..a33a9eb 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -2483,6 +2483,10 @@ struct GTY(()) machine_function {
/* If true, it is safe to not save/restore DRAP register. */
BOOL_BITFIELD no_drap_save_restore : 1;
+ /* If true, the stack address of the current function has been
+ taken. */
+ BOOL_BITFIELD stack_top_taken : 1;
+
/* If true, there is register available for argument passing. This
is used only in ix86_function_ok_for_sibcall by 32-bit to determine
if there is scratch register available for indirect sibcall. In
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index b18d8fb..a41defc 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -16661,6 +16661,9 @@ The following built-in function is always available.
@item void __builtin_ia32_pause (void)
Generates the @code{pause} machine instruction with a compiler memory
barrier.
+
+@item void *__builtin_ia32_stack_top (void)
+Retrieves the stack address when the function is called.
@end table
The following floating-point built-in functions are made available in the
diff --git a/gcc/testsuite/gcc.target/i386/pr66960-1.c b/gcc/testsuite/gcc.target/i386/pr66960-1.c
new file mode 100644
index 0000000..74181cf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr66960-1.c
@@ -0,0 +1,33 @@
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-options "-O2 -fomit-frame-pointer" { target { lp64 } } } */
+/* { dg-options "-O2 -fomit-frame-pointer -maddress-mode=short" { target { x32 } } } */
+/* { dg-options "-O2 -fomit-frame-pointer -miamcu" { target { ia32 } } } */
+
+extern char **environ;
+extern void exit (int status);
+extern int main (long argc, char **argv, char **envp);
+
+void
+_start (void)
+{
+ void *argc_p = __builtin_ia32_stack_top ();
+ char **argv = (char **) (argc_p + sizeof (void *));
+ long argc = *(long *) argc_p;
+ int status;
+
+ environ = argv + argc + 1;
+
+ status = main (argc, argv, environ);
+
+ exit (status);
+}
+
+/* { dg-final { scan-assembler "movq\[ \t\]8\\(%rsp\\), %rdi" { target lp64 } } } */
+/* { dg-final { scan-assembler "leaq\[ \t\]16\\(%rsp\\), %rsi" { target lp64 } } } */
+/* { dg-final { scan-assembler "leaq\[ \t\]24\\(%rsp,%rdi,8\\), %rdx" { target lp64 } } } */
+/* { dg-final { scan-assembler "movl\[ \t\]8\\(%esp\\), %edi" { target x32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]12\\(%rsp\\), %esi" { target x32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]4\\(%rsi,%rdi,4\\), %edx" { target x32 } } } */
+/* { dg-final { scan-assembler "movl\[ \t\]\\(%esp\\), %eax" { target ia32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]4\\(%esp\\), %edx" { target ia32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]8\\(%esp,%eax,4\\), %ecx" { target ia32 } } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr66960-2.c b/gcc/testsuite/gcc.target/i386/pr66960-2.c
new file mode 100644
index 0000000..8252c2c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr66960-2.c
@@ -0,0 +1,33 @@
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-options "-O2 -fno-omit-frame-pointer" { target { lp64 } } } */
+/* { dg-options "-O2 -fno-omit-frame-pointer -maddress-mode=short" { target { x32 } } } */
+/* { dg-options "-O2 -fno-omit-frame-pointer -miamcu" { target { ia32 } } } */
+
+extern char **environ;
+extern void exit (int status);
+extern int main (long argc, char **argv, char **envp);
+
+void
+_start (void)
+{
+ void *argc_p = __builtin_ia32_stack_top ();
+ char **argv = (char **) (argc_p + sizeof (void *));
+ long argc = *(long *) argc_p;
+ int status;
+
+ environ = argv + argc + 1;
+
+ status = main (argc, argv, environ);
+
+ exit (status);
+}
+
+/* { dg-final { scan-assembler "movq\[ \t\]8\\(%rbp\\), %rdi" { target lp64 } } } */
+/* { dg-final { scan-assembler "leaq\[ \t\]16\\(%rbp\\), %rsi" { target lp64 } } } */
+/* { dg-final { scan-assembler "leaq\[ \t\]24\\(%rbp,%rdi,8\\), %rdx" { target lp64 } } } */
+/* { dg-final { scan-assembler "movl\[ \t\]8\\(%ebp\\), %edi" { target x32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]12\\(%rbp\\), %esi" { target x32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]4\\(%rsi,%rdi,4\\), %edx" { target x32 } } } */
+/* { dg-final { scan-assembler "movl\[ \t\]4\\(%ebp\\), %eax" { target ia32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]8\\(%ebp\\), %edx" { target ia32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]12\\(%ebp,%eax,4\\), %ecx" { target ia32 } } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr66960-3.c b/gcc/testsuite/gcc.target/i386/pr66960-3.c
new file mode 100644
index 0000000..1698003
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr66960-3.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-options "-O2 -mno-accumulate-outgoing-args" { target { lp64 } } } */
+/* { dg-options "-O2 -mno-accumulate-outgoing-args -maddress-mode=short" { target { x32 } } } */
+/* { dg-options "-O2 -mno-accumulate-outgoing-args -miamcu" { target { ia32 } } } */
+
+extern void abort (void);
+extern int check_int (int *i, int align);
+typedef int aligned __attribute__((aligned(64)));
+
+void *
+foo (void)
+{
+ aligned j;
+ if (check_int (&j, __alignof__(j)) != j)
+ abort ();
+ return __builtin_ia32_stack_top ();
+} /* { dg-message "sorry, unimplemented: .__builtin_ia32_stack_top. not supported" } */
diff --git a/gcc/testsuite/gcc.target/i386/pr66960-4.c b/gcc/testsuite/gcc.target/i386/pr66960-4.c
new file mode 100644
index 0000000..82a032c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr66960-4.c
@@ -0,0 +1,21 @@
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-options "-O2 -maccumulate-outgoing-args" { target { lp64 } } } */
+/* { dg-options "-O2 -maccumulate-outgoing-args -maddress-mode=short" { target { x32 } } } */
+/* { dg-options "-O2 -maccumulate-outgoing-args -miamcu" { target { ia32 } } } */
+
+extern void abort (void);
+extern int check_int (int *i, int align);
+typedef int aligned __attribute__((aligned(64)));
+
+void *
+foo (void)
+{
+ aligned j;
+ if (check_int (&j, __alignof__(j)) != j)
+ abort ();
+ return __builtin_ia32_stack_top ();
+}
+
+/* { dg-final { scan-assembler "leaq\[ \t\]8\\(%rbp\\), %rax" { target lp64 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]8\\(%rbp\\), %eax" { target x32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]4\\(%ebp\\), %eax" { target ia32 } } } */
--
2.4.3
* Re: RFC: [PATCH] Add __builtin_ia32_stack_top
2015-07-21 22:56 RFC: [PATCH] Add __builtin_ia32_stack_top H.J. Lu
@ 2015-07-22 12:21 ` H.J. Lu
2015-07-22 13:59 ` Segher Boessenkool
2015-07-30 19:44 ` [PATCH] Add __builtin_stack_top to x86 backend H.J. Lu
1 sibling, 1 reply; 7+ messages in thread
From: H.J. Lu @ 2015-07-22 12:21 UTC (permalink / raw)
To: GCC Patches; +Cc: Uros Bizjak
On Tue, Jul 21, 2015 at 2:45 PM, H.J. Lu <hongjiu.lu@intel.com> wrote:
> When __builtin_frame_address is used to retrieve the address of the
> function stack frame, the frame pointer is always kept, which wastes one
> register and 2 instructions. For x86-32, one less register means
> significant negative impact on performance. This patch adds a new
> builtin function, __builtin_ia32_stack_top, to x86 backend. It
> returns the stack address when the function is called.
>
> Any comments, feedbacks?
>
> Thanks.
>
>
> H.J.
> ---
> gcc/
>
> PR target/66960
> * config/i386/i386.c (ix86_expand_prologue): Sorry if DRAP is
> used and the stack address has been taken.
> (ix86_builtins): Add IX86_BUILTIN_STACK_TOP.
> (ix86_init_mmx_sse_builtins): Add __builtin_ia32_stack_top.
> (ix86_expand_builtin): Handle IX86_BUILTIN_STACK_TOP.
> * config/i386/i386.h (machine_function): Add stack_top_taken.
> * doc/extend.texi: Document __builtin_ia32_stack_top.
>
I got feedback suggesting __builtin_stack_top instead of
__builtin_ia32_stack_top. But I don't know if
+ /* After the prologue, stack top is at -WORD(AP) in the current
+ frame. */
+ emit_insn (gen_rtx_SET (target,
+ plus_constant (Pmode, arg_pointer_rtx,
+ -UNITS_PER_WORD)));
is true for all backends. If it works on all backends, I can move
it to builtins.c.
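A small arithmetic model of the expression quoted above, for readers who want
to see why it yields the entry-time stack address on x86. This is not GCC
code: the function names are invented, plain integers stand in for RTL, and
the `word` parameter stands in for UNITS_PER_WORD. It assumes the x86
convention that the argument pointer sits one word above the slot holding the
return address; other backends may lay things out differently, which is the
open question here.

```c
#include <stdint.h>

/* Model of where the argument pointer ends up on x86: the call
   instruction stored the return address at ENTRY_SP, and the first
   stack argument (if any) sits one word above it.  */
uintptr_t
model_arg_pointer (uintptr_t entry_sp, uintptr_t word)
{
  return entry_sp + word;
}

/* Model of plus_constant (Pmode, arg_pointer_rtx, -UNITS_PER_WORD):
   step back one word from the argument pointer.  */
uintptr_t
model_stack_top (uintptr_t arg_pointer, uintptr_t word)
{
  return arg_pointer - word;
}
```

Under that assumption the two steps cancel, so the result is the stack
address at function entry regardless of the word size.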
--
H.J.
* Re: RFC: [PATCH] Add __builtin_ia32_stack_top
2015-07-22 12:21 ` H.J. Lu
@ 2015-07-22 13:59 ` Segher Boessenkool
2015-07-22 14:14 ` H.J. Lu
0 siblings, 1 reply; 7+ messages in thread
From: Segher Boessenkool @ 2015-07-22 13:59 UTC (permalink / raw)
To: H.J. Lu; +Cc: GCC Patches, Uros Bizjak
On Wed, Jul 22, 2015 at 05:10:04AM -0700, H.J. Lu wrote:
> I got a feedback, suggesting __builtin_stack_top, instead of
> __builtin_ia32_stack_top. But I don't know if
>
> + /* After the prologue, stack top is at -WORD(AP) in the current
> + frame. */
> + emit_insn (gen_rtx_SET (target,
> + plus_constant (Pmode, arg_pointer_rtx,
> + -UNITS_PER_WORD)));
>
> is true for all backends. If it works on all backends, I can move
> it to builtins.c.
It doesn't afaik. But can't you define INITIAL_FRAME_ADDRESS_RTX?
Segher
* Re: RFC: [PATCH] Add __builtin_ia32_stack_top
2015-07-22 13:59 ` Segher Boessenkool
@ 2015-07-22 14:14 ` H.J. Lu
2015-07-22 16:01 ` H.J. Lu
0 siblings, 1 reply; 7+ messages in thread
From: H.J. Lu @ 2015-07-22 14:14 UTC (permalink / raw)
To: Segher Boessenkool; +Cc: GCC Patches, Uros Bizjak
On Wed, Jul 22, 2015 at 6:55 AM, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
> On Wed, Jul 22, 2015 at 05:10:04AM -0700, H.J. Lu wrote:
>> I got a feedback, suggesting __builtin_stack_top, instead of
>> __builtin_ia32_stack_top. But I don't know if
>>
>> + /* After the prologue, stack top is at -WORD(AP) in the current
>> + frame. */
>> + emit_insn (gen_rtx_SET (target,
>> + plus_constant (Pmode, arg_pointer_rtx,
>> + -UNITS_PER_WORD)));
>>
>> is true for all backends. If it works on all backends, I can move
>> it to builtins.c.
>
> It doesn't afaik. But can't you define INITIAL_FRAME_ADDRESS_RTX?
>
>
> Segher
Does INITIAL_FRAME_ADDRESS_RTX point to stack top? It certainly
can't be defined for x86. I will write a middle-end patch and leave it to
each backend to enable it.
--
H.J.
* Re: RFC: [PATCH] Add __builtin_ia32_stack_top
2015-07-22 14:14 ` H.J. Lu
@ 2015-07-22 16:01 ` H.J. Lu
0 siblings, 0 replies; 7+ messages in thread
From: H.J. Lu @ 2015-07-22 16:01 UTC (permalink / raw)
To: Segher Boessenkool; +Cc: GCC Patches, Uros Bizjak
[-- Attachment #1: Type: text/plain, Size: 2822 bytes --]
On Wed, Jul 22, 2015 at 6:59 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Wed, Jul 22, 2015 at 6:55 AM, Segher Boessenkool
> <segher@kernel.crashing.org> wrote:
>> On Wed, Jul 22, 2015 at 05:10:04AM -0700, H.J. Lu wrote:
>>> I got a feedback, suggesting __builtin_stack_top, instead of
>>> __builtin_ia32_stack_top. But I don't know if
>>>
>>> + /* After the prologue, stack top is at -WORD(AP) in the current
>>> + frame. */
>>> + emit_insn (gen_rtx_SET (target,
>>> + plus_constant (Pmode, arg_pointer_rtx,
>>> + -UNITS_PER_WORD)));
>>>
>>> is true for all backends. If it works on all backends, I can move
>>> it to builtins.c.
>>
>> It doesn't afaik. But can't you define INITIAL_FRAME_ADDRESS_RTX?
>>
>>
>> Segher
>
> Does INITIAL_FRAME_ADDRESS_RTX point to stack top? It certainly
> can't be defined for x86. I will write a midld-end patch and leave to each
> backend to enable it.
Here is a patch. Any comments or feedback?
Thanks.
--
H.J.
---
When __builtin_frame_address is used to retrieve the address of the
function stack frame, the frame pointer is always kept, which wastes one
register and 2 instructions. For x86-32, having one fewer register has a
significant negative impact on performance. This patch adds a new
builtin function, __builtin_stack_top. It returns the stack address as
it was when the current function was called.
This patch enables __builtin_stack_top only for the x86 backend. Using
__builtin_stack_top with any other backend leads to
sorry, unimplemented: ‘__builtin_stack_top’ not supported on this target
A backend must define TARGET_STACK_TOP_RTX to enable __builtin_stack_top.
default_stack_top_rtx may be extended to support more backends,
including those with INITIAL_FRAME_ADDRESS_RTX.
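The opt-in scheme described above can be modelled outside GCC as an optional
function pointer. The sketch below is only an illustration of that dispatch
pattern: struct target_hooks, x86_stack_top_rtx and expand_stack_top are
made-up names, a string stands in for the rtx, and returning NULL stands in
for the sorry () diagnostic.

```c
#include <stddef.h>

/* Stand-in for the target hook table; a NULL hook means the backend
   has not opted in.  */
struct target_hooks
{
  const char *(*stack_top_rtx) (void);
};

/* Stand-in for a backend's hook; the string only describes the
   expression that the real hook would build.  */
const char *
x86_stack_top_rtx (void)
{
  return "arg_pointer - UNITS_PER_WORD";
}

/* Model of the BUILT_IN_STACK_TOP case: use the hook if the backend
   provides one and record that the stack top was taken; otherwise
   report "not supported" (modelled here as a NULL result).  */
const char *
expand_stack_top (const struct target_hooks *target, int *stack_top_taken)
{
  if (target->stack_top_rtx)
    {
      *stack_top_taken = 1;
      return target->stack_top_rtx ();
    }
  return NULL;
}
```

A backend that leaves the hook unset takes the second path, which corresponds
to the "not supported on this target" message.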
gcc/
PR target/66960
* builtin-types.def (BT_FN_PTR_VOID): New function type.
* builtins.c (expand_builtin): Handle BUILT_IN_STACK_TOP.
(is_simple_builtin): Likewise.
* ipa-pure-const.c (special_builtin_state): Likewise.
* builtins.def: Add BUILT_IN_STACK_TOP.
* function.h (function): Add stack_top_taken.
* target.def (stack_top_rtx): New target hook.
* targhooks.c (default_stack_top_rtx): New.
* targhooks.h (default_stack_top_rtx): Likewise.
* config/i386/i386.c (ix86_expand_prologue): Sorry if DRAP is
used and the stack address has been taken.
(TARGET_STACK_TOP_RTX): New.
* doc/extend.texi: Document __builtin_stack_top.
* doc/tm.texi.in (TARGET_STACK_TOP_RTX): New.
* doc/tm.texi: Regenerated.
gcc/testsuite/
PR target/66960
* gcc.target/i386/pr66960-1.c: New test.
* gcc.target/i386/pr66960-2.c: Likewise.
* gcc.target/i386/pr66960-3.c: Likewise.
* gcc.target/i386/pr66960-4.c: Likewise.
* gcc.target/i386/pr66960-5.c: Likewise.
[-- Attachment #2: 0001-Add-__builtin_stack_top.patch --]
[-- Type: text/x-patch, Size: 16689 bytes --]
From 53c2dd6e303d48eccf050696020b3765d3c4c382 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Tue, 21 Jul 2015 14:32:09 -0700
Subject: [PATCH] Add __builtin_stack_top
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
When __builtin_frame_address is used to retrieve the address of the
function stack frame, the frame pointer is always kept, which wastes one
register and 2 instructions. For x86-32, having one fewer register has a
significant negative impact on performance. This patch adds a new
builtin function, __builtin_stack_top. It returns the stack address as
it was when the current function was called.
This patch enables __builtin_stack_top only for the x86 backend. Using
__builtin_stack_top with any other backend leads to
sorry, unimplemented: ‘__builtin_stack_top’ not supported on this target
A backend must define TARGET_STACK_TOP_RTX to enable __builtin_stack_top.
default_stack_top_rtx may be extended to support more backends,
including those with INITIAL_FRAME_ADDRESS_RTX.
gcc/
PR target/66960
* builtin-types.def (BT_FN_PTR_VOID): New function type.
* builtins.c (expand_builtin): Handle BUILT_IN_STACK_TOP.
(is_simple_builtin): Likewise.
* ipa-pure-const.c (special_builtin_state): Likewise.
* builtins.def: Add BUILT_IN_STACK_TOP.
* function.h (function): Add stack_top_taken.
* target.def (stack_top_rtx): New target hook.
* targhooks.c (default_stack_top_rtx): New.
* targhooks.h (default_stack_top_rtx): Likewise.
* config/i386/i386.c (ix86_expand_prologue): Sorry if DRAP is
used and the stack address has been taken.
(TARGET_STACK_TOP_RTX): New.
* doc/extend.texi: Document __builtin_stack_top.
* doc/tm.texi.in (TARGET_STACK_TOP_RTX): New.
* doc/tm.texi: Regenerated.
gcc/testsuite/
PR target/66960
* gcc.target/i386/pr66960-1.c: New test.
* gcc.target/i386/pr66960-2.c: Likewise.
* gcc.target/i386/pr66960-3.c: Likewise.
* gcc.target/i386/pr66960-4.c: Likewise.
* gcc.target/i386/pr66960-5.c: Likewise.
---
gcc/builtin-types.def | 1 +
gcc/builtins.c | 11 +++++++++++
gcc/builtins.def | 1 +
gcc/config/i386/i386.c | 8 ++++++++
gcc/doc/extend.texi | 7 +++++++
gcc/doc/tm.texi | 5 +++++
gcc/doc/tm.texi.in | 2 ++
gcc/function.h | 3 +++
gcc/ipa-pure-const.c | 1 +
gcc/target.def | 7 +++++++
gcc/targhooks.c | 9 +++++++++
gcc/targhooks.h | 3 +++
gcc/testsuite/gcc.target/i386/pr66960-1.c | 33 +++++++++++++++++++++++++++++++
gcc/testsuite/gcc.target/i386/pr66960-2.c | 33 +++++++++++++++++++++++++++++++
gcc/testsuite/gcc.target/i386/pr66960-3.c | 17 ++++++++++++++++
gcc/testsuite/gcc.target/i386/pr66960-4.c | 21 ++++++++++++++++++++
gcc/testsuite/gcc.target/i386/pr66960-5.c | 21 ++++++++++++++++++++
17 files changed, 183 insertions(+)
create mode 100644 gcc/testsuite/gcc.target/i386/pr66960-1.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr66960-2.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr66960-3.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr66960-4.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr66960-5.c
diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def
index 0e34531..2b6b5ab 100644
--- a/gcc/builtin-types.def
+++ b/gcc/builtin-types.def
@@ -177,6 +177,7 @@ DEF_FUNCTION_TYPE_1 (BT_FN_COMPLEX_LONGDOUBLE_LONGDOUBLE,
BT_COMPLEX_LONGDOUBLE, BT_LONGDOUBLE)
DEF_FUNCTION_TYPE_1 (BT_FN_PTR_UINT, BT_PTR, BT_UINT)
DEF_FUNCTION_TYPE_1 (BT_FN_PTR_SIZE, BT_PTR, BT_SIZE)
+DEF_FUNCTION_TYPE_1 (BT_FN_PTR_VOID, BT_PTR, BT_VOID)
DEF_FUNCTION_TYPE_1 (BT_FN_INT_INT, BT_INT, BT_INT)
DEF_FUNCTION_TYPE_1 (BT_FN_INT_UINT, BT_INT, BT_UINT)
DEF_FUNCTION_TYPE_1 (BT_FN_INT_LONG, BT_INT, BT_LONG)
diff --git a/gcc/builtins.c b/gcc/builtins.c
index 1750e25..94514b4 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -6218,6 +6218,16 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
case BUILT_IN_CONSTANT_P:
return const0_rtx;
+ case BUILT_IN_STACK_TOP:
+ if (targetm.calls.stack_top_rtx)
+ {
+ cfun->stack_top_taken = true;
+ return targetm.calls.stack_top_rtx ();
+ }
+ else
+ sorry ("%<__builtin_stack_top%> not supported on this target");
+ break;
+
case BUILT_IN_FRAME_ADDRESS:
case BUILT_IN_RETURN_ADDRESS:
return expand_builtin_frame_address (fndecl, exp);
@@ -12407,6 +12417,7 @@ is_simple_builtin (tree decl)
case BUILT_IN_RETURN:
case BUILT_IN_AGGREGATE_INCOMING_ADDRESS:
case BUILT_IN_FRAME_ADDRESS:
+ case BUILT_IN_STACK_TOP:
case BUILT_IN_VA_END:
case BUILT_IN_STACK_SAVE:
case BUILT_IN_STACK_RESTORE:
diff --git a/gcc/builtins.def b/gcc/builtins.def
index 80e4a9c..62f0523 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -778,6 +778,7 @@ DEF_EXT_LIB_BUILTIN (BUILT_IN_FFSL, "ffsl", BT_FN_INT_LONG, ATTR_CONST_NOTHRO
DEF_EXT_LIB_BUILTIN (BUILT_IN_FFSLL, "ffsll", BT_FN_INT_LONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
DEF_EXT_LIB_BUILTIN (BUILT_IN_FORK, "fork", BT_FN_PID, ATTR_NOTHROW_LIST)
DEF_GCC_BUILTIN (BUILT_IN_FRAME_ADDRESS, "frame_address", BT_FN_PTR_UINT, ATTR_NULL)
+DEF_GCC_BUILTIN (BUILT_IN_STACK_TOP, "stack_top", BT_FN_PTR_VOID, ATTR_NULL)
/* [trans-mem]: Adjust BUILT_IN_TM_FREE if BUILT_IN_FREE is changed. */
DEF_LIB_BUILTIN (BUILT_IN_FREE, "free", BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
DEF_GCC_BUILTIN (BUILT_IN_FROB_RETURN_ADDR, "frob_return_addr", BT_FN_PTR_PTR, ATTR_NULL)
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index b10569a..6abd96e 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -11613,6 +11613,12 @@ ix86_expand_prologue (void)
{
int align_bytes = crtl->stack_alignment_needed / BITS_PER_UNIT;
+ /* Can't use DRAP if the stack address has been taken. */
+ if (cfun->stack_top_taken)
+ sorry ("%<__builtin_stack_top%> not supported with stack"
+ " realignment. This may be worked around by adding"
+ " -maccumulate-outgoing-args.");
+
/* Only need to push parameter pointer reg if it is caller saved. */
if (!call_used_regs[REGNO (crtl->drap_reg)])
{
@@ -52610,6 +52616,8 @@ ix86_operands_ok_for_move_multiple (rtx *operands, bool load,
#define TARGET_UPDATE_STACK_BOUNDARY ix86_update_stack_boundary
#undef TARGET_GET_DRAP_RTX
#define TARGET_GET_DRAP_RTX ix86_get_drap_rtx
+#undef TARGET_STACK_TOP_RTX
+#define TARGET_STACK_TOP_RTX default_stack_top_rtx
#undef TARGET_STRICT_ARGUMENT_NAMING
#define TARGET_STRICT_ARGUMENT_NAMING hook_bool_CUMULATIVE_ARGS_true
#undef TARGET_STATIC_CHAIN
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index b18d8fb..e08b9f9 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -8695,6 +8695,13 @@ This function should only be used with a nonzero argument for debugging
purposes.
@end deftypefn
+@deftypefn {Built-in Function} {void *} __builtin_stack_top (void)
+This function is similar to calling @code{__builtin_frame_address}
+with a value of @code{0}, but it returns the stack address when the
+function is called. Unlike @code{__builtin_frame_address}, the frame
+pointer register is kept only when necessary.
+@end deftypefn
+
@node Vector Extensions
@section Using Vector Instructions through Built-in Functions
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index b911b7d..428e746 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -11483,6 +11483,11 @@ argument list due to stack realignment. Return @code{NULL} if no DRAP
is needed.
@end deftypefn
+@deftypefn {Target Hook} rtx TARGET_STACK_TOP_RTX (void)
+This hook should return an rtx for the stack address when the function
+is called.
+@end deftypefn
+
@deftypefn {Target Hook} bool TARGET_ALLOCATE_STACK_SLOTS_FOR_ARGS (void)
When optimization is disabled, this hook indicates whether or not
arguments should be allocated to stack slots. Normally, GCC allocates
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 47550cc..c68338a 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -8181,6 +8181,8 @@ and the associated definitions of those functions.
@hook TARGET_GET_DRAP_RTX
+@hook TARGET_STACK_TOP_RTX
+
@hook TARGET_ALLOCATE_STACK_SLOTS_FOR_ARGS
@hook TARGET_CONST_ANCHOR
diff --git a/gcc/function.h b/gcc/function.h
index e92c17c..dd1c38a 100644
--- a/gcc/function.h
+++ b/gcc/function.h
@@ -378,6 +378,9 @@ struct GTY(()) function {
/* Set when the tail call has been identified. */
unsigned int tail_call_marked : 1;
+
+ /* Set when the address of the stack top has been taken. */
+ unsigned int stack_top_taken : 1;
};
/* Add the decl D to the local_decls list of FUN. */
diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
index 8fd8c36..2405082 100644
--- a/gcc/ipa-pure-const.c
+++ b/gcc/ipa-pure-const.c
@@ -480,6 +480,7 @@ special_builtin_state (enum pure_const_state_e *state, bool *looping,
case BUILT_IN_CXA_END_CLEANUP:
case BUILT_IN_EH_COPY_VALUES:
case BUILT_IN_FRAME_ADDRESS:
+ case BUILT_IN_STACK_TOP:
case BUILT_IN_APPLY:
case BUILT_IN_APPLY_ARGS:
*looping = false;
diff --git a/gcc/target.def b/gcc/target.def
index 4edc209..7a30f39 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -4525,6 +4525,13 @@ argument list due to stack realignment. Return @code{NULL} if no DRAP\n\
is needed.",
rtx, (void), NULL)
+/* Get the stack address when the function is called. */
+DEFHOOK
+(stack_top_rtx,
+ "This hook should return an rtx for the stack address when the function\n\
+is called.",
+ rtx, (void), NULL)
+
/* Return true if all function parameters should be spilled to the
stack. */
DEFHOOK
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index 3eca47e..f188272 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -1926,4 +1926,13 @@ can_use_doloop_if_innermost (const widest_int &, const widest_int &,
return loop_depth == 1;
}
+/* Get the stack address when the function is called. After the
+ prologue, stack top is at -WORD(AP) in the current frame. */
+
+rtx
+default_stack_top_rtx (void)
+{
+ return plus_constant (Pmode, arg_pointer_rtx, -UNITS_PER_WORD);
+}
+
#include "gt-targhooks.h"
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index 5ae991d..094a589 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -240,4 +240,7 @@ extern void default_setup_incoming_vararg_bounds (cumulative_args_t ca ATTRIBUTE
tree type ATTRIBUTE_UNUSED,
int *pretend_arg_size ATTRIBUTE_UNUSED,
int second_time ATTRIBUTE_UNUSED);
+
+extern rtx default_stack_top_rtx (void);
+
#endif /* GCC_TARGHOOKS_H */
diff --git a/gcc/testsuite/gcc.target/i386/pr66960-1.c b/gcc/testsuite/gcc.target/i386/pr66960-1.c
new file mode 100644
index 0000000..aaab3cf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr66960-1.c
@@ -0,0 +1,33 @@
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-options "-O2 -fomit-frame-pointer" { target { lp64 } } } */
+/* { dg-options "-O2 -fomit-frame-pointer -maddress-mode=short" { target { x32 } } } */
+/* { dg-options "-O2 -fomit-frame-pointer -miamcu" { target { ia32 } } } */
+
+extern char **environ;
+extern void exit (int status);
+extern int main (long argc, char **argv, char **envp);
+
+void
+_start (void)
+{
+ void *argc_p = __builtin_stack_top ();
+ char **argv = (char **) (argc_p + sizeof (void *));
+ long argc = *(long *) argc_p;
+ int status;
+
+ environ = argv + argc + 1;
+
+ status = main (argc, argv, environ);
+
+ exit (status);
+}
+
+/* { dg-final { scan-assembler "movq\[ \t\]8\\(%rsp\\), %rdi" { target lp64 } } } */
+/* { dg-final { scan-assembler "leaq\[ \t\]16\\(%rsp\\), %rsi" { target lp64 } } } */
+/* { dg-final { scan-assembler "leaq\[ \t\]24\\(%rsp,%rdi,8\\), %rdx" { target lp64 } } } */
+/* { dg-final { scan-assembler "movl\[ \t\]8\\(%esp\\), %edi" { target x32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]12\\(%rsp\\), %esi" { target x32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]4\\(%rsi,%rdi,4\\), %edx" { target x32 } } } */
+/* { dg-final { scan-assembler "movl\[ \t\]\\(%esp\\), %eax" { target ia32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]4\\(%esp\\), %edx" { target ia32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]8\\(%esp,%eax,4\\), %ecx" { target ia32 } } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr66960-2.c b/gcc/testsuite/gcc.target/i386/pr66960-2.c
new file mode 100644
index 0000000..b9dbde2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr66960-2.c
@@ -0,0 +1,33 @@
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-options "-O2 -fno-omit-frame-pointer" { target { lp64 } } } */
+/* { dg-options "-O2 -fno-omit-frame-pointer -maddress-mode=short" { target { x32 } } } */
+/* { dg-options "-O2 -fno-omit-frame-pointer -miamcu" { target { ia32 } } } */
+
+extern char **environ;
+extern void exit (int status);
+extern int main (long argc, char **argv, char **envp);
+
+void
+_start (void)
+{
+ void *argc_p = __builtin_stack_top ();
+ char **argv = (char **) (argc_p + sizeof (void *));
+ long argc = *(long *) argc_p;
+ int status;
+
+ environ = argv + argc + 1;
+
+ status = main (argc, argv, environ);
+
+ exit (status);
+}
+
+/* { dg-final { scan-assembler "movq\[ \t\]8\\(%rbp\\), %rdi" { target lp64 } } } */
+/* { dg-final { scan-assembler "leaq\[ \t\]16\\(%rbp\\), %rsi" { target lp64 } } } */
+/* { dg-final { scan-assembler "leaq\[ \t\]24\\(%rbp,%rdi,8\\), %rdx" { target lp64 } } } */
+/* { dg-final { scan-assembler "movl\[ \t\]8\\(%ebp\\), %edi" { target x32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]12\\(%rbp\\), %esi" { target x32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]4\\(%rsi,%rdi,4\\), %edx" { target x32 } } } */
+/* { dg-final { scan-assembler "movl\[ \t\]4\\(%ebp\\), %eax" { target ia32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]8\\(%ebp\\), %edx" { target ia32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]12\\(%ebp,%eax,4\\), %ecx" { target ia32 } } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr66960-3.c b/gcc/testsuite/gcc.target/i386/pr66960-3.c
new file mode 100644
index 0000000..48cf25e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr66960-3.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-options "-O2 -mno-accumulate-outgoing-args" { target { lp64 } } } */
+/* { dg-options "-O2 -mno-accumulate-outgoing-args -maddress-mode=short" { target { x32 } } } */
+/* { dg-options "-O2 -mno-accumulate-outgoing-args -miamcu" { target { ia32 } } } */
+
+extern void abort (void);
+extern int check_int (int *i, int align);
+typedef int aligned __attribute__((aligned(64)));
+
+void *
+foo (void)
+{
+ aligned j;
+ if (check_int (&j, __alignof__(j)) != j)
+ abort ();
+ return __builtin_stack_top ();
+} /* { dg-message "sorry, unimplemented: .__builtin_stack_top. not supported" } */
diff --git a/gcc/testsuite/gcc.target/i386/pr66960-4.c b/gcc/testsuite/gcc.target/i386/pr66960-4.c
new file mode 100644
index 0000000..44c0b26
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr66960-4.c
@@ -0,0 +1,21 @@
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-options "-O2 -maccumulate-outgoing-args" { target { lp64 } } } */
+/* { dg-options "-O2 -maccumulate-outgoing-args -maddress-mode=short" { target { x32 } } } */
+/* { dg-options "-O2 -maccumulate-outgoing-args -miamcu" { target { ia32 } } } */
+
+extern void abort (void);
+extern int check_int (int *i, int align);
+typedef int aligned __attribute__((aligned(64)));
+
+void *
+foo (void)
+{
+ aligned j;
+ if (check_int (&j, __alignof__(j)) != j)
+ abort ();
+ return __builtin_stack_top ();
+}
+
+/* { dg-final { scan-assembler "leaq\[ \t\]8\\(%rbp\\), %rax" { target lp64 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]8\\(%rbp\\), %eax" { target x32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]4\\(%ebp\\), %eax" { target ia32 } } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr66960-5.c b/gcc/testsuite/gcc.target/i386/pr66960-5.c
new file mode 100644
index 0000000..d449437
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr66960-5.c
@@ -0,0 +1,21 @@
+/* { dg-do link } */
+/* { dg-options "-O" } */
+
+extern void link_error (void);
+
+__attribute__ ((noinline, noclone))
+void
+foo (void)
+{
+ void **p = __builtin_stack_top ();
+ void *ra = __builtin_return_address (0);
+ if (*p != ra)
+ link_error ();
+}
+
+int
+main (void)
+{
+ foo ();
+ return 0;
+}
--
2.4.3
* [PATCH] Add __builtin_stack_top to x86 backend
2015-07-21 22:56 RFC: [PATCH] Add __builtin_ia32_stack_top H.J. Lu
2015-07-22 12:21 ` H.J. Lu
@ 2015-07-30 19:44 ` H.J. Lu
2015-08-03 8:31 ` Uros Bizjak
1 sibling, 1 reply; 7+ messages in thread
From: H.J. Lu @ 2015-07-30 19:44 UTC (permalink / raw)
To: gcc-patches, Uros Bizjak
On Tue, Jul 21, 2015 at 02:45:39PM -0700, H.J. Lu wrote:
> When __builtin_frame_address is used to retrieve the address of the
> function stack frame, the frame pointer is always kept, which wastes one
> register and 2 instructions. For x86-32, one less register means
> significant negative impact on performance. This patch adds a new
> builtin function, __builtin_ia32_stack_top, to x86 backend. It
> returns the stack address when the function is called.
>
> Any comments, feedbacks?
>
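To make the cost described in the quoted text concrete, here is a minimal sketch that reaches the same stack slot using only builtins that GCC already provides. It is an illustration, not part of the patch: the function name is hypothetical, and it assumes the usual x86 frame layout, in which the word just above the saved frame pointer holds the return address. Because it calls __builtin_frame_address, the frame pointer is kept in this function, which is the overhead the proposed builtin is meant to avoid.

```c
#include <assert.h>

/* Hypothetical helper, for illustration only.  Returns 1 if the word
   just above the saved frame pointer holds this function's return
   address.  On x86 that word is the slot the proposed builtin would
   point at.  */
__attribute__ ((noinline))
int
stack_top_slot_holds_return_address (void)
{
  /* Using __builtin_frame_address forces a frame pointer here.  */
  void **fp = (void **) __builtin_frame_address (0);
  void *ra = __builtin_return_address (0);

  assert (fp != 0);
#if defined (__x86_64__) || defined (__i386__)
  /* Assumed x86 layout: fp[0] is the caller's saved frame pointer and
     fp[1] is the return address pushed by the call instruction.  */
  return fp[1] == ra;
#else
  /* The layout is not assumed on other targets; only check that a
     return address was reported.  */
  return ra != 0;
#endif
}
```

Calling stack_top_slot_holds_return_address () from main is expected to return 1 on x86 when built with GCC. The pr66960-5.c test in the patch makes the equivalent check through __builtin_stack_top, without needing the frame pointer.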
Although this function is generic, its implementation is target
specific. I submitted a generic patch:
https://gcc.gnu.org/ml/gcc-patches/2015-07/msg01859.html
So far there has been no interest from other backends. Here is a patch
to implement __builtin_stack_top in the x86 backend. We can update the
x86 backend after it is added to the middle-end. OK for trunk?
Thanks.
H.J.
--
gcc/
PR target/66960
* config/i386/i386.c (ix86_expand_prologue): Sorry if DRAP is
used and the stack address has been taken.
(ix86_builtins): Add IX86_BUILTIN_STACK_TOP.
(ix86_init_mmx_sse_builtins): Add __builtin_stack_top.
(ix86_expand_builtin): Handle IX86_BUILTIN_STACK_TOP.
* config/i386/i386.h (machine_function): Add stack_top_taken.
* doc/extend.texi: Document __builtin_stack_top.
gcc/testsuite/
PR target/66960
* gcc.target/i386/pr66960-1.c: New test.
* gcc.target/i386/pr66960-2.c: Likewise.
* gcc.target/i386/pr66960-3.c: Likewise.
* gcc.target/i386/pr66960-4.c: Likewise.
* gcc.target/i386/pr66960-5.c: Likewise.
---
gcc/config/i386/i386.c | 28 ++++++++++++++++++++++++++
gcc/config/i386/i386.h | 4 ++++
gcc/doc/extend.texi | 6 ++++++
gcc/testsuite/gcc.target/i386/pr66960-1.c | 33 +++++++++++++++++++++++++++++++
gcc/testsuite/gcc.target/i386/pr66960-2.c | 33 +++++++++++++++++++++++++++++++
gcc/testsuite/gcc.target/i386/pr66960-3.c | 17 ++++++++++++++++
gcc/testsuite/gcc.target/i386/pr66960-4.c | 21 ++++++++++++++++++++
gcc/testsuite/gcc.target/i386/pr66960-5.c | 21 ++++++++++++++++++++
8 files changed, 163 insertions(+)
create mode 100644 gcc/testsuite/gcc.target/i386/pr66960-1.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr66960-2.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr66960-3.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr66960-4.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr66960-5.c
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index ede8ea0..ef7ba6d 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -11575,6 +11575,12 @@ ix86_expand_prologue (void)
{
int align_bytes = crtl->stack_alignment_needed / BITS_PER_UNIT;
+ /* Can't use DRAP if the stack address has been taken. */
+ if (cfun->machine->stack_top_taken)
+ sorry ("%<__builtin_stack_top%> not supported with stack"
+ " realignment. This may be worked around by adding"
+ " -maccumulate-outgoing-args.");
+
/* Only need to push parameter pointer reg if it is caller saved. */
if (!call_used_regs[REGNO (crtl->drap_reg)])
{
@@ -30741,6 +30747,9 @@ enum ix86_builtins
IX86_BUILTIN_READ_FLAGS,
IX86_BUILTIN_WRITE_FLAGS,
+ /* Get the stack address when the function is called. */
+ IX86_BUILTIN_STACK_TOP,
+
IX86_BUILTIN_MAX
};
@@ -34353,6 +34362,10 @@ ix86_init_mmx_sse_builtins (void)
def_builtin (OPTION_MASK_ISA_MWAITX, "__builtin_ia32_mwaitx",
VOID_FTYPE_UNSIGNED_UNSIGNED_UNSIGNED, IX86_BUILTIN_MWAITX);
+ /* Get the stack address when the function is called. */
+ def_builtin (0, "__builtin_stack_top",
+ PVOID_FTYPE_VOID, IX86_BUILTIN_STACK_TOP);
+
/* Add FMA4 multi-arg argument instructions */
for (i = 0, d = bdesc_multi_arg; i < ARRAY_SIZE (bdesc_multi_arg); i++, d++)
{
@@ -40291,6 +40304,21 @@ addcarryx:
emit_insn (gen_xabort (op0));
return 0;
+ case IX86_BUILTIN_STACK_TOP:
+ cfun->machine->stack_top_taken = true;
+
+ if (!target
+ || GET_MODE (target) != Pmode
+ || !register_operand (target, Pmode))
+ target = gen_reg_rtx (Pmode);
+
+ /* After the prologue, stack top is at -WORD(AP) in the current
+ frame. */
+ emit_insn (gen_rtx_SET (target,
+ plus_constant (Pmode, arg_pointer_rtx,
+ -UNITS_PER_WORD)));
+ return target;
+
default:
break;
}
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 7bd23ec..3781a22 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -2492,6 +2492,10 @@ struct GTY(()) machine_function {
/* If true, it is safe to not save/restore DRAP register. */
BOOL_BITFIELD no_drap_save_restore : 1;
+ /* If true, the stack address of the current function has been
+ taken. */
+ BOOL_BITFIELD stack_top_taken : 1;
+
/* If true, there is register available for argument passing. This
is used only in ix86_function_ok_for_sibcall by 32-bit to determine
if there is scratch register available for indirect sibcall. In
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index b18d8fb..971b584 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -16661,6 +16661,12 @@ The following built-in function is always available.
@item void __builtin_ia32_pause (void)
Generates the @code{pause} machine instruction with a compiler memory
barrier.
+
+@item void *__builtin_stack_top (void)
+This function is similar to calling @code{__builtin_frame_address}
+with a value of @code{0}, but it returns the address of the top of
+the stack on entry to the function. Unlike
+@code{__builtin_frame_address}, it does not require the frame pointer.
@end table
The following floating-point built-in functions are made available in the
diff --git a/gcc/testsuite/gcc.target/i386/pr66960-1.c b/gcc/testsuite/gcc.target/i386/pr66960-1.c
new file mode 100644
index 0000000..aaab3cf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr66960-1.c
@@ -0,0 +1,33 @@
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-options "-O2 -fomit-frame-pointer" { target { lp64 } } } */
+/* { dg-options "-O2 -fomit-frame-pointer -maddress-mode=short" { target { x32 } } } */
+/* { dg-options "-O2 -fomit-frame-pointer -miamcu" { target { ia32 } } } */
+
+extern char **environ;
+extern void exit (int status);
+extern int main (long argc, char **argv, char **envp);
+
+void
+_start (void)
+{
+ void *argc_p = __builtin_stack_top ();
+ char **argv = (char **) (argc_p + sizeof (void *));
+ long argc = *(long *) argc_p;
+ int status;
+
+ environ = argv + argc + 1;
+
+ status = main (argc, argv, environ);
+
+ exit (status);
+}
+
+/* { dg-final { scan-assembler "movq\[ \t\]8\\(%rsp\\), %rdi" { target lp64 } } } */
+/* { dg-final { scan-assembler "leaq\[ \t\]16\\(%rsp\\), %rsi" { target lp64 } } } */
+/* { dg-final { scan-assembler "leaq\[ \t\]24\\(%rsp,%rdi,8\\), %rdx" { target lp64 } } } */
+/* { dg-final { scan-assembler "movl\[ \t\]8\\(%esp\\), %edi" { target x32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]12\\(%rsp\\), %esi" { target x32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]4\\(%rsi,%rdi,4\\), %edx" { target x32 } } } */
+/* { dg-final { scan-assembler "movl\[ \t\]\\(%esp\\), %eax" { target ia32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]4\\(%esp\\), %edx" { target ia32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]8\\(%esp,%eax,4\\), %ecx" { target ia32 } } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr66960-2.c b/gcc/testsuite/gcc.target/i386/pr66960-2.c
new file mode 100644
index 0000000..b9dbde2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr66960-2.c
@@ -0,0 +1,33 @@
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-options "-O2 -fno-omit-frame-pointer" { target { lp64 } } } */
+/* { dg-options "-O2 -fno-omit-frame-pointer -maddress-mode=short" { target { x32 } } } */
+/* { dg-options "-O2 -fno-omit-frame-pointer -miamcu" { target { ia32 } } } */
+
+extern char **environ;
+extern void exit (int status);
+extern int main (long argc, char **argv, char **envp);
+
+void
+_start (void)
+{
+ void *argc_p = __builtin_stack_top ();
+ char **argv = (char **) (argc_p + sizeof (void *));
+ long argc = *(long *) argc_p;
+ int status;
+
+ environ = argv + argc + 1;
+
+ status = main (argc, argv, environ);
+
+ exit (status);
+}
+
+/* { dg-final { scan-assembler "movq\[ \t\]8\\(%rbp\\), %rdi" { target lp64 } } } */
+/* { dg-final { scan-assembler "leaq\[ \t\]16\\(%rbp\\), %rsi" { target lp64 } } } */
+/* { dg-final { scan-assembler "leaq\[ \t\]24\\(%rbp,%rdi,8\\), %rdx" { target lp64 } } } */
+/* { dg-final { scan-assembler "movl\[ \t\]8\\(%ebp\\), %edi" { target x32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]12\\(%rbp\\), %esi" { target x32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]4\\(%rsi,%rdi,4\\), %edx" { target x32 } } } */
+/* { dg-final { scan-assembler "movl\[ \t\]4\\(%ebp\\), %eax" { target ia32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]8\\(%ebp\\), %edx" { target ia32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]12\\(%ebp,%eax,4\\), %ecx" { target ia32 } } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr66960-3.c b/gcc/testsuite/gcc.target/i386/pr66960-3.c
new file mode 100644
index 0000000..48cf25e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr66960-3.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-options "-O2 -mno-accumulate-outgoing-args" { target { lp64 } } } */
+/* { dg-options "-O2 -mno-accumulate-outgoing-args -maddress-mode=short" { target { x32 } } } */
+/* { dg-options "-O2 -mno-accumulate-outgoing-args -miamcu" { target { ia32 } } } */
+
+extern void abort (void);
+extern int check_int (int *i, int align);
+typedef int aligned __attribute__((aligned(64)));
+
+void *
+foo (void)
+{
+ aligned j;
+ if (check_int (&j, __alignof__(j)) != j)
+ abort ();
+ return __builtin_stack_top ();
+} /* { dg-message "sorry, unimplemented: .__builtin_stack_top. not supported" } */
diff --git a/gcc/testsuite/gcc.target/i386/pr66960-4.c b/gcc/testsuite/gcc.target/i386/pr66960-4.c
new file mode 100644
index 0000000..44c0b26
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr66960-4.c
@@ -0,0 +1,21 @@
+/* { dg-do compile { target *-*-linux* } } */
+/* { dg-options "-O2 -maccumulate-outgoing-args" { target { lp64 } } } */
+/* { dg-options "-O2 -maccumulate-outgoing-args -maddress-mode=short" { target { x32 } } } */
+/* { dg-options "-O2 -maccumulate-outgoing-args -miamcu" { target { ia32 } } } */
+
+extern void abort (void);
+extern int check_int (int *i, int align);
+typedef int aligned __attribute__((aligned(64)));
+
+void *
+foo (void)
+{
+ aligned j;
+ if (check_int (&j, __alignof__(j)) != j)
+ abort ();
+ return __builtin_stack_top ();
+}
+
+/* { dg-final { scan-assembler "leaq\[ \t\]8\\(%rbp\\), %rax" { target lp64 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]8\\(%rbp\\), %eax" { target x32 } } } */
+/* { dg-final { scan-assembler "leal\[ \t\]4\\(%ebp\\), %eax" { target ia32 } } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr66960-5.c b/gcc/testsuite/gcc.target/i386/pr66960-5.c
new file mode 100644
index 0000000..d449437
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr66960-5.c
@@ -0,0 +1,21 @@
+/* { dg-do link } */
+/* { dg-options "-O" } */
+
+extern void link_error (void);
+
+__attribute__ ((noinline, noclone))
+void
+foo (void)
+{
+ void **p = __builtin_stack_top ();
+ void *ra = __builtin_return_address (0);
+ if (*p != ra)
+ link_error ();
+}
+
+int
+main (void)
+{
+ foo ();
+ return 0;
+}
--
2.4.3
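The _start tests in pr66960-1.c and pr66960-2.c depend on the layout of the process entry stack: argc first, then the argv pointers, a null terminator, and then the environment pointers. The sketch below applies the same pointer arithmetic to an ordinary array, so it can run without replacing _start. The names decode_entry_stack, struct entry_view and demo_entry_layout are hypothetical, and the array only imitates the layout the tests assume.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical view of the values that the _start tests compute.  */
struct entry_view
{
  long argc;
  char **argv;
  char **envp;
};

/* Decode a block laid out like the entry stack, using the same
   arithmetic as the tests: argv starts one pointer-sized word after
   argc, and envp starts after the argv null terminator.  */
struct entry_view
decode_entry_stack (void *argc_p)
{
  struct entry_view v;

  memcpy (&v.argc, argc_p, sizeof v.argc);
  v.argv = (char **) ((char *) argc_p + sizeof (void *));
  v.envp = v.argv + v.argc + 1;
  return v;
}

/* Build a fake entry stack and return 1 if it decodes as expected.  */
int
demo_entry_layout (void)
{
  static char prog[] = "prog", arg[] = "arg", env[] = "HOME=/";
  /* Slot 0 is reserved for argc and filled in below.  */
  char *fake[6] = { 0, prog, arg, 0, env, 0 };
  long argc = 2;
  struct entry_view v;

  /* The argc slot is one pointer-sized word, so a long must fit.  */
  assert (sizeof (long) <= sizeof (void *));
  memcpy (&fake[0], &argc, sizeof argc);

  v = decode_entry_stack (fake);
  return v.argc == 2 && v.argv[0] == prog && v.argv[1] == arg
	 && v.argv[2] == 0 && v.envp[0] == env && v.envp[1] == 0;
}
```

In the tests, argc_p comes from __builtin_stack_top () rather than from an array, and the dg-final patterns check that the resulting loads are addressed from %rsp or %rbp as expected for each target.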
* Re: [PATCH] Add __builtin_stack_top to x86 backend
2015-07-30 19:44 ` [PATCH] Add __builtin_stack_top to x86 backend H.J. Lu
@ 2015-08-03 8:31 ` Uros Bizjak
0 siblings, 0 replies; 7+ messages in thread
From: Uros Bizjak @ 2015-08-03 8:31 UTC (permalink / raw)
To: H.J. Lu; +Cc: gcc-patches, Segher Boessenkool
On Thu, Jul 30, 2015 at 8:41 PM, H.J. Lu <hongjiu.lu@intel.com> wrote:
> On Tue, Jul 21, 2015 at 02:45:39PM -0700, H.J. Lu wrote:
>> When __builtin_frame_address is used to retrieve the address of the
>> function stack frame, the frame pointer is always kept, which wastes one
>> register and 2 instructions. For x86-32, one less register means
>> significant negative impact on performance. This patch adds a new
>> builtin function, __builtin_ia32_stack_top, to x86 backend. It
>> returns the stack address when the function is called.
>>
>> Any comments, feedbacks?
>>
>
> Although this function is generic, its implementation is target
> specific. I submitted a generic patch:
>
> https://gcc.gnu.org/ml/gcc-patches/2015-07/msg01859.html
>
> So far there has been no interest from other backends. Here is a patch
> to implement __builtin_stack_top in the x86 backend. We can update the
> x86 backend after it is added to the middle-end. OK for trunk?
I think that the discussion about the generic implementation should
come to some conclusion first. From the discussion, there was no
resolution on which way to go.
Uros.
Thread overview: 7+ messages
2015-07-21 22:56 RFC: [PATCH] Add __builtin_ia32_stack_top H.J. Lu
2015-07-22 12:21 ` H.J. Lu
2015-07-22 13:59 ` Segher Boessenkool
2015-07-22 14:14 ` H.J. Lu
2015-07-22 16:01 ` H.J. Lu
2015-07-30 19:44 ` [PATCH] Add __builtin_stack_top to x86 backend H.J. Lu
2015-08-03 8:31 ` Uros Bizjak