From: Eric Botcazou <ebotcazou@adacore.com>
To: Ian Lance Taylor
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [Patch] New -fstack-check implementation (2/n)
Date: Thu, 29 Oct 2009 17:18:00 -0000
Message-Id: <200910291747.57447.ebotcazou@adacore.com>
In-Reply-To: <200909300112.32829.ebotcazou@adacore.com>
References: <200908041337.11922.ebotcazou@adacore.com>
 <200909300112.32829.ebotcazou@adacore.com>

> OK, I've removed all the non-essential bits from the patch and left some
> broken stuff in the compiler as-is.  This implements working stack checking
> for x86/x86-64, Linux and Solaris only.  Full ACATS passes with
> -fstack-check as well as with -O2 -fstack-check.

Updated patch; the Ada bits have been installed independently in the
meantime.  Re-tested on i586-suse-linux, OK for mainline?
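To make the scheme easier to follow, here is a rough C rendition of what the
emitted probing sequence does.  This is only an illustrative sketch, not code
from the patch: the function name and the fixed 4096-byte PROBE_INTERVAL
below are assumptions for the example (the real interval is
2**STACK_CHECK_PROBE_INTERVAL_EXP and the real code is emitted as RTL or
inline assembly).

/* Illustrative sketch only, not part of the patch: probing a frame of SIZE
   bytes starting FIRST bytes below the stack pointer, on a machine whose
   stack grows downward.  PROBE_INTERVAL must not exceed the guard area.  */

#include <stddef.h>

#define PROBE_INTERVAL 4096  /* assumed: 2**STACK_CHECK_PROBE_INTERVAL_EXP */

void
probe_stack_range_sketch (volatile char *sp, size_t first, size_t size)
{
  size_t i;

  /* Probe at FIRST + N * PROBE_INTERVAL for N = 1, 2, ... until the offset
     exceeds SIZE, then probe at FIRST + SIZE.  Touching one word in every
     page guarantees that an overflow faults on the guard page instead of
     silently jumping past it.  */
  for (i = PROBE_INTERVAL; i < size; i += PROBE_INTERVAL)
    sp[-(ptrdiff_t) (first + i)] = 0;

  sp[-(ptrdiff_t) (first + size)] = 0;
}

The STACK_CHECK_MOVING_SP variant does the same page-by-page touching but by
moving the stack pointer itself, which is why the i386 bits below also need
the frame pointer and the unwinder adjustments.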
2009-10-29  Eric Botcazou  <ebotcazou@adacore.com>

	PR target/10127
	PR ada/20548
	* expr.h (STACK_CHECK_PROBE_INTERVAL): Delete.
	(STACK_CHECK_PROBE_INTERVAL_EXP): New macro.
	(STACK_CHECK_MOVING_SP): Likewise.
	* system.h (STACK_CHECK_PROBE_INTERVAL): Poison it.
	* doc/tm.texi (Stack Checking): Delete STACK_CHECK_PROBE_INTERVAL.
	Document STACK_CHECK_PROBE_INTERVAL_EXP and STACK_CHECK_MOVING_SP.
	* explow.c (anti_adjust_stack_and_probe): New function.
	(allocate_dynamic_stack_space): Do not directly allocate space if
	STACK_CHECK_MOVING_SP, instead invoke above function.
	(emit_stack_probe): Handle probe_stack insn.
	(PROBE_INTERVAL): New macro.
	(STACK_GROW_OPTAB): Likewise.
	(STACK_HIGH, STACK_LOW): Likewise.
	(probe_stack_range): Remove support code for dedicated pattern.
	Fix loop condition in the small constant case.  Rewrite in the
	general case to be immune to wraparounds.  Make sure the address
	of probes is valid.  Try to use [base + disp] addressing mode
	if possible.
	* ira.c (setup_eliminable_regset): Set frame_pointer_needed if
	stack checking is enabled and STACK_CHECK_MOVING_SP.
	* rtlanal.c (may_trap_p_1) <MEM>: If stack checking is enabled,
	return 1 for volatile references to the stack pointer.
	* tree.c (build_common_builtin_nodes): Do not set ECF_NOTHROW on
	__builtin_alloca if stack checking is enabled.
	* unwind-dw2.c (uw_identify_context): Take into account whether
	the context is that of a signal frame or not.
	* config/i386/linux-unwind.h (x86_frob_update_context): New function.
	(MD_FROB_UPDATE_CONTEXT): Define.
	* config/i386/linux.h (STACK_CHECK_STATIC_BUILTIN): Likewise.
	(STACK_CHECK_MOVING_SP): Likewise.
	* config/i386/linux64.h (STACK_CHECK_STATIC_BUILTIN): Likewise.
	(STACK_CHECK_MOVING_SP): Likewise.
	* config/i386/sol2.h (STACK_CHECK_STATIC_BUILTIN): Likewise.
	* config/i386/i386.c (ix86_compute_frame_layout): Force use of push
	instructions to save registers if stack checking with probes is on.
	(get_scratch_register_on_entry): New function.
	(release_scratch_register_on_entry): Likewise.
	(output_probe_op): Likewise.
	(output_adjust_stack_and_probe_op): Likewise.
	(output_adjust_stack_and_probe): Likewise.
	(ix86_gen_adjust_stack_and_probe): Likewise.
	(ix86_adjust_stack_and_probe): Likewise.
	(output_probe_stack_range_op): Likewise.
	(ix86_gen_probe_stack_range): Likewise.
	(ix86_emit_probe_stack_range): Likewise.
	(ix86_expand_prologue): Emit stack checking code if static builtin
	stack checking is enabled.
	* config/i386/i386-protos.h (output_adjust_stack_and_probe): Declare.
	(output_probe_stack_range): Likewise.
	* config/i386/i386.md (UNSPECV_STACK_PROBE_INLINE): New constant.
	(probe_stack): New expander.
	(adjust_stack_and_probe): New insn.
	(probe_stack_range): Likewise.
	(logical operation peepholes): Do not split stack checking probes.

--
Eric Botcazou

Content-Type: text/x-diff; name="gcc-45_stack-check-2c.diff"
Content-Disposition: attachment; filename="gcc-45_stack-check-2c.diff"

Index: doc/tm.texi
===================================================================
--- doc/tm.texi (revision 153694)
+++ doc/tm.texi (working copy)
@@ -3542,11 +3542,12 @@ like to do static stack checking in some
 approach.  The default value of this macro is zero.
 @end defmac
 
-@defmac STACK_CHECK_PROBE_INTERVAL
-An integer representing the interval at which GCC must generate stack
-probe instructions.  You will normally define this macro to be no larger
-than the size of the ``guard pages'' at the end of a stack area.  The
-default value of 4096 is suitable for most systems.
+@defmac STACK_CHECK_PROBE_INTERVAL_EXP
+An integer specifying the interval at which GCC must generate stack probe
+instructions, defined as 2 raised to this integer.  You will normally
+define this macro so that the interval be no larger than the size of
+the ``guard pages'' at the end of a stack area.  The default value
+of 12 (4096-byte interval) is suitable for most systems.
 @end defmac
 
 @defmac STACK_CHECK_PROBE_LOAD
@@ -3555,6 +3556,15 @@ as a load instruction and zero if GCC sh
 The default is zero, which is the most efficient choice on most systems.
 @end defmac
 
+@defmac STACK_CHECK_MOVING_SP
+An integer which is nonzero if GCC should move the stack pointer page by page
+when doing probes.  This can be necessary on systems where the stack pointer
+contains the bottom address of the memory area accessible to the executing
+thread at any point in time.  In this situation an alternate signal stack
+is required in order to be able to recover from a stack overflow.  The
+default value of this macro is zero.
+@end defmac + @defmac STACK_CHECK_PROTECT The number of bytes of stack needed to recover from a stack overflow, for languages where such a recovery is supported. The default value of Index: tree.c =================================================================== --- tree.c (revision 153694) +++ tree.c (working copy) @@ -9009,7 +9009,8 @@ build_common_builtin_nodes (void) tmp = tree_cons (NULL_TREE, size_type_node, void_list_node); ftype = build_function_type (ptr_type_node, tmp); local_define_builtin ("__builtin_alloca", ftype, BUILT_IN_ALLOCA, - "alloca", ECF_NOTHROW | ECF_MALLOC); + "alloca", + ECF_MALLOC | (flag_stack_check ? 0 : ECF_NOTHROW)); } tmp = tree_cons (NULL_TREE, ptr_type_node, void_list_node); Index: rtlanal.c =================================================================== --- rtlanal.c (revision 153694) +++ rtlanal.c (working copy) @@ -2252,6 +2252,11 @@ may_trap_p_1 (const_rtx x, unsigned flag /* Memory ref can trap unless it's a static var or a stack slot. */ case MEM: + /* Recognize specific pattern of stack checking probes. */ + if (flag_stack_check + && MEM_VOLATILE_P (x) + && XEXP (x, 0) == stack_pointer_rtx) + return 1; if (/* MEM_NOTRAP_P only relates to the actual position of the memory reference; moving it out of context such as when moving code when optimizing, might cause its address to become invalid. */ Index: expr.h =================================================================== --- expr.h (revision 153694) +++ expr.h (working copy) @@ -218,9 +218,9 @@ do { \ #define STACK_CHECK_STATIC_BUILTIN 0 #endif -/* The default interval is one page. */ -#ifndef STACK_CHECK_PROBE_INTERVAL -#define STACK_CHECK_PROBE_INTERVAL 4096 +/* The default interval is one page (4096 bytes). */ +#ifndef STACK_CHECK_PROBE_INTERVAL_EXP +#define STACK_CHECK_PROBE_INTERVAL_EXP 12 #endif /* The default is to do a store into the stack. */ @@ -228,6 +228,11 @@ do { \ #define STACK_CHECK_PROBE_LOAD 0 #endif +/* The default is not to move the stack pointer. */ +#ifndef STACK_CHECK_MOVING_SP +#define STACK_CHECK_MOVING_SP 0 +#endif + /* This is a kludge to try to capture the discrepancy between the old mechanism (generic stack checking) and the new mechanism (static builtin stack checking). STACK_CHECK_PROTECT needs to be bumped @@ -252,7 +257,7 @@ do { \ one probe per function. */ #ifndef STACK_CHECK_MAX_FRAME_SIZE #define STACK_CHECK_MAX_FRAME_SIZE \ - (STACK_CHECK_PROBE_INTERVAL - UNITS_PER_WORD) + ((1 << STACK_CHECK_PROBE_INTERVAL_EXP) - UNITS_PER_WORD) #endif /* This is arbitrary, but should be large enough everywhere. */ Index: unwind-dw2.c =================================================================== --- unwind-dw2.c (revision 153694) +++ unwind-dw2.c (working copy) @@ -1559,7 +1559,13 @@ uw_install_context_1 (struct _Unwind_Con static inline _Unwind_Ptr uw_identify_context (struct _Unwind_Context *context) { - return _Unwind_GetCFA (context); + /* The CFA is not sufficient to disambiguate the context of a function + interrupted by a signal before establishing its frame and the context + of the signal itself. */ + if (STACK_GROWS_DOWNWARD) + return _Unwind_GetCFA (context) - _Unwind_IsSignalFrame (context); + else + return _Unwind_GetCFA (context) + _Unwind_IsSignalFrame (context); } Index: explow.c =================================================================== --- explow.c (revision 153694) +++ explow.c (working copy) @@ -43,6 +43,7 @@ along with GCC; see the file COPYING3. 
static rtx break_out_memory_refs (rtx); static void emit_stack_probe (rtx); +static void anti_adjust_stack_and_probe (rtx); /* Truncate and perhaps sign-extend C as appropriate for MODE. */ @@ -1235,7 +1236,9 @@ allocate_dynamic_stack_space (rtx size, /* If needed, check that we have the required amount of stack. Take into account what has already been checked. */ - if (flag_stack_check == GENERIC_STACK_CHECK) + if (STACK_CHECK_MOVING_SP) + ; + else if (flag_stack_check == GENERIC_STACK_CHECK) probe_stack_range (STACK_OLD_CHECK_PROTECT + STACK_CHECK_MAX_FRAME_SIZE, size); else if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK) @@ -1304,7 +1307,10 @@ allocate_dynamic_stack_space (rtx size, emit_label (space_available); } - anti_adjust_stack (size); + if (flag_stack_check && STACK_CHECK_MOVING_SP) + anti_adjust_stack_and_probe (size); + else + anti_adjust_stack (size); #ifdef STACK_GROWS_DOWNWARD emit_move_insn (target, virtual_stack_dynamic_rtx); @@ -1355,6 +1361,12 @@ emit_stack_probe (rtx address) MEM_VOLATILE_P (memref) = 1; + /* See if we have an insn to probe the stack. */ +#ifdef HAVE_probe_stack + if (HAVE_probe_stack) + emit_insn (gen_probe_stack (memref)); + else +#endif if (STACK_CHECK_PROBE_LOAD) emit_move_insn (gen_reg_rtx (word_mode), memref); else @@ -1367,10 +1379,18 @@ emit_stack_probe (rtx address) subtract from the stack. If SIZE is constant, this is done with a fixed number of probes. Otherwise, we must make a loop. */ +#define PROBE_INTERVAL (1 << STACK_CHECK_PROBE_INTERVAL_EXP) + #ifdef STACK_GROWS_DOWNWARD -#define STACK_GROW_OP MINUS +#define STACK_GROW_OP MINUS +#define STACK_GROW_OPTAB sub_optab +#define STACK_HIGH(high,low) low +#define STACK_LOW(high,low) high #else -#define STACK_GROW_OP PLUS +#define STACK_GROW_OP PLUS +#define STACK_GROW_OPTAB add_optab +#define STACK_HIGH(high,low) high +#define STACK_LOW(high,low) low #endif void @@ -1394,99 +1414,252 @@ probe_stack_range (HOST_WIDE_INT first, ptr_mode); } - /* Next see if we have an insn to check the stack. Use it if so. */ -#ifdef HAVE_check_stack - else if (HAVE_check_stack) + /* Otherwise we have to generate explicit probes. If we have a constant + small number of them to generate, that's the easy case. */ + else if (CONST_INT_P (size) && INTVAL (size) < 7 * PROBE_INTERVAL) + { + HOST_WIDE_INT i, offset, size_int = INTVAL (size); + rtx addr; + + /* Probe at FIRST + N * PROBE_INTERVAL for values of N from 1 until + it exceeds SIZE. If only one probe is needed, this will not + generate any code. Then probe at FIRST + SIZE. */ + for (i = PROBE_INTERVAL; i < size_int; i += PROBE_INTERVAL) + { + offset = first + i; +#ifdef STACK_GROWS_DOWNWARD + offset = -offset; +#endif + addr = memory_address (Pmode, + plus_constant (stack_pointer_rtx, offset)); + emit_stack_probe (addr); + } + + offset = first + size_int; +#ifdef STACK_GROWS_DOWNWARD + offset = -offset; +#endif + addr = memory_address (Pmode, plus_constant (stack_pointer_rtx, offset)); + emit_stack_probe (addr); + } + + /* In the variable case, do the same as above, but in a loop. Note that we + must be extra careful with variables wrapping around because we might be + at the very top (or the very bottom) of the address space and we have to + be able to handle this case properly; in particular, we use an equality + test for the loop condition. 
*/ + else { - insn_operand_predicate_fn pred; - rtx last_addr - = force_operand (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode, - stack_pointer_rtx, - plus_constant (size, first)), - NULL_RTX); + rtx rounded_size, rounded_size_op, test_addr, last_addr, temp; + rtx loop_lab = gen_label_rtx (); + rtx end_lab = gen_label_rtx (); - pred = insn_data[(int) CODE_FOR_check_stack].operand[0].predicate; - if (pred && ! ((*pred) (last_addr, Pmode))) - last_addr = copy_to_mode_reg (Pmode, last_addr); + /* Step 1: round SIZE to the previous multiple of the interval. */ - emit_insn (gen_check_stack (last_addr)); - } + /* ROUNDED_SIZE = SIZE & -PROBE_INTERVAL */ + rounded_size = simplify_gen_binary (AND, Pmode, + size, + GEN_INT (-PROBE_INTERVAL)); + rounded_size_op = force_operand (rounded_size, NULL_RTX); + + + /* Step 2: compute initial and final value of the loop counter. */ + + /* TEST_ADDR = SP + FIRST. */ + test_addr = force_operand (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode, + stack_pointer_rtx, + GEN_INT (first)), + NULL_RTX); + + /* LAST_ADDR = SP + FIRST + ROUNDED_SIZE. */ + last_addr = force_operand (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode, + test_addr, + rounded_size_op), + NULL_RTX); + + + /* Step 3: the loop + + while (TEST_ADDR != LAST_ADDR) + { + TEST_ADDR = TEST_ADDR + PROBE_INTERVAL + probe at TEST_ADDR + } + + probes at FIRST + N * PROBE_INTERVAL for values of N from 1 + until it is equal to ROUNDED_SIZE. */ + + emit_label (loop_lab); + + /* Jump to END_LAB if TEST_ADDR == LAST_ADDR. */ + emit_cmp_and_jump_insns (test_addr, last_addr, EQ, + NULL_RTX, Pmode, 1, end_lab); + + /* TEST_ADDR = TEST_ADDR + PROBE_INTERVAL. */ + temp = expand_binop (Pmode, STACK_GROW_OPTAB, test_addr, + GEN_INT (PROBE_INTERVAL), test_addr, + 1, OPTAB_WIDEN); + + gcc_assert (temp == test_addr); + + /* Probe at TEST_ADDR. */ + emit_stack_probe (test_addr); + + emit_jump (loop_lab); + + emit_label (end_lab); + + /* Step 4: probe at FIRST + SIZE if we cannot assert at compile-time + that SIZE is equal to ROUNDED_SIZE. */ + + /* TEMP = SIZE - ROUNDED_SIZE. */ + temp = simplify_gen_binary (MINUS, Pmode, size, rounded_size); + if (temp != const0_rtx) + { + rtx addr; + + if (GET_CODE (temp) == CONST_INT) + { + /* Use [base + disp} addressing mode if supported. */ + HOST_WIDE_INT offset = INTVAL (temp); +#ifdef STACK_GROWS_DOWNWARD + offset = -offset; #endif + addr = memory_address (Pmode, plus_constant (last_addr, offset)); + } + else + { + /* Manual CSE if the difference is not known at compile-time. */ + temp = gen_rtx_MINUS (Pmode, size, rounded_size_op); + addr = memory_address (Pmode, + gen_rtx_fmt_ee (STACK_GROW_OP, Pmode, + last_addr, temp)); + } - /* If we have to generate explicit probes, see if we have a constant - small number of them to generate. If so, that's the easy case. */ - else if (CONST_INT_P (size) - && INTVAL (size) < 10 * STACK_CHECK_PROBE_INTERVAL) - { - HOST_WIDE_INT offset; + emit_stack_probe (addr); + } + } +} + +/* Adjust the stack by SIZE bytes while probing it. Note that we skip + the probe for the first interval and instead probe one interval past + the specified size in order to maintain a protection area. */ + +static void +anti_adjust_stack_and_probe (rtx size) +{ + rtx probe_interval = GEN_INT (PROBE_INTERVAL); - /* Start probing at FIRST + N * STACK_CHECK_PROBE_INTERVAL - for values of N from 1 until it exceeds LAST. If only one - probe is needed, this will not generate any code. Then probe - at LAST. 
*/ - for (offset = first + STACK_CHECK_PROBE_INTERVAL; - offset < INTVAL (size); - offset = offset + STACK_CHECK_PROBE_INTERVAL) - emit_stack_probe (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode, - stack_pointer_rtx, - GEN_INT (offset))); + /* First ensure SIZE is Pmode. */ + if (GET_MODE (size) != VOIDmode && GET_MODE (size) != Pmode) + size = convert_to_mode (Pmode, size, 1); - emit_stack_probe (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode, - stack_pointer_rtx, - plus_constant (size, first))); + /* If we have a constant small number of probes to generate, that's the + easy case. */ + if (GET_CODE (size) == CONST_INT && INTVAL (size) < 7 * PROBE_INTERVAL) + { + HOST_WIDE_INT i, int_size = INTVAL (size); + bool first_probe = true; + + /* Adjust SP and probe to PROBE_INTERVAL + N * PROBE_INTERVAL for + values of N from 1 until it exceeds SIZE. If only one probe is + needed, this will not generate any code. Then adjust and probe + to PROBE_INTERVAL + SIZE. */ + for (i = PROBE_INTERVAL; i < int_size; i += PROBE_INTERVAL) + { + if (first_probe) + { + anti_adjust_stack (GEN_INT (2 * PROBE_INTERVAL)); + first_probe = false; + } + else + anti_adjust_stack (probe_interval); + emit_stack_probe (stack_pointer_rtx); + } + + if (first_probe) + anti_adjust_stack (plus_constant (size, PROBE_INTERVAL)); + else + anti_adjust_stack (plus_constant (size, PROBE_INTERVAL - i)); + emit_stack_probe (stack_pointer_rtx); } - /* In the variable case, do the same as above, but in a loop. We emit loop - notes so that loop optimization can be done. */ + /* In the variable case, do the same as above, but in a loop. Note that we + must be extra careful with variables wrapping around because we might be + at the very top (or the very bottom) of the address space and we have to + be able to handle this case properly; in particular, we use an equality + test for the loop condition. */ else { - rtx test_addr - = force_operand (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode, - stack_pointer_rtx, - GEN_INT (first + STACK_CHECK_PROBE_INTERVAL)), - NULL_RTX); - rtx last_addr - = force_operand (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode, - stack_pointer_rtx, - plus_constant (size, first)), - NULL_RTX); - rtx incr = GEN_INT (STACK_CHECK_PROBE_INTERVAL); + rtx rounded_size, rounded_size_op, last_addr, temp; rtx loop_lab = gen_label_rtx (); - rtx test_lab = gen_label_rtx (); rtx end_lab = gen_label_rtx (); - rtx temp; - if (!REG_P (test_addr) - || REGNO (test_addr) < FIRST_PSEUDO_REGISTER) - test_addr = force_reg (Pmode, test_addr); + /* Step 1: round SIZE to the previous multiple of the interval. */ + + /* ROUNDED_SIZE = SIZE & -PROBE_INTERVAL */ + rounded_size = simplify_gen_binary (AND, Pmode, + size, + GEN_INT (-PROBE_INTERVAL)); + rounded_size_op = force_operand (rounded_size, NULL_RTX); - emit_jump (test_lab); + + /* Step 2: compute initial and final value of the loop counter. */ + + /* SP = SP_0 + PROBE_INTERVAL. */ + anti_adjust_stack (probe_interval); + + /* LAST_ADDR = SP_0 + PROBE_INTERVAL + ROUNDED_SIZE. */ + last_addr = force_operand (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode, + stack_pointer_rtx, + rounded_size_op), + NULL_RTX); + + + /* Step 3: the loop + + while (SP != LAST_ADDR) + { + SP = SP + PROBE_INTERVAL + probe at SP + } + + adjusts SP and probes to PROBE_INTERVAL + N * PROBE_INTERVAL for + values of N from 1 until it is equal to ROUNDED_SIZE. 
*/ emit_label (loop_lab); - emit_stack_probe (test_addr); -#ifdef STACK_GROWS_DOWNWARD -#define CMP_OPCODE GTU - temp = expand_binop (Pmode, sub_optab, test_addr, incr, test_addr, - 1, OPTAB_WIDEN); -#else -#define CMP_OPCODE LTU - temp = expand_binop (Pmode, add_optab, test_addr, incr, test_addr, - 1, OPTAB_WIDEN); -#endif + /* Jump to END_LAB if SP == LAST_ADDR. */ + emit_cmp_and_jump_insns (stack_pointer_rtx, last_addr, EQ, + NULL_RTX, Pmode, 1, end_lab); + + /* SP = SP + PROBE_INTERVAL and probe at SP. */ + anti_adjust_stack (probe_interval); + emit_stack_probe (stack_pointer_rtx); - gcc_assert (temp == test_addr); + emit_jump (loop_lab); - emit_label (test_lab); - emit_cmp_and_jump_insns (test_addr, last_addr, CMP_OPCODE, - NULL_RTX, Pmode, 1, loop_lab); - emit_jump (end_lab); emit_label (end_lab); - emit_stack_probe (last_addr); + /* Step 4: adjust SP and probe to PROBE_INTERVAL + SIZE if we cannot + assert at compile-time that SIZE is equal to ROUNDED_SIZE. */ + + /* TEMP = SIZE - ROUNDED_SIZE. */ + temp = simplify_gen_binary (MINUS, Pmode, size, rounded_size); + if (temp != const0_rtx) + { + /* Manual CSE if the difference is not known at compile-time. */ + if (GET_CODE (temp) != CONST_INT) + temp = gen_rtx_MINUS (Pmode, size, rounded_size_op); + anti_adjust_stack (temp); + emit_stack_probe (stack_pointer_rtx); + } } + + /* Adjust back to account for the additional first interval. */ + adjust_stack (probe_interval); } - + /* Return an rtx representing the register or memory location in which a scalar value of data type VALTYPE was returned by a function call to function FUNC. Index: ira.c =================================================================== --- ira.c (revision 153694) +++ ira.c (working copy) @@ -1442,6 +1442,9 @@ ira_setup_eliminable_regset (void) int need_fp = (! flag_omit_frame_pointer || (cfun->calls_alloca && EXIT_IGNORE_STACK) + /* We need the frame pointer to catch stack overflow exceptions + if the stack pointer is moving. */ + || (flag_stack_check && STACK_CHECK_MOVING_SP) || crtl->accesses_prior_frames || crtl->stack_realign_needed || targetm.frame_pointer_required ()); Index: system.h =================================================================== --- system.h (revision 153694) +++ system.h (working copy) @@ -761,7 +761,7 @@ extern void fancy_abort (const char *, i TARGET_ASM_EXCEPTION_SECTION TARGET_ASM_EH_FRAME_SECTION \ SMALL_ARG_MAX ASM_OUTPUT_SHARED_BSS ASM_OUTPUT_SHARED_COMMON \ ASM_OUTPUT_SHARED_LOCAL UNALIGNED_WORD_ASM_OP \ - ASM_MAKE_LABEL_LINKONCE + ASM_MAKE_LABEL_LINKONCE STACK_CHECK_PROBE_INTERVAL /* Hooks that are no longer used. */ #pragma GCC poison LANG_HOOKS_FUNCTION_MARK LANG_HOOKS_FUNCTION_FREE \ Index: config/i386/linux.h =================================================================== --- config/i386/linux.h (revision 153694) +++ config/i386/linux.h (working copy) @@ -207,6 +207,12 @@ along with GCC; see the file COPYING3. #define MD_UNWIND_SUPPORT "config/i386/linux-unwind.h" +/* Static stack checking is supported by means of probes. */ +#define STACK_CHECK_STATIC_BUILTIN 1 + +/* The stack pointer needs to be moved while checking the stack. */ +#define STACK_CHECK_MOVING_SP 1 + /* This macro may be overridden in i386/k*bsd-gnu.h. 
*/ #define REG_NAME(reg) reg Index: config/i386/i386.md =================================================================== --- config/i386/i386.md (revision 153694) +++ config/i386/i386.md (working copy) @@ -223,15 +223,16 @@ (define_constants (define_constants [(UNSPECV_BLOCKAGE 0) (UNSPECV_STACK_PROBE 1) - (UNSPECV_EMMS 2) - (UNSPECV_LDMXCSR 3) - (UNSPECV_STMXCSR 4) - (UNSPECV_FEMMS 5) - (UNSPECV_CLFLUSH 6) - (UNSPECV_ALIGN 7) - (UNSPECV_MONITOR 8) - (UNSPECV_MWAIT 9) - (UNSPECV_CMPXCHG 10) + (UNSPECV_STACK_PROBE_INLINE 2) + (UNSPECV_EMMS 3) + (UNSPECV_LDMXCSR 4) + (UNSPECV_STMXCSR 5) + (UNSPECV_FEMMS 6) + (UNSPECV_CLFLUSH 7) + (UNSPECV_ALIGN 8) + (UNSPECV_MONITOR 9) + (UNSPECV_MWAIT 10) + (UNSPECV_CMPXCHG 11) (UNSPECV_XCHG 12) (UNSPECV_LOCK 13) (UNSPECV_PROLOGUE_USE 14) @@ -19961,6 +19962,38 @@ (define_expand "allocate_stack" DONE; }) +(define_expand "probe_stack" + [(match_operand 0 "memory_operand" "")] + "" +{ + if (GET_MODE (operands[0]) == DImode) + emit_insn (gen_iordi3 (operands[0], operands[0], const0_rtx)); + else + emit_insn (gen_iorsi3 (operands[0], operands[0], const0_rtx)); + DONE; +}) + +(define_insn "adjust_stack_and_probe" + [(unspec_volatile:P [(match_operand:P 0 "const_int_operand" "n")] + UNSPECV_STACK_PROBE_INLINE) + (set (reg:P SP_REG) (minus:P (reg:P SP_REG) (match_dup 0))) + (clobber (match_operand:P 1 "general_operand" "=rn")) + (clobber (reg:CC FLAGS_REG)) + (clobber (mem:BLK (scratch)))] + "" + "* return output_adjust_stack_and_probe (operands[0], operands[1]);" + [(set_attr "type" "multi")]) + +(define_insn "probe_stack_range" + [(unspec_volatile:P [(match_operand:P 0 "const_int_operand" "n") + (match_operand:P 1 "const_int_operand" "n")] + UNSPECV_STACK_PROBE_INLINE) + (clobber (match_operand:P 2 "general_operand" "=rn")) + (clobber (reg:CC FLAGS_REG))] + "" + "* return output_probe_stack_range (operands[0], operands[1], operands[2]);" + [(set_attr "type" "multi")]) + (define_expand "builtin_setjmp_receiver" [(label_ref (match_operand 0 "" ""))] "!TARGET_64BIT && flag_pic" @@ -20464,7 +20497,9 @@ (define_peephole2 [(match_dup 0) (match_operand:SI 1 "nonmemory_operand" "")])) (clobber (reg:CC FLAGS_REG))])] - "optimize_insn_for_speed_p () && ! TARGET_READ_MODIFY_WRITE" + "optimize_insn_for_speed_p () && ! TARGET_READ_MODIFY_WRITE + /* Do not split stack checking probes. */ + && GET_CODE (operands[3]) != IOR && operands[1] != const0_rtx" [(set (match_dup 2) (match_dup 0)) (parallel [(set (match_dup 2) (match_op_dup 3 [(match_dup 2) (match_dup 1)])) @@ -20479,7 +20514,9 @@ (define_peephole2 [(match_operand:SI 1 "nonmemory_operand" "") (match_dup 0)])) (clobber (reg:CC FLAGS_REG))])] - "optimize_insn_for_speed_p () && ! TARGET_READ_MODIFY_WRITE" + "optimize_insn_for_speed_p () && ! TARGET_READ_MODIFY_WRITE + /* Do not split stack checking probes. */ + && GET_CODE (operands[3]) != IOR && operands[1] != const0_rtx" [(set (match_dup 2) (match_dup 0)) (parallel [(set (match_dup 2) (match_op_dup 3 [(match_dup 1) (match_dup 2)])) Index: config/i386/sol2.h =================================================================== --- config/i386/sol2.h (revision 153694) +++ config/i386/sol2.h (working copy) @@ -113,6 +113,9 @@ along with GCC; see the file COPYING3. #undef X86_FILE_START_VERSION_DIRECTIVE #define X86_FILE_START_VERSION_DIRECTIVE false +/* Static stack checking is supported by means of probes. */ +#define STACK_CHECK_STATIC_BUILTIN 1 + /* Only recent versions of Solaris 11 ld properly support hidden .gnu.linkonce sections, so don't use them. 
*/ #ifndef TARGET_GNU_LD Index: config/i386/linux64.h =================================================================== --- config/i386/linux64.h (revision 153694) +++ config/i386/linux64.h (working copy) @@ -110,6 +110,12 @@ see the files COPYING3 and COPYING.RUNTI #define MD_UNWIND_SUPPORT "config/i386/linux-unwind.h" +/* Static stack checking is supported by means of probes. */ +#define STACK_CHECK_STATIC_BUILTIN 1 + +/* The stack pointer needs to be moved while checking the stack. */ +#define STACK_CHECK_MOVING_SP 1 + /* This macro may be overridden in i386/k*bsd-gnu.h. */ #define REG_NAME(reg) reg Index: config/i386/linux-unwind.h =================================================================== --- config/i386/linux-unwind.h (revision 153694) +++ config/i386/linux-unwind.h (working copy) @@ -172,6 +172,25 @@ x86_fallback_frame_state (struct _Unwind fs->signal_frame = 1; return _URC_NO_REASON; } + +#define MD_FROB_UPDATE_CONTEXT x86_frob_update_context + +/* Fix up for kernels that have vDSO, but don't have S flag in it. */ + +static void +x86_frob_update_context (struct _Unwind_Context *context, + _Unwind_FrameState *fs ATTRIBUTE_UNUSED) +{ + unsigned char *pc = context->ra; + + /* movl $__NR_rt_sigreturn,%eax ; {int $0x80 | syscall} */ + if (*(unsigned char *)(pc+0) == 0xb8 + && *(unsigned int *)(pc+1) == 173 + && (*(unsigned short *)(pc+5) == 0x80cd + || *(unsigned short *)(pc+5) == 0x050f)) + _Unwind_SetSignalFrame (context, 1); +} + #endif /* not glibc 2.0 */ #endif /* ifdef __x86_64__ */ #endif /* ifdef inhibit_libc */ Index: config/i386/i386-protos.h =================================================================== --- config/i386/i386-protos.h (revision 153694) +++ config/i386/i386-protos.h (working copy) @@ -69,6 +69,8 @@ extern const char *output_387_binary_op extern const char *output_387_reg_move (rtx, rtx*); extern const char *output_fix_trunc (rtx, rtx*, int); extern const char *output_fp_compare (rtx, rtx*, int, int); +extern const char *output_adjust_stack_and_probe (rtx, rtx); +extern const char *output_probe_stack_range (rtx, rtx, rtx); extern void ix86_expand_clear (rtx); extern void ix86_expand_move (enum machine_mode, rtx[]); Index: config/i386/i386.c =================================================================== --- config/i386/i386.c (revision 153694) +++ config/i386/i386.c (working copy) @@ -7899,6 +7899,11 @@ ix86_compute_frame_layout (struct ix86_f else frame->save_regs_using_mov = false; + /* If static stack checking is enabled and done with probes, the registers + need to be saved before allocating the frame. */ + if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK) + frame->save_regs_using_mov = false; + /* Skip return address. */ offset = UNITS_PER_WORD; @@ -8277,6 +8282,408 @@ ix86_internal_arg_pointer (void) return virtual_incoming_args_rtx; } +struct scratch_reg { + rtx reg; + bool saved; +}; + +/* Return a short-lived scratch register for use on function entry. + In 32-bit mode, it is valid only after the registers are saved + in the prologue. This register must be released by means of + release_scratch_register_on_entry once it is dead. 
*/ + +static void +get_scratch_register_on_entry (struct scratch_reg *sr) +{ + int regno; + + sr->saved = false; + + if (TARGET_64BIT) + regno = FIRST_REX_INT_REG + 3; /* r11 */ + else + { + tree decl = current_function_decl, fntype = TREE_TYPE (decl); + bool fastcall_p + = lookup_attribute ("fastcall", TYPE_ATTRIBUTES (fntype)) != NULL_TREE; + int regparm = ix86_function_regparm (fntype, decl); + int drap_regno + = crtl->drap_reg ? REGNO (crtl->drap_reg) : INVALID_REGNUM; + + /* 'fastcall' sets regparm to 2 and uses ecx+edx. */ + if ((regparm < 1 || fastcall_p) && drap_regno != 0) + regno = 0; + else if (regparm < 2 && drap_regno != 1) + regno = 1; + else if (regparm < 3 && !fastcall_p && drap_regno != 2 + /* ecx is the static chain register. */ + && !DECL_STATIC_CHAIN (decl)) + regno = 2; + else if (ix86_save_reg (3, true)) + regno = 3; + else if (ix86_save_reg (4, true)) + regno = 4; + else if (ix86_save_reg (5, true)) + regno = 5; + else + { + regno = (drap_regno == 0 ? 1 : 0); + sr->saved = true; + } + } + + sr->reg = gen_rtx_REG (Pmode, regno); + if (sr->saved) + { + rtx insn = emit_insn (gen_push (sr->reg)); + RTX_FRAME_RELATED_P (insn) = 1; + } +} + +/* Release a scratch register obtained from the preceding function. */ + +static void +release_scratch_register_on_entry (struct scratch_reg *sr) +{ + if (sr->saved) + { + rtx insn, x; + + if (TARGET_64BIT) + insn = emit_insn (gen_popdi1 (sr->reg)); + else + insn = emit_insn (gen_popsi1 (sr->reg)); + RTX_FRAME_RELATED_P (insn) = 1; + + /* The RTX_FRAME_RELATED_P mechanism doesn't know about pop. */ + x = plus_constant (stack_pointer_rtx, UNITS_PER_WORD); + x = gen_rtx_SET (VOIDmode, stack_pointer_rtx, x); + add_reg_note (insn, REG_CFA_ADJUST_CFA, x); + } +} + +/* The run-time loop is made up of 8 insns in the generic case while this + compile-time loop is made up of n insns for n # of intervals. */ +#define PROBE_INTERVAL (1 << STACK_CHECK_PROBE_INTERVAL_EXP) +#define SMALL_INTERVAL(size) ((size) <= 8 * PROBE_INTERVAL) + +/* Output one probe. */ + +static inline void +output_probe_op (void) +{ + fputs (TARGET_64BIT ? "\torq\t$0, " : "\torl\t$0, ", asm_out_file); +} + +/* Adjust the stack by SIZE bytes and output one probe. */ + +static void +output_adjust_stack_and_probe_op (HOST_WIDE_INT size) +{ + fprintf (asm_out_file, "\tsub\t$"HOST_WIDE_INT_PRINT_DEC",", size); + print_reg (stack_pointer_rtx, 0, asm_out_file); + fputc ('\n', asm_out_file); + output_probe_op (); + fputc ('(', asm_out_file); + print_reg (stack_pointer_rtx, 0, asm_out_file); + fputs (")\n", asm_out_file); +} + +/* Adjust the stack by SIZE bytes while probing it. Note that we skip + the probe for the first interval and instead probe one interval past + the specified size in order to maintain a protection area. */ + +const char * +output_adjust_stack_and_probe (rtx size_rtx, rtx reg) +{ + static int labelno = 0; + HOST_WIDE_INT size = INTVAL (size_rtx); + HOST_WIDE_INT rounded_size; + char loop_lab[32], end_lab[32]; + + /* See if we have a constant small number of probes to generate. If so, + that's the easy case. */ + if (SMALL_INTERVAL (size)) + { + HOST_WIDE_INT i; + bool first_probe = true; + + /* Adjust SP and probe to PROBE_INTERVAL + N * PROBE_INTERVAL for + values of N from 1 until it exceeds SIZE. If only one probe is + needed, this will not generate any code. Then adjust and probe + to PROBE_INTERVAL + SIZE. 
*/ + for (i = PROBE_INTERVAL; i < size; i += PROBE_INTERVAL) + { + if (first_probe) + { + output_adjust_stack_and_probe_op (2 * PROBE_INTERVAL); + first_probe = false; + } + else + output_adjust_stack_and_probe_op (PROBE_INTERVAL); + } + + if (first_probe) + output_adjust_stack_and_probe_op (size + PROBE_INTERVAL); + else + output_adjust_stack_and_probe_op (size + PROBE_INTERVAL - i); + } + + /* In the variable case, do the same as above, but in a loop. Note that we + must be extra careful with variables wrapping around because we might be + at the very top (or the very bottom) of the address space and we have to + be able to handle this case properly; in particular, we use an equality + test for the loop condition. */ + else + { + /* Step 1: round SIZE to the previous multiple of the interval. */ + rounded_size = size & -PROBE_INTERVAL; + + + /* Step 2: compute initial and final value of the loop counter. */ + + /* SP = SP_0 + PROBE_INTERVAL. */ + fprintf (asm_out_file, "\tsub\t$%d, ", PROBE_INTERVAL); + print_reg (stack_pointer_rtx, 0, asm_out_file); + + /* LAST_ADDR = SP_0 + PROBE_INTERVAL + ROUNDED_SIZE. */ + fprintf (asm_out_file, "\n\tmov\t$-"HOST_WIDE_INT_PRINT_DEC", ", + rounded_size); + print_reg (reg, 0, asm_out_file); + fputs ("\n\tadd\t", asm_out_file); + print_reg (stack_pointer_rtx, 0, asm_out_file); + fputs (", ", asm_out_file); + print_reg (reg, 0, asm_out_file); + fputc ('\n', asm_out_file); + + + /* Step 3: the loop + + while (SP != LAST_ADDR) + { + SP = SP + PROBE_INTERVAL + probe at SP + } + + adjusts SP and probes to PROBE_INTERVAL + N * PROBE_INTERVAL for + values of N from 1 until it is equal to ROUNDED_SIZE. */ + + ASM_GENERATE_INTERNAL_LABEL (loop_lab, "LPSRL", labelno); + ASM_OUTPUT_LABEL (asm_out_file, loop_lab); + + /* Jump to END_LAB if SP == LAST_ADDR. */ + fputs ("\tcmp\t", asm_out_file); + print_reg (stack_pointer_rtx, 0, asm_out_file); + fputs (", ", asm_out_file); + print_reg (reg, 0, asm_out_file); + fputc ('\n', asm_out_file); + ASM_GENERATE_INTERNAL_LABEL (end_lab, "LPSRE", labelno++); + fputs ("\tje\t", asm_out_file); assemble_name (asm_out_file, end_lab); + fputc ('\n', asm_out_file); + + /* SP = SP + PROBE_INTERVAL and probe at SP. */ + output_adjust_stack_and_probe_op (PROBE_INTERVAL); + + fprintf (asm_out_file, "\tjmp\t"); assemble_name (asm_out_file, loop_lab); + fputc ('\n', asm_out_file); + + ASM_OUTPUT_LABEL (asm_out_file, end_lab); + + + /* Step 4: adjust SP and probe to PROBE_INTERVAL + SIZE if we cannot + assert at compile-time that SIZE is equal to ROUNDED_SIZE. */ + if (size != rounded_size) + output_adjust_stack_and_probe_op (size - rounded_size); + } + + /* Adjust back to account for the additional first interval. */ + fprintf (asm_out_file, "\tadd\t$%d, ", PROBE_INTERVAL); + print_reg (stack_pointer_rtx, 0, asm_out_file); + fputc ('\n', asm_out_file); + + return ""; +} + +/* Wrapper around gen_adjust_stack_and_probe. */ + +static rtx +ix86_gen_adjust_stack_and_probe (rtx op0, rtx op1) +{ + if (TARGET_64BIT) + return gen_adjust_stack_and_probedi (op0, op1); + else + return gen_adjust_stack_and_probesi (op0, op1); +} + +/* Emit code to adjust the stack by SIZE bytes while probing it. 
*/ + +static void +ix86_adjust_stack_and_probe (HOST_WIDE_INT size) +{ + rtx size_rtx = GEN_INT (size); + + if (SMALL_INTERVAL (size)) + emit_insn (ix86_gen_adjust_stack_and_probe (size_rtx, const0_rtx)); + else + { + struct scratch_reg sr; + get_scratch_register_on_entry (&sr); + emit_insn (ix86_gen_adjust_stack_and_probe (size_rtx, sr.reg)); + release_scratch_register_on_entry (&sr); + } + + gcc_assert (ix86_cfa_state->reg != stack_pointer_rtx); + + /* Make sure nothing is scheduled before we are done. */ + emit_insn (gen_blockage ()); +} + +/* Output one probe at OFFSET + INDEX from the current stack pointer. */ + +static void +output_probe_stack_range_op (HOST_WIDE_INT offset, rtx index) +{ + output_probe_op (); + if (offset) + fprintf (asm_out_file, "-"HOST_WIDE_INT_PRINT_DEC, offset); + fputc ('(', asm_out_file); + print_reg (stack_pointer_rtx, 0, asm_out_file); + if (index) + { + fputc (',', asm_out_file); + print_reg (index, 0, asm_out_file); + fputs (",1", asm_out_file); + } + fputs (")\n", asm_out_file); +} + +/* Probe a range of stack addresses from FIRST to FIRST+SIZE, inclusive. + These are offsets from the current stack pointer. */ + +const char * +output_probe_stack_range (rtx first_rtx, rtx size_rtx, rtx reg) +{ + static int labelno = 0; + HOST_WIDE_INT first = INTVAL (first_rtx); + HOST_WIDE_INT size = INTVAL (size_rtx); + HOST_WIDE_INT rounded_size; + char loop_lab[32], end_lab[32]; + + /* See if we have a constant small number of probes to generate. If so, + that's the easy case. */ + if (SMALL_INTERVAL (size)) + { + HOST_WIDE_INT i; + + /* Probe at FIRST + N * PROBE_INTERVAL for values of N from 1 until + it exceeds SIZE. If only one probe is needed, this will not + generate any code. Then probe at FIRST + SIZE. */ + for (i = PROBE_INTERVAL; i < size; i += PROBE_INTERVAL) + output_probe_stack_range_op (first + i, NULL_RTX); + + output_probe_stack_range_op (first + size, NULL_RTX); + } + + /* Otherwise, do the same as above, but in a loop. Note that we must be + extra careful with variables wrapping around because we might be at + the very top (or the very bottom) of the address space and we have + to be able to handle this case properly; in particular, we use an + equality test for the loop condition. */ + else + { + /* Step 1: round SIZE to the previous multiple of the interval. */ + rounded_size = size & -PROBE_INTERVAL; + + + /* Step 2: compute initial and final value of the loop counter. */ + + /* TEST_OFFSET = FIRST. */ + fprintf (asm_out_file, "\tmov\t$-"HOST_WIDE_INT_PRINT_DEC", ", first); + print_reg (reg, 0, asm_out_file); + fputc ('\n', asm_out_file); + + /* LAST_OFFSET = FIRST + ROUNDED_SIZE. */ + + + /* Step 3: the loop + + while (TEST_ADDR != LAST_ADDR) + { + TEST_ADDR = TEST_ADDR + PROBE_INTERVAL + probe at TEST_ADDR + } + + probes at FIRST + N * PROBE_INTERVAL for values of N from 1 + until it is equal to ROUNDED_SIZE. */ + + ASM_GENERATE_INTERNAL_LABEL (loop_lab, "LPSRL", labelno); + ASM_OUTPUT_LABEL (asm_out_file, loop_lab); + + /* Jump to END_LAB if TEST_ADDR == LAST_ADDR. */ + fprintf (asm_out_file, "\tcmp\t$-"HOST_WIDE_INT_PRINT_DEC", ", + first + rounded_size); + print_reg (reg, 0, asm_out_file); + fputc ('\n', asm_out_file); + ASM_GENERATE_INTERNAL_LABEL (end_lab, "LPSRE", labelno++); + fputs ("\tje\t", asm_out_file); assemble_name (asm_out_file, end_lab); + fputc ('\n', asm_out_file); + + /* TEST_ADDR = TEST_ADDR + PROBE_INTERVAL. 
*/ + fprintf (asm_out_file, "\tsub\t$%d, ", PROBE_INTERVAL); + print_reg (reg, 0, asm_out_file); + fputc ('\n', asm_out_file); + + /* Probe at TEST_ADDR. */ + output_probe_stack_range_op (0, reg); + + fprintf (asm_out_file, "\tjmp\t"); assemble_name (asm_out_file, loop_lab); + fputc ('\n', asm_out_file); + + ASM_OUTPUT_LABEL (asm_out_file, end_lab); + + + /* Step 4: probe at FIRST + SIZE if we cannot assert at compile-time + that SIZE is equal to ROUNDED_SIZE. */ + if (size != rounded_size) + output_probe_stack_range_op (size - rounded_size, reg); + } + + return ""; +} + +/* Wrapper around gen_probe_stack_range. */ + +static rtx +ix86_gen_probe_stack_range (rtx op0, rtx op1, rtx op2) +{ + if (TARGET_64BIT) + return gen_probe_stack_rangedi (op0, op1, op2); + else + return gen_probe_stack_rangesi (op0, op1, op2); +} + +/* Emit code to probe a range of stack addresses from FIRST to FIRST+SIZE, + inclusive. These are offsets from the current stack pointer. */ + +static void +ix86_emit_probe_stack_range (HOST_WIDE_INT first, HOST_WIDE_INT size) +{ + if (SMALL_INTERVAL (size)) + emit_insn (ix86_gen_probe_stack_range (GEN_INT (first), GEN_INT (size), + const0_rtx)); + else + { + struct scratch_reg sr; + get_scratch_register_on_entry (&sr); + emit_insn (ix86_gen_probe_stack_range (GEN_INT (first), GEN_INT (size), + sr.reg)); + release_scratch_register_on_entry (&sr); + } + + /* Make sure nothing is scheduled before we are done. */ + emit_insn (gen_blockage ()); +} + /* Finalize stack_realign_needed flag, which will guide prologue/epilogue to be generated in correct form. */ static void @@ -8469,6 +8876,31 @@ ix86_expand_prologue (void) else allocate += frame.nregs * UNITS_PER_WORD; + /* The stack has already been decremented by the instruction calling us + so we need to probe unconditionally to preserve the protection area. */ + if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK) + { + /* We expect the registers to be saved when probes are used. */ + gcc_assert (!frame.save_regs_using_mov); + + if (STACK_CHECK_MOVING_SP) + { + ix86_adjust_stack_and_probe (allocate); + allocate = 0; + } + else + { + const HOST_WIDE_INT max_size = 0x7fffffff - STACK_CHECK_PROTECT; + HOST_WIDE_INT size = allocate; + + /* Don't bother probing more than 2 GB, this is easier. */ + if (size > max_size) + size = max_size; + + ix86_emit_probe_stack_range (STACK_CHECK_PROTECT, size); + } + } + /* When using red zone we may start register saving before allocating the stack frame saving one cycle of the prologue. However I will avoid doing this if I am going to have to probe the stack since --Boundary-00=_9cc6KewQO3MtzJS--
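To make the motivation concrete, here is a hypothetical example of the kind
of frame the probing is aimed at; this is an illustration only, not code from
the patch, and the function name is invented.  Compiled with -fstack-check,
the prologue touches every page of the frame before the body runs:

/* Hypothetical example, not from the patch: a frame much larger than the
   4 KB guard page.  Without probes, a store into BUF far below the caller's
   stack pointer could land beyond the guard page and corrupt unrelated
   memory; with -fstack-check, an overflow reliably faults on the guard
   page instead.  */

#include <string.h>

int
big_frame (int n)
{
  char buf[256 * 1024];

  memset (buf, 0, sizeof buf);
  return buf[(unsigned) n % sizeof buf];
}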