public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [Patch] New -fstack-check implementation (2/n)
@ 2009-08-04 11:38 Eric Botcazou
  2009-09-02  0:46 ` Ian Lance Taylor
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Botcazou @ 2009-08-04 11:38 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 6817 bytes --]

Hi,

this is the second patch in a series that stems from the split of:
  http://gcc.gnu.org/ml/gcc-patches/2008-03/msg01853.html

The first patch was installed about one year ago(!):
  http://gcc.gnu.org/ml/gcc-patches/2008-05/msg02041.html

This implements functional stack checking for most x86 and x86-64 targets, 
including Linux; the notable exception is Windows (support to be submitted later).

It is functional in the sense that compiling code with -fstack-check should 
produce working executables that run as usual except that they check the 
stack before using it, whatever the language.  This is very similar to what 
all programs do on Tru64 or Windows for example.

The main difference is that the checking is designed to make it possible to 
recover from a stack overflow.  This recovery requires support in the runtime 
and is only implemented for Ada; for the other languages, a signal will be 
raised instead.
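To illustrate the underlying idea (this is a hand-written analogue in C, not the RTL that GCC actually generates; PAGE_SIZE and use_big_frame are hypothetical names for illustration): probing touches the stack once per guard-page-sized interval, in order, before the frame is used, so an overflow hits the guard page at a well-defined point instead of silently skipping past it.

```c
#include <stddef.h>

/* Matches the default probe interval of 2^12 bytes (illustrative).  */
#define PAGE_SIZE 4096

/* Touch one byte per page of an 8-page frame before using it; each touch
   either succeeds or faults on the guard page.  Returns the number of
   pages probed, for illustration.  The volatile pointer keeps the
   compiler from optimizing the probes away.  */
unsigned
use_big_frame (void)
{
  char frame[8 * PAGE_SIZE];
  volatile char *p = frame;
  unsigned pages = 0;
  size_t off;

  for (off = 0; off < sizeof frame; off += PAGE_SIZE, pages++)
    p[off] = 0;

  return pages;
}
```

With -fstack-check the compiler emits the moral equivalent of this loop (or an unrolled sequence of probes) in the prologue, so the programmer never writes it by hand.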

The main items of the patch are:
 1. rewrite of the generic stack checking code in explow.c; the existing code 
is subject to wrap-arounds and can fail to detect a stack overflow at the 
very beginning or very end of the address space.
 2. introduction of a variant of the probing method: probing with moving stack 
pointer; this is required for Linux.
 3. introduction of a third sub-method (STACK_CHECK_PROBE_IOR) used for x86 
and x86-64.
 4. introduction of a new method: checking against a symbol; for VxWorks.
 5. implementation of the probing, probing-with-moving-stack-pointer and 
checking-against-a-symbol methods in the i386 back-end.
 6. a couple of tweaks to the x86 DWARF-2 unwinder: a fixup for x86 kernels 
that don't have the S flag for signal frames and a tweak to help the unwinder 
to disambiguate contexts in the presence of signal frames.
 7. a fix for libada: proper detection of _Unwind_GetIPInfo like for the other 
libraries.
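The wrap-around issue of item 1 can be sketched in plain C (count_probes is a hypothetical helper, not code from the patch): near the very top or bottom of the address space, a "less-than" loop condition on addresses can wrap and terminate early or never, whereas stepping by a fixed interval toward an address rounded to a multiple of that interval and testing for equality is immune to wrap-around, since unsigned arithmetic wraps modulo the address space on both sides of the comparison.

```c
#include <stdint.h>

/* Count probes between TEST_ADDR and TEST_ADDR + ROUNDED_SIZE, stepping
   by INTERVAL.  ROUNDED_SIZE is assumed to be a multiple of INTERVAL,
   as after the rounding step in the rewritten probe_stack_range.  */
static unsigned
count_probes (uintptr_t test_addr, uintptr_t rounded_size, uintptr_t interval)
{
  /* The addition may wrap around the end of the address space; that is
     harmless because the loop below never orders the two addresses.  */
  uintptr_t last_addr = test_addr + rounded_size;
  unsigned n = 0;

  while (test_addr != last_addr)   /* equality test: immune to wrap-around */
    {
      test_addr += interval;       /* also wraps modulo the address space */
      n++;
    }

  return n;
}
```

A `test_addr < last_addr` condition would fail here whenever `last_addr` wrapped below `test_addr`, which is exactly the situation at the very end of the address space that the rewritten explow.c code guards against.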

The whole ACATS testsuite compiled with -fstack-check passes with this patch 
on i586-suse-linux and x86_64-suse-linux.

Tested on i586-suse-linux and x86_64-suse-linux, OK for mainline?


2009-08-04  Eric Botcazou  <ebotcazou@adacore.com>

	PR target/10127
	PR ada/20548
	* expr.h (STACK_CHECK_PROBE_INTERVAL): Delete.
	(STACK_CHECK_PROBE_INTERVAL_EXP): New macro.
	(STACK_CHECK_PROBE_IOR): Likewise.
	(STACK_CHECK_MOVING_SP): Likewise.
	* system.h (STACK_CHECK_PROBE_INTERVAL): Poison it.
	* doc/tm.texi (Stack Checking): Delete STACK_CHECK_PROBE_INTERVAL.
	Document STACK_CHECK_PROBE_INTERVAL_EXP, STACK_CHECK_PROBE_IOR and
	STACK_CHECK_MOVING_SP.
	* calls.c (emit_library_call_value_1): Clear the ECF_NOTHROW flag if
	the libcall is LCT_MAY_THROW.
	* explow.c (anti_adjust_stack_and_probe): New function.
	(allocate_dynamic_stack_space): Do not directly allocate space if
	STACK_CHECK_MOVING_SP, instead invoke above function.
	(set_stack_check_libfunc): Delete.
	(stack_check_libfunc): Make public.
	(stack_check_symbol): New public variable.
	(emit_stack_probe): Deal with STACK_CHECK_PROBE_IOR.
	(PROBE_INTERVAL): New macro.
	(STACK_GROW_OPTAB): Likewise.
	(STACK_HIGH, STACK_LOW): Likewise.
	(probe_stack_range): Cope with SPARC_STACK_BIAS.  Pass LCT_MAY_THROW
	to emit_library_call for the checking routine.  Remove support code
	for dedicated pattern.  Add support for stack limits provided by the
	stack_check_symbol variable.  Fix loop condition in the small constant
	case.  Rewrite in the general case to be immune to wrap-around.
	Make sure the address of probes is valid.  Try to use [base + disp]
	addressing mode as much as possible.
	Do not include gt-explow.h.
	* ira.c (setup_eliminable_regset): Set frame_pointer_needed if stack
	checking is enabled and STACK_CHECK_MOVING_SP.
	* rtlanal.c (may_trap_p_1) <MEM>: If stack checking is enabled,
	return 1 for volatile references to the stack pointer.
	* rtl.h (set_stack_check_libfunc): Delete.
	(stack_check_libfunc): Declare.
	(stack_check_symbol): Likewise.
	(libcall_type enum): Add LCT_MAY_THROW.
	* tree.c (build_common_builtin_nodes): Do not set ECF_NOTHROW on
	__builtin_alloca if stack checking is enabled.
	* Makefile.in (explow.o): Remove gt-explow.h.
	* unwind-dw2.c (uw_identify_context): Take into account whether the
	context is that of a signal frame or not.
	* config/i386/linux-unwind.h (x86_frob_update_context): New function.
	(MD_FROB_UPDATE_CONTEXT): Define.
	* config/i386/freebsd.h (STACK_CHECK_STATIC_BUILTIN): Likewise.
	* config/i386/linux.h (STACK_CHECK_STATIC_BUILTIN): Likewise.
	(STACK_CHECK_MOVING_SP): Likewise.
	* config/i386/linux64.h (STACK_CHECK_STATIC_BUILTIN): Likewise.
	(STACK_CHECK_MOVING_SP): Likewise.
	* config/i386/lynx.h (STACK_CHECK_STATIC_BUILTIN): Likewise.
	* config/i386/sol2.h (STACK_CHECK_STATIC_BUILTIN): Likewise.
	* config/i386/vxworks.h (STACK_CHECK_STATIC_BUILTIN): Likewise.
	(STACK_CHECK_PROTECT): 8K is reserved in the stack to propagate
	exceptions reliably in case of stack overflow.
	* config/i386/vxworksae.h (STACK_CHECK_PROTECT): Redefine to 4K
	which is enough for executing a possible last chance handler.
	* config/i386/i386.h (STACK_CHECK_PROBE_IOR): Define to 1.
	* config/i386/i386.c (ix86_compute_frame_layout): Force use of push
	instructions to save registers if stack checking with probes is on.
	(get_scratch_register_on_entry): New function.
	(release_scratch_register_on_entry): Likewise.
	(output_probe_op): Likewise.
	(output_adjust_stack_and_probe_op): Likewise.
	(output_adjust_stack_and_probe): Likewise.
	(ix86_gen_adjust_stack_and_probe): Likewise.
	(ix86_adjust_stack_and_probe): Likewise.
	(output_cond_trap): Likewise.
	(output_probe_stack_range_op): Likewise.
	(ix86_gen_probe_stack_range): Likewise.
	(ix86_emit_probe_stack_range): Likewise.
	(ix86_expand_prologue): Emit stack checking code if static builtin
	stack checking is enabled.
	Test ix86_target_stack_probe instead of TARGET_STACK_PROBE.
	* config/i386/i386-protos.h (ix86_target_stack_probe): Declare.
	(output_adjust_stack_and_probe): Likewise.
	(output_cond_trap): Likewise.
	(output_probe_stack_range): Likewise.
	* config/i386/i386.md (UNSPECV_STACK_PROBE_INLINE): New constant.
	(adjust_stack_and_probe): New insn.
	(probe_stack_range): Likewise.
	(logical operation peepholes): Do not split stack checking probes.
	(cond_trap): New insn.
ada/
	* gcc-interface/Makefile.in: Pass GNATLIBCFLAGS_FOR_C to recursive
	invocations.
	* gcc-interface/trans.c (gigi): Set stack_check_symbol to
	__gnat_stack_limit on selected platforms.


2009-08-04  Eric Botcazou  <ebotcazou@adacore.com>

	* Makefile.in (GNATLIBCFLAGS_FOR_C): New variable.
	(LIBADA_FLAGS_TO_PASS): Add GNATLIBCFLAGS_FOR_C.
	* configure.ac: Include config/unwind_ipinfo.m4.
	Check for _Unwind_GetIPInfo.
	* configure: Regenerate.


2009-08-04  Eric Botcazou  <ebotcazou@adacore.com>

	* gnat.dg/stack_check.adb: New test.


-- 
Eric Botcazou

[-- Attachment #2: gcc-45_stack-check-2.diff --]
[-- Type: text/x-diff, Size: 58330 bytes --]

Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi	(revision 150351)
+++ gcc/doc/tm.texi	(working copy)
@@ -3532,11 +3532,17 @@ like to do static stack checking in some
 approach.  The default value of this macro is zero.
 @end defmac
 
-@defmac STACK_CHECK_PROBE_INTERVAL
-An integer representing the interval at which GCC must generate stack
-probe instructions.  You will normally define this macro to be no larger
-than the size of the ``guard pages'' at the end of a stack area.  The
-default value of 4096 is suitable for most systems.
+@defmac STACK_CHECK_PROBE_INTERVAL_EXP
+An integer specifying the interval at which GCC must generate stack probe
+instructions, defined as 2 raised to this integer.  You will normally
+define this macro so that the interval is no larger than the size of
+the ``guard pages'' at the end of a stack area.  The default value
+of 12 (4096-byte interval) is suitable for most systems.
+@end defmac
+
+@defmac STACK_CHECK_PROBE_IOR
+An integer which is nonzero if GCC should perform the stack probe
+as an inclusive OR instruction.  The default is zero.
 @end defmac
 
 @defmac STACK_CHECK_PROBE_LOAD
@@ -3545,6 +3551,15 @@ as a load instruction and zero if GCC sh
 The default is zero, which is the most efficient choice on most systems.
 @end defmac
 
+@defmac STACK_CHECK_MOVING_SP
+An integer which is nonzero if GCC should move the stack pointer during
+stack checking.  This can be necessary on systems where the stack pointer
+contains the bottom address of the memory area accessible to the executing
+thread at any point in time.  In this situation an alternate signal stack
+is required in order to be able to recover from a stack overflow.
+The default value of this macro is zero.
+@end defmac
+
 @defmac STACK_CHECK_PROTECT
 The number of bytes of stack needed to recover from a stack overflow,
 for languages where such a recovery is supported.  The default value of
Index: gcc/tree.c
===================================================================
--- gcc/tree.c	(revision 150351)
+++ gcc/tree.c	(working copy)
@@ -8122,7 +8122,8 @@ build_common_builtin_nodes (void)
       tmp = tree_cons (NULL_TREE, size_type_node, void_list_node);
       ftype = build_function_type (ptr_type_node, tmp);
       local_define_builtin ("__builtin_alloca", ftype, BUILT_IN_ALLOCA,
-			    "alloca", ECF_NOTHROW | ECF_MALLOC);
+			    "alloca",
+			    ECF_MALLOC | (flag_stack_check ? 0 : ECF_NOTHROW));
     }
 
   tmp = tree_cons (NULL_TREE, ptr_type_node, void_list_node);
Index: gcc/rtlanal.c
===================================================================
--- gcc/rtlanal.c	(revision 150351)
+++ gcc/rtlanal.c	(working copy)
@@ -2251,6 +2251,11 @@ may_trap_p_1 (const_rtx x, unsigned flag
 
       /* Memory ref can trap unless it's a static var or a stack slot.  */
     case MEM:
+      /* Recognize specific pattern of stack checking probes.  */
+      if (flag_stack_check
+	  && MEM_VOLATILE_P (x)
+	  && XEXP (x, 0) == stack_pointer_rtx)
+	return 1;
       if (/* MEM_NOTRAP_P only relates to the actual position of the memory
 	     reference; moving it out of context such as when moving code
 	     when optimizing, might cause its address to become invalid.  */
Index: gcc/testsuite/gnat.dg/stack_check.adb
===================================================================
--- gcc/testsuite/gnat.dg/stack_check.adb	(revision 0)
+++ gcc/testsuite/gnat.dg/stack_check.adb	(revision 0)
@@ -0,0 +1,29 @@
+-- { dg-do run { target i?86-*-linux* x86_64-*-linux* i?86-*-solaris2.* } }
+-- { dg-options "-fstack-check" }
+
+procedure Stack_Check is
+
+  type A is Array (1..2048) of Integer;  -- 8 KB
+
+  procedure Consume_Stack (N : Integer) is
+    My_A : A;
+  begin
+    My_A (1) := 0;
+    if N <= 0 then
+      return;
+    end if;
+    Consume_Stack (N-1);
+  end;
+
+begin
+
+  begin
+    Consume_Stack (Integer'Last);
+    raise Program_Error;
+  exception
+    when Storage_Error => null;
+  end;
+
+  Consume_Stack (128);
+
+end;
Index: gcc/expr.h
===================================================================
--- gcc/expr.h	(revision 150351)
+++ gcc/expr.h	(working copy)
@@ -218,9 +218,14 @@ do {								\
 #define STACK_CHECK_STATIC_BUILTIN 0
 #endif
 
-/* The default interval is one page.  */
-#ifndef STACK_CHECK_PROBE_INTERVAL
-#define STACK_CHECK_PROBE_INTERVAL 4096
+/* The default interval is one page (4096 bytes).  */
+#ifndef STACK_CHECK_PROBE_INTERVAL_EXP
+#define STACK_CHECK_PROBE_INTERVAL_EXP 12
+#endif
+
+/* The default is not to use an inclusive OR.  */
+#ifndef STACK_CHECK_PROBE_IOR
+#define STACK_CHECK_PROBE_IOR 0
 #endif
 
 /* The default is to do a store into the stack.  */
@@ -228,6 +233,11 @@ do {								\
 #define STACK_CHECK_PROBE_LOAD 0
 #endif
 
+/* The default is not to move the stack pointer.  */
+#ifndef STACK_CHECK_MOVING_SP
+#define STACK_CHECK_MOVING_SP 0
+#endif
+
 /* This is a kludge to try to capture the discrepancy between the old
    mechanism (generic stack checking) and the new mechanism (static
    builtin stack checking).  STACK_CHECK_PROTECT needs to be bumped
@@ -252,7 +262,7 @@ do {								\
    one probe per function.  */
 #ifndef STACK_CHECK_MAX_FRAME_SIZE
 #define STACK_CHECK_MAX_FRAME_SIZE \
-  (STACK_CHECK_PROBE_INTERVAL - UNITS_PER_WORD)
+  ((1 << STACK_CHECK_PROBE_INTERVAL_EXP) - UNITS_PER_WORD)
 #endif
 
 /* This is arbitrary, but should be large enough everywhere.  */
Index: gcc/unwind-dw2.c
===================================================================
--- gcc/unwind-dw2.c	(revision 150351)
+++ gcc/unwind-dw2.c	(working copy)
@@ -1559,7 +1559,13 @@ uw_install_context_1 (struct _Unwind_Con
 static inline _Unwind_Ptr
 uw_identify_context (struct _Unwind_Context *context)
 {
-  return _Unwind_GetCFA (context);
+  /* The CFA is not sufficient to disambiguate the context of a function
+     interrupted by a signal before establishing its frame and the context
+     of the signal itself.  */
+  if (STACK_GROWS_DOWNWARD)
+    return _Unwind_GetCFA (context) - _Unwind_IsSignalFrame (context);
+  else
+    return _Unwind_GetCFA (context) + _Unwind_IsSignalFrame (context);
 }
 
 
Index: gcc/ada/gcc-interface/Makefile.in
===================================================================
--- gcc/ada/gcc-interface/Makefile.in	(revision 150351)
+++ gcc/ada/gcc-interface/Makefile.in	(working copy)
@@ -2393,6 +2393,7 @@ gnatlib-shared-default:
 	$(MAKE) $(FLAGS_TO_PASS) \
              GNATLIBFLAGS="$(GNATLIBFLAGS)" \
 	     GNATLIBCFLAGS="$(GNATLIBCFLAGS) $(TARGET_LIBGCC2_CFLAGS)" \
+	     GNATLIBCFLAGS_FOR_C="$(GNATLIBCFLAGS_FOR_C)" \
 	     MULTISUBDIR="$(MULTISUBDIR)" \
 	     THREAD_KIND="$(THREAD_KIND)" \
              gnatlib
@@ -2418,6 +2419,7 @@ gnatlib-shared-dual:
 	$(MAKE) $(FLAGS_TO_PASS) \
              GNATLIBFLAGS="$(GNATLIBFLAGS)" \
 	     GNATLIBCFLAGS="$(GNATLIBCFLAGS)" \
+	     GNATLIBCFLAGS_FOR_C="$(GNATLIBCFLAGS_FOR_C)" \
 	     MULTISUBDIR="$(MULTISUBDIR)" \
 	     THREAD_KIND="$(THREAD_KIND)" \
              gnatlib-shared-default
@@ -2426,6 +2428,7 @@ gnatlib-shared-dual:
 	$(MAKE) $(FLAGS_TO_PASS) \
              GNATLIBFLAGS="$(GNATLIBFLAGS)" \
 	     GNATLIBCFLAGS="$(GNATLIBCFLAGS)" \
+	     GNATLIBCFLAGS_FOR_C="$(GNATLIBCFLAGS_FOR_C)" \
 	     MULTISUBDIR="$(MULTISUBDIR)" \
 	     THREAD_KIND="$(THREAD_KIND)" \
              gnatlib
@@ -2435,6 +2438,7 @@ gnatlib-shared-dual-win32:
 	$(MAKE) $(FLAGS_TO_PASS) \
              GNATLIBFLAGS="$(GNATLIBFLAGS)" \
 	     GNATLIBCFLAGS="$(GNATLIBCFLAGS) $(TARGET_LIBGCC2_CFLAGS)" \
+	     GNATLIBCFLAGS_FOR_C="$(GNATLIBCFLAGS_FOR_C)" \
 	     MULTISUBDIR="$(MULTISUBDIR)" \
 	     THREAD_KIND="$(THREAD_KIND)" \
              gnatlib-shared-win32
@@ -2443,6 +2447,7 @@ gnatlib-shared-dual-win32:
 	$(MAKE) $(FLAGS_TO_PASS) \
              GNATLIBFLAGS="$(GNATLIBFLAGS)" \
 	     GNATLIBCFLAGS="$(GNATLIBCFLAGS)" \
+	     GNATLIBCFLAGS_FOR_C="$(GNATLIBCFLAGS_FOR_C)" \
 	     MULTISUBDIR="$(MULTISUBDIR)" \
 	     THREAD_KIND="$(THREAD_KIND)" \
              gnatlib
@@ -2456,6 +2461,7 @@ gnatlib-shared-win32:
 	$(MAKE) $(FLAGS_TO_PASS) \
              GNATLIBFLAGS="$(GNATLIBFLAGS)" \
 	     GNATLIBCFLAGS="$(GNATLIBCFLAGS) $(TARGET_LIBGCC2_CFLAGS)" \
+	     GNATLIBCFLAGS_FOR_C="$(GNATLIBCFLAGS_FOR_C)" \
 	     MULTISUBDIR="$(MULTISUBDIR)" \
 	     THREAD_KIND="$(THREAD_KIND)" \
              gnatlib
@@ -2474,7 +2480,7 @@ gnatlib-shared-darwin:
 	$(MAKE) $(FLAGS_TO_PASS) \
 	     GNATLIBFLAGS="$(GNATLIBFLAGS)" \
 	     GNATLIBCFLAGS="$(GNATLIBCFLAGS) $(TARGET_LIBGCC2_CFLAGS) \
-	                    -fno-common" \
+	     GNATLIBCFLAGS_FOR_C="$(GNATLIBCFLAGS_FOR_C) -fno-common" \
 	     MULTISUBDIR="$(MULTISUBDIR)" \
 	     THREAD_KIND="$(THREAD_KIND)" \
 	     gnatlib
@@ -2502,6 +2508,7 @@ gnatlib-shared-vms:
 	$(MAKE) $(FLAGS_TO_PASS) \
              GNATLIBFLAGS="$(GNATLIBFLAGS)" \
 	     GNATLIBCFLAGS="$(GNATLIBCFLAGS)" \
+	     GNATLIBCFLAGS_FOR_C="$(GNATLIBCFLAGS_FOR_C)" \
 	     MULTISUBDIR="$(MULTISUBDIR)" \
 	     THREAD_KIND="$(THREAD_KIND)" \
              gnatlib
@@ -2530,6 +2537,7 @@ gnatlib-shared:
 	$(MAKE) $(FLAGS_TO_PASS) \
              GNATLIBFLAGS="$(GNATLIBFLAGS)" \
 	     GNATLIBCFLAGS="$(GNATLIBCFLAGS)" \
+	     GNATLIBCFLAGS_FOR_C="$(GNATLIBCFLAGS_FOR_C)" \
 	     MULTISUBDIR="$(MULTISUBDIR)" \
 	     THREAD_KIND="$(THREAD_KIND)" \
 	     TARGET_LIBGCC2_CFLAGS="$(TARGET_LIBGCC2_CFLAGS)" \
@@ -2543,6 +2551,7 @@ gnatlib-sjlj:
 	     EH_MECHANISM="" \
 	     GNATLIBFLAGS="$(GNATLIBFLAGS)" \
 	     GNATLIBCFLAGS="$(GNATLIBCFLAGS)" \
+	     GNATLIBCFLAGS_FOR_C="$(GNATLIBCFLAGS_FOR_C)" \
 	     MULTISUBDIR="$(MULTISUBDIR)" \
 	     THREAD_KIND="$(THREAD_KIND)" \
 	     TARGET_LIBGCC2_CFLAGS="$(TARGET_LIBGCC2_CFLAGS)" gnatlib
@@ -2555,6 +2564,7 @@ gnatlib-zcx:
 	     EH_MECHANISM="-gcc" \
 	     GNATLIBFLAGS="$(GNATLIBFLAGS)" \
 	     GNATLIBCFLAGS="$(GNATLIBCFLAGS)" \
+	     GNATLIBCFLAGS_FOR_C="$(GNATLIBCFLAGS_FOR_C)" \
 	     MULTISUBDIR="$(MULTISUBDIR)" \
 	     THREAD_KIND="$(THREAD_KIND)" \
 	     TARGET_LIBGCC2_CFLAGS="$(TARGET_LIBGCC2_CFLAGS)" gnatlib
Index: gcc/ada/gcc-interface/trans.c
===================================================================
--- gcc/ada/gcc-interface/trans.c	(revision 150352)
+++ gcc/ada/gcc-interface/trans.c	(working copy)
@@ -313,9 +313,13 @@ gigi (Node_Id gnat_root, int max_gnat_no
   dwarf2out_set_descriptive_type_func (get_parallel_type);
 #endif
 
-  /* Enable GNAT stack checking method if needed */
-  if (!Stack_Check_Probes_On_Target)
-    set_stack_check_libfunc (gen_rtx_SYMBOL_REF (Pmode, "_gnat_stack_check"));
+  /* Enable appropriate stack checking method.  */
+  if (Stack_Check_Probes_On_Target)
+    ;
+  else if (Stack_Check_Limits_On_Target)
+    stack_check_symbol = gen_rtx_SYMBOL_REF (Pmode, "__gnat_stack_limit");
+  else
+    stack_check_libfunc = gen_rtx_SYMBOL_REF (Pmode, "_gnat_stack_check");
 
   /* Retrieve alignment settings.  */
   double_float_alignment = get_target_double_float_alignment ();
Index: gcc/calls.c
===================================================================
--- gcc/calls.c	(revision 150351)
+++ gcc/calls.c	(working copy)
@@ -3340,6 +3340,9 @@ emit_library_call_value_1 (int retval, r
     case LCT_THROW:
       flags = ECF_NORETURN;
       break;
+    case LCT_MAY_THROW:
+      flags &= ~ECF_NOTHROW;
+      break;
     case LCT_RETURNS_TWICE:
       flags = ECF_RETURNS_TWICE;
       break;
Index: gcc/explow.c
===================================================================
--- gcc/explow.c	(revision 150351)
+++ gcc/explow.c	(working copy)
@@ -43,6 +43,7 @@ along with GCC; see the file COPYING3.
 
 static rtx break_out_memory_refs (rtx);
 static void emit_stack_probe (rtx);
+static void anti_adjust_stack_and_probe (rtx);
 
 
 /* Truncate and perhaps sign-extend C as appropriate for MODE.  */
@@ -1228,7 +1229,9 @@ allocate_dynamic_stack_space (rtx size,
 
   /* If needed, check that we have the required amount of stack.
      Take into account what has already been checked.  */
-  if (flag_stack_check == GENERIC_STACK_CHECK)
+  if (STACK_CHECK_MOVING_SP)
+    ;
+  else if (flag_stack_check == GENERIC_STACK_CHECK)
     probe_stack_range (STACK_OLD_CHECK_PROTECT + STACK_CHECK_MAX_FRAME_SIZE,
 		       size);
   else if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK)
@@ -1297,7 +1300,10 @@ allocate_dynamic_stack_space (rtx size,
 	  emit_label (space_available);
 	}
 
-      anti_adjust_stack (size);
+      if (flag_stack_check && STACK_CHECK_MOVING_SP)
+	anti_adjust_stack_and_probe (size);
+      else
+	anti_adjust_stack (size);
 
 #ifdef STACK_GROWS_DOWNWARD
       emit_move_insn (target, virtual_stack_dynamic_rtx);
@@ -1327,18 +1333,18 @@ allocate_dynamic_stack_space (rtx size,
   return target;
 }
 \f
-/* A front end may want to override GCC's stack checking by providing a
-   run-time routine to call to check the stack, so provide a mechanism for
-   calling that routine.  */
-
-static GTY(()) rtx stack_check_libfunc;
+/* A front end may want to override GCC's stack checking by providing
+   either a symbol (data) or a function (code).  In either case, the
+   runtime also needs to provide the associated support.  */
+
+/* Variable whose value is checked against the future value of the stack
+   pointer.  Upon stack overflow, the generated code will raise a trap.  */
+rtx stack_check_symbol;
+
+/* Function that is passed the future value of the stack pointer.  Upon
+   stack overflow, it is responsible for raising the appropriate event.  */
+rtx stack_check_libfunc;
 
-void
-set_stack_check_libfunc (rtx libfunc)
-{
-  stack_check_libfunc = libfunc;
-}
-\f
 /* Emit one stack probe at ADDRESS, an address within the stack.  */
 
 static void
@@ -1348,7 +1354,16 @@ emit_stack_probe (rtx address)
 
   MEM_VOLATILE_P (memref) = 1;
 
-  if (STACK_CHECK_PROBE_LOAD)
+  if (STACK_CHECK_PROBE_IOR)
+    {
+      if (word_mode == SImode)
+	emit_insn (gen_iorsi3 (memref, memref, const0_rtx));
+      else if (word_mode == DImode)
+	emit_insn (gen_iordi3 (memref, memref, const0_rtx));
+      else
+	gcc_unreachable ();
+    }
+  else if (STACK_CHECK_PROBE_LOAD)
     emit_move_insn (gen_reg_rtx (word_mode), memref);
   else
     emit_move_insn (memref, const0_rtx);
@@ -1360,22 +1375,35 @@ emit_stack_probe (rtx address)
    subtract from the stack.  If SIZE is constant, this is done
    with a fixed number of probes.  Otherwise, we must make a loop.  */
 
+#define PROBE_INTERVAL (1 << STACK_CHECK_PROBE_INTERVAL_EXP)
+
 #ifdef STACK_GROWS_DOWNWARD
-#define STACK_GROW_OP MINUS
+#define STACK_GROW_OP     MINUS
+#define STACK_GROW_OPTAB  sub_optab
+#define STACK_HIGH(high,low)  low
+#define STACK_LOW(high,low)   high
 #else
-#define STACK_GROW_OP PLUS
+#define STACK_GROW_OP     PLUS
+#define STACK_GROW_OPTAB  add_optab
+#define STACK_HIGH(high,low)  high
+#define STACK_LOW(high,low)   low
 #endif
 
 void
 probe_stack_range (HOST_WIDE_INT first, rtx size)
 {
+#ifdef SPARC_STACK_BIAS
+  /* The probe offsets are counted negatively whereas the stack bias is
+     counted positively.  */
+  first -= SPARC_STACK_BIAS;
+#endif
+
   /* First ensure SIZE is Pmode.  */
   if (GET_MODE (size) != VOIDmode && GET_MODE (size) != Pmode)
     size = convert_to_mode (Pmode, size, 1);
 
-  /* Next see if the front end has set up a function for us to call to
-     check the stack.  */
-  if (stack_check_libfunc != 0)
+  /* Next see if the runtime has got a function for us to call.  */
+  if (stack_check_libfunc)
     {
       rtx addr = memory_address (QImode,
 				 gen_rtx_fmt_ee (STACK_GROW_OP, Pmode,
@@ -1383,103 +1411,280 @@ probe_stack_range (HOST_WIDE_INT first,
 					         plus_constant (size, first)));
 
       addr = convert_memory_address (ptr_mode, addr);
-      emit_library_call (stack_check_libfunc, LCT_NORMAL, VOIDmode, 1, addr,
+      emit_library_call (stack_check_libfunc, LCT_MAY_THROW, VOIDmode, 1, addr,
 			 ptr_mode);
     }
 
-  /* Next see if we have an insn to check the stack.  Use it if so.  */
-#ifdef HAVE_check_stack
-  else if (HAVE_check_stack)
+  /* Next see if the runtime has got a symbol for us to compare against.  */
+  else if (stack_check_symbol)
     {
-      insn_operand_predicate_fn pred;
-      rtx last_addr
-	= force_operand (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode,
-					 stack_pointer_rtx,
-					 plus_constant (size, first)),
-			 NULL_RTX);
+      rtx stack_check_limit = gen_rtx_MEM (Pmode, stack_check_symbol);
+      rtx avail
+	= expand_binop (Pmode, sub_optab,
+			STACK_HIGH (stack_check_limit, stack_pointer_rtx),
+			STACK_LOW (stack_check_limit, stack_pointer_rtx),
+			NULL_RTX, 1, OPTAB_WIDEN);
+      rtx req = plus_constant (size, first);
+
+#if defined(HAVE_conditional_trap)
+      emit_insn (gen_cond_trap (LTU, avail, req, const0_rtx));
+#elif defined(HAVE_trap)
+      rtx ok_lab = gen_label_rtx ();
+      emit_cmp_and_jump_insns (avail, req, GEU, NULL_RTX, Pmode, 1, ok_lab);
+      emit_insn (gen_trap ());
+      emit_barrier ();
+      emit_label (ok_lab);
+#else
+      error ("stack checking not supported on this target");
+#endif
+    }
 
-      pred = insn_data[(int) CODE_FOR_check_stack].operand[0].predicate;
-      if (pred && ! ((*pred) (last_addr, Pmode)))
-	last_addr = copy_to_mode_reg (Pmode, last_addr);
+  /* Otherwise we have to generate explicit probes.  If we have a constant
+     small number of them to generate, that's the easy case.  */
+  else if (CONST_INT_P (size) && INTVAL (size) < 7 * PROBE_INTERVAL)
+    {
+      HOST_WIDE_INT i, offset, size_int = INTVAL (size);
+      rtx addr;
+
+      /* Probe at FIRST + N * PROBE_INTERVAL for values of N from 1 until
+	 it exceeds SIZE.  If only one probe is needed, this will not
+	 generate any code.  Then probe at FIRST + SIZE.  */
+      for (i = PROBE_INTERVAL; i < size_int; i += PROBE_INTERVAL)
+	{
+	  offset = first + i;
+#ifdef STACK_GROWS_DOWNWARD
+	  offset = -offset;
+#endif
+	  addr = memory_address (Pmode,
+				 plus_constant (stack_pointer_rtx, offset));
+	  emit_stack_probe (addr);
+	}
 
-      emit_insn (gen_check_stack (last_addr));
-    }
+      offset = first + size_int;
+#ifdef STACK_GROWS_DOWNWARD
+      offset = -offset;
 #endif
+      addr = memory_address (Pmode, plus_constant (stack_pointer_rtx, offset));
+      emit_stack_probe (addr);
+    }
 
-  /* If we have to generate explicit probes, see if we have a constant
-     small number of them to generate.  If so, that's the easy case.  */
-  else if (CONST_INT_P (size)
-	   && INTVAL (size) < 10 * STACK_CHECK_PROBE_INTERVAL)
+  /* In the variable case, do the same as above, but in a loop.  Note that we
+     must be extra careful with variables wrapping around because we might be
+     at the very top (or the very bottom) of the address space and we have to
+     be able to handle this case properly; in particular, we use an equality
+     test for the loop condition.  */
+  else
     {
-      HOST_WIDE_INT offset;
+      rtx rounded_size, rounded_size_op, test_addr, last_addr, temp;
+      rtx loop_lab = gen_label_rtx ();
+      rtx end_lab = gen_label_rtx ();
+
+      /* Step 1: round SIZE to the previous multiple of the interval.  */
+
+      /* ROUNDED_SIZE = SIZE & -PROBE_INTERVAL  */
+      rounded_size = simplify_gen_binary (AND, Pmode,
+					  size,
+					  GEN_INT (-PROBE_INTERVAL));
+      rounded_size_op = force_operand (rounded_size, NULL_RTX);
+
+
+      /* Step 2: compute initial and final value of the loop counter.  */
+
+      /* TEST_ADDR = SP + FIRST.  */
+      test_addr = force_operand (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode,
+					 	 stack_pointer_rtx,
+					 	 GEN_INT (first)),
+				 NULL_RTX);
+
+      /* LAST_ADDR = SP + FIRST + ROUNDED_SIZE.  */
+      last_addr = force_operand (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode,
+						 test_addr,
+						 rounded_size_op),
+				 NULL_RTX);
+
+
+      /* Step 3: the loop
+
+	 while (TEST_ADDR != LAST_ADDR)
+	   {
+	     TEST_ADDR = TEST_ADDR + PROBE_INTERVAL
+	     probe at TEST_ADDR
+	   }
+
+	 probes at FIRST + N * PROBE_INTERVAL for values of N from 1
+	 until it is equal to ROUNDED_SIZE.  */
+
+      emit_label (loop_lab);
+
+      /* Jump to END_LAB if TEST_ADDR == LAST_ADDR.  */
+      emit_cmp_and_jump_insns (test_addr, last_addr, EQ,
+			       NULL_RTX, Pmode, 1, end_lab);
+
+      /* TEST_ADDR = TEST_ADDR + PROBE_INTERVAL.  */
+      temp = expand_binop (Pmode, STACK_GROW_OPTAB, test_addr,
+			   GEN_INT (PROBE_INTERVAL), test_addr,
+			   1, OPTAB_WIDEN);
+
+      gcc_assert (temp == test_addr);
+
+      /* Probe at TEST_ADDR.  */
+      emit_stack_probe (test_addr);
+
+      emit_jump (loop_lab);
+
+      emit_label (end_lab);
+
+      /* Step 4: probe at FIRST + SIZE if we cannot assert at compile-time
+	 that SIZE is equal to ROUNDED_SIZE.  */
+
+      /* TEMP = SIZE - ROUNDED_SIZE.  */
+      temp = simplify_gen_binary (MINUS, Pmode, size, rounded_size);
+      if (temp != const0_rtx)
+	{
+	  rtx addr;
 
-      /* Start probing at FIRST + N * STACK_CHECK_PROBE_INTERVAL
-	 for values of N from 1 until it exceeds LAST.  If only one
-	 probe is needed, this will not generate any code.  Then probe
-	 at LAST.  */
-      for (offset = first + STACK_CHECK_PROBE_INTERVAL;
-	   offset < INTVAL (size);
-	   offset = offset + STACK_CHECK_PROBE_INTERVAL)
-	emit_stack_probe (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode,
-					  stack_pointer_rtx,
-					  GEN_INT (offset)));
+	  if (GET_CODE (temp) == CONST_INT)
+	    {
+	      /* Use [base + disp} addressing mode if supported.  */
+	      HOST_WIDE_INT offset = INTVAL (temp);
+#ifdef STACK_GROWS_DOWNWARD
+	      offset = -offset;
+#endif
+	      addr = memory_address (Pmode, plus_constant (last_addr, offset));
+	    }
+	  else
+	    {
+	      /* Manual CSE if the difference is not known at compile-time.  */
+	      temp = gen_rtx_MINUS (Pmode, size, rounded_size_op);
+	      addr = memory_address (Pmode,
+				     gen_rtx_fmt_ee (STACK_GROW_OP, Pmode,
+						     last_addr, temp));
+	    }
+
+	  emit_stack_probe (addr);
+	}
+    }
+}
+
+/* Adjust the stack by SIZE bytes while probing it.  Note that we skip
+   the probe for the first interval and instead probe one interval past
+   the specified size in order to maintain a protection area.  */
+
+static void
+anti_adjust_stack_and_probe (rtx size)
+{
+  rtx probe_interval = GEN_INT (PROBE_INTERVAL);
+
+  /* First ensure SIZE is Pmode.  */
+  if (GET_MODE (size) != VOIDmode && GET_MODE (size) != Pmode)
+    size = convert_to_mode (Pmode, size, 1);
+
+  /* If we have a constant small number of probes to generate, that's the
+     easy case.  */
+  if (GET_CODE (size) == CONST_INT && INTVAL (size) < 7 * PROBE_INTERVAL)
+    {
+      HOST_WIDE_INT i, int_size = INTVAL (size);
+      bool first_probe = true;
+
+      /* Adjust SP and probe to PROBE_INTERVAL + N * PROBE_INTERVAL for
+	 values of N from 1 until it exceeds SIZE.  If only one probe is
+	 needed, this will not generate any code.  Then adjust and probe
+	 to PROBE_INTERVAL + SIZE.  */
+      for (i = PROBE_INTERVAL; i < int_size; i += PROBE_INTERVAL)
+	{
+	  if (first_probe)
+	    {
+	      anti_adjust_stack (GEN_INT (2 * PROBE_INTERVAL));
+	      first_probe = false;
+	    }
+	  else
+	    anti_adjust_stack (probe_interval);
+	  emit_stack_probe (stack_pointer_rtx);
+	}
 
-      emit_stack_probe (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode,
-					stack_pointer_rtx,
-					plus_constant (size, first)));
+      if (first_probe)
+	anti_adjust_stack (plus_constant (size, PROBE_INTERVAL));
+      else
+	anti_adjust_stack (plus_constant (size, PROBE_INTERVAL - i));
+      emit_stack_probe (stack_pointer_rtx);
     }
 
-  /* In the variable case, do the same as above, but in a loop.  We emit loop
-     notes so that loop optimization can be done.  */
+  /* In the variable case, do the same as above, but in a loop.  Note that we
+     must be extra careful with variables wrapping around because we might be
+     at the very top (or the very bottom) of the address space and we have to
+     be able to handle this case properly; in particular, we use an equality
+     test for the loop condition.  */
   else
     {
-      rtx test_addr
-	= force_operand (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode,
-					 stack_pointer_rtx,
-					 GEN_INT (first + STACK_CHECK_PROBE_INTERVAL)),
-			 NULL_RTX);
-      rtx last_addr
-	= force_operand (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode,
-					 stack_pointer_rtx,
-					 plus_constant (size, first)),
-			 NULL_RTX);
-      rtx incr = GEN_INT (STACK_CHECK_PROBE_INTERVAL);
+      rtx rounded_size, rounded_size_op, last_addr, temp;
       rtx loop_lab = gen_label_rtx ();
-      rtx test_lab = gen_label_rtx ();
       rtx end_lab = gen_label_rtx ();
-      rtx temp;
 
-      if (!REG_P (test_addr)
-	  || REGNO (test_addr) < FIRST_PSEUDO_REGISTER)
-	test_addr = force_reg (Pmode, test_addr);
+      /* Step 1: round SIZE to the previous multiple of the interval.  */
+
+      /* ROUNDED_SIZE = SIZE & -PROBE_INTERVAL  */
+      rounded_size = simplify_gen_binary (AND, Pmode,
+					  size,
+					  GEN_INT (-PROBE_INTERVAL));
+      rounded_size_op = force_operand (rounded_size, NULL_RTX);
+
+
+      /* Step 2: compute initial and final value of the loop counter.  */
+
+      /* SP = SP_0 + PROBE_INTERVAL.  */
+      anti_adjust_stack (probe_interval);
+
+      /* LAST_ADDR = SP_0 + PROBE_INTERVAL + ROUNDED_SIZE.  */
+      last_addr = force_operand (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode,
+						 stack_pointer_rtx,
+						 rounded_size_op),
+				 NULL_RTX);
+
 
-      emit_jump (test_lab);
+      /* Step 3: the loop
+
+	  while (SP != LAST_ADDR)
+	    {
+	      SP = SP + PROBE_INTERVAL
+	      probe at SP
+	    }
+
+	 adjusts SP and probes to PROBE_INTERVAL + N * PROBE_INTERVAL for
+	 values of N from 1 until it is equal to ROUNDED_SIZE.  */
 
       emit_label (loop_lab);
-      emit_stack_probe (test_addr);
 
-#ifdef STACK_GROWS_DOWNWARD
-#define CMP_OPCODE GTU
-      temp = expand_binop (Pmode, sub_optab, test_addr, incr, test_addr,
-			   1, OPTAB_WIDEN);
-#else
-#define CMP_OPCODE LTU
-      temp = expand_binop (Pmode, add_optab, test_addr, incr, test_addr,
-			   1, OPTAB_WIDEN);
-#endif
+      /* Jump to END_LAB if SP == LAST_ADDR.  */
+      emit_cmp_and_jump_insns (stack_pointer_rtx, last_addr, EQ,
+			       NULL_RTX, Pmode, 1, end_lab);
+
+      /* SP = SP + PROBE_INTERVAL and probe at SP.  */
+      anti_adjust_stack (probe_interval);
+      emit_stack_probe (stack_pointer_rtx);
 
-      gcc_assert (temp == test_addr);
+      emit_jump (loop_lab);
 
-      emit_label (test_lab);
-      emit_cmp_and_jump_insns (test_addr, last_addr, CMP_OPCODE,
-			       NULL_RTX, Pmode, 1, loop_lab);
-      emit_jump (end_lab);
       emit_label (end_lab);
 
-      emit_stack_probe (last_addr);
+      /* Step 4: adjust SP and probe to PROBE_INTERVAL + SIZE if we cannot
+	 assert at compile-time that SIZE is equal to ROUNDED_SIZE.  */
+
+      /* TEMP = SIZE - ROUNDED_SIZE.  */
+      temp = simplify_gen_binary (MINUS, Pmode, size, rounded_size);
+      if (temp != const0_rtx)
+	{
+	  /* Manual CSE if the difference is not known at compile-time.  */
+	  if (GET_CODE (temp) != CONST_INT)
+	    temp = gen_rtx_MINUS (Pmode, size, rounded_size_op);
+	  anti_adjust_stack (temp);
+	  emit_stack_probe (stack_pointer_rtx);
+	}
     }
+
+  /* Adjust back to account for the additional first interval.  */
+  adjust_stack (probe_interval);
 }
-\f
+
 /* Return an rtx representing the register or memory location
    in which a scalar value of data type VALTYPE
    was returned by a function call to function FUNC.
@@ -1569,5 +1774,3 @@ rtx_to_tree_code (enum rtx_code code)
     }
   return ((int) tcode);
 }
-
-#include "gt-explow.h"
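
For reviewers who want to sanity-check the loop structure in the generic
probe_stack_range code above, here is a small self-contained C model of the
variable-size case, mirroring Steps 1-4 (round down, pre-adjust by one
interval, loop on an equality test, probe the residual).  It is illustrative
only: the 4096-byte PROBE_INTERVAL and the probe counter are assumptions for
the sketch, not part of the patch.

```c
#include <stdint.h>

#define PROBE_INTERVAL 4096  /* illustrative: 1 << STACK_CHECK_PROBE_INTERVAL_EXP */

/* Count the probes emitted for a variable SIZE, following Steps 1-4 of
   the loop above.  Offsets grow upward here; the generated code moves
   the stack pointer downward, but the iteration count is the same.  */
int
count_probes (uintptr_t size)
{
  int probes = 0;
  uintptr_t rounded_size = size & -(uintptr_t) PROBE_INTERVAL; /* Step 1 */

  uintptr_t sp = PROBE_INTERVAL;            /* Step 2: SP = SP_0 + interval */
  uintptr_t last_addr = sp + rounded_size;  /* may wrap; hence the == test */

  while (sp != last_addr)                   /* Step 3: equality, wrap-safe */
    {
      sp += PROBE_INTERVAL;
      probes++;                             /* probe at SP */
    }

  if (size != rounded_size)                 /* Step 4: residual probe */
    {
      sp += size - rounded_size;
      probes++;                             /* probe at PROBE_INTERVAL + SIZE */
    }
  return probes;
}
```

With size = 10000 this performs two loop probes plus one residual probe,
i.e. one probe per interval that the allocation touches, which is also what
the unrolled constant-size sequence produces.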
Index: gcc/ira.c
===================================================================
--- gcc/ira.c	(revision 150351)
+++ gcc/ira.c	(working copy)
@@ -1442,6 +1442,9 @@ setup_eliminable_regset (void)
   int need_fp
     = (! flag_omit_frame_pointer
        || (cfun->calls_alloca && EXIT_IGNORE_STACK)
+       /* We need the frame pointer to catch stack overflow exceptions
+	  if the stack pointer is moving.  */
+       || (flag_stack_check && STACK_CHECK_MOVING_SP)
        || crtl->accesses_prior_frames
        || crtl->stack_realign_needed
        || targetm.frame_pointer_required ());
Index: gcc/rtl.h
===================================================================
--- gcc/rtl.h	(revision 150351)
+++ gcc/rtl.h	(working copy)
@@ -1484,7 +1484,8 @@ extern int currently_expanding_to_rtl;
 extern int ceil_log2 (unsigned HOST_WIDE_INT);
 
 /* In explow.c */
-extern void set_stack_check_libfunc (rtx);
+extern GTY(()) rtx stack_check_symbol;
+extern GTY(()) rtx stack_check_libfunc;
 extern HOST_WIDE_INT trunc_int_for_mode	(HOST_WIDE_INT, enum machine_mode);
 extern rtx plus_constant (rtx, HOST_WIDE_INT);
 
@@ -2270,7 +2271,8 @@ enum libcall_type
   LCT_PURE = 2,
   LCT_NORETURN = 3,
   LCT_THROW = 4,
-  LCT_RETURNS_TWICE = 5
+  LCT_MAY_THROW = 5,
+  LCT_RETURNS_TWICE = 6
 };
 
 extern void emit_library_call (rtx, enum libcall_type, enum machine_mode, int,
Index: gcc/system.h
===================================================================
--- gcc/system.h	(revision 150351)
+++ gcc/system.h	(working copy)
@@ -746,7 +746,7 @@ extern void fancy_abort (const char *, i
 	TARGET_ASM_EXCEPTION_SECTION TARGET_ASM_EH_FRAME_SECTION	   \
 	SMALL_ARG_MAX ASM_OUTPUT_SHARED_BSS ASM_OUTPUT_SHARED_COMMON	   \
 	ASM_OUTPUT_SHARED_LOCAL UNALIGNED_WORD_ASM_OP			   \
-	ASM_MAKE_LABEL_LINKONCE
+	ASM_MAKE_LABEL_LINKONCE STACK_CHECK_PROBE_INTERVAL
 
 /* Hooks that are no longer used.  */
  #pragma GCC poison LANG_HOOKS_FUNCTION_MARK LANG_HOOKS_FUNCTION_FREE	\
Index: gcc/Makefile.in
===================================================================
--- gcc/Makefile.in	(revision 150351)
+++ gcc/Makefile.in	(working copy)
@@ -2708,7 +2708,7 @@ expmed.o : expmed.c $(CONFIG_H) $(SYSTEM
    $(TOPLEV_H) $(TM_P_H) langhooks.h $(DF_H) $(TARGET_H)
 explow.o : explow.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(TREE_H) \
    $(FLAGS_H) hard-reg-set.h insn-config.h $(EXPR_H) $(OPTABS_H) $(RECOG_H) \
-   $(TOPLEV_H) $(EXCEPT_H) $(FUNCTION_H) $(GGC_H) $(TM_P_H) langhooks.h gt-explow.h \
+   $(TOPLEV_H) $(EXCEPT_H) $(FUNCTION_H) $(GGC_H) $(TM_P_H) langhooks.h \
    $(TARGET_H) output.h
 optabs.o : optabs.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
    $(TREE_H) $(FLAGS_H) insn-config.h $(EXPR_H) $(OPTABS_H) libfuncs.h \
Index: gcc/config/i386/i386.h
===================================================================
--- gcc/config/i386/i386.h	(revision 150351)
+++ gcc/config/i386/i386.h	(working copy)
@@ -2428,6 +2428,9 @@ struct GTY(()) machine_function {
 #define SYMBOL_REF_DLLEXPORT_P(X) \
 	((SYMBOL_REF_FLAGS (X) & SYMBOL_FLAG_DLLEXPORT) != 0)
 
+/* Define this to be nonzero to use an inclusive OR.  */
+#define STACK_CHECK_PROBE_IOR 1
+
 /* Model costs for vectorizer.  */
 
 /* Cost of conditional branch.  */
Index: gcc/config/i386/linux.h
===================================================================
--- gcc/config/i386/linux.h	(revision 150351)
+++ gcc/config/i386/linux.h	(working copy)
@@ -207,6 +207,12 @@ along with GCC; see the file COPYING3.
 
 #define MD_UNWIND_SUPPORT "config/i386/linux-unwind.h"
 
+/* Define this to be nonzero if static stack checking is supported.  */
+#define STACK_CHECK_STATIC_BUILTIN 1
+
+/* Define this to be nonzero if the stack pointer needs to be moved.  */
+#define STACK_CHECK_MOVING_SP 1
+
 /* This macro may be overridden in i386/k*bsd-gnu.h.  */
 #define REG_NAME(reg) reg
 
Index: gcc/config/i386/i386.md
===================================================================
--- gcc/config/i386/i386.md	(revision 150351)
+++ gcc/config/i386/i386.md	(working copy)
@@ -229,15 +229,16 @@ (define_constants
 (define_constants
   [(UNSPECV_BLOCKAGE		0)
    (UNSPECV_STACK_PROBE		1)
-   (UNSPECV_EMMS		2)
-   (UNSPECV_LDMXCSR		3)
-   (UNSPECV_STMXCSR		4)
-   (UNSPECV_FEMMS		5)
-   (UNSPECV_CLFLUSH		6)
-   (UNSPECV_ALIGN		7)
-   (UNSPECV_MONITOR		8)
-   (UNSPECV_MWAIT		9)
-   (UNSPECV_CMPXCHG		10)
+   (UNSPECV_STACK_PROBE_INLINE  2)
+   (UNSPECV_EMMS		3)
+   (UNSPECV_LDMXCSR		4)
+   (UNSPECV_STMXCSR		5)
+   (UNSPECV_FEMMS		6)
+   (UNSPECV_CLFLUSH		7)
+   (UNSPECV_ALIGN		8)
+   (UNSPECV_MONITOR		9)
+   (UNSPECV_MWAIT		10)
+   (UNSPECV_CMPXCHG		11)
    (UNSPECV_XCHG		12)
    (UNSPECV_LOCK		13)
    (UNSPECV_PROLOGUE_USE	14)
@@ -20963,6 +20964,27 @@ (define_expand "allocate_stack"
   DONE;
 })
 
+(define_insn "adjust_stack_and_probe<P:mode>"
+  [(unspec_volatile:P [(match_operand:P 0 "const_int_operand" "n")]
+    UNSPECV_STACK_PROBE_INLINE)
+   (set (reg:P SP_REG) (minus:P (reg:P SP_REG) (match_dup 0)))
+   (clobber (match_operand:P 1 "general_operand" "=rn"))
+   (clobber (reg:CC FLAGS_REG))
+   (clobber (mem:BLK (scratch)))]
+  ""
+  "* return output_adjust_stack_and_probe (operands[0], operands[1]);"
+  [(set_attr "type" "multi")])
+
+(define_insn "probe_stack_range<P:mode>"
+  [(unspec_volatile:P [(match_operand:P 0 "const_int_operand" "n")
+		       (match_operand:P 1 "const_int_operand" "n")]
+    UNSPECV_STACK_PROBE_INLINE)
+   (clobber (match_operand:P 2 "general_operand" "=rn"))
+   (clobber (reg:CC FLAGS_REG))]
+  ""
+  "* return output_probe_stack_range (operands[0], operands[1], operands[2]);"
+  [(set_attr "type" "multi")])
+
 (define_expand "builtin_setjmp_receiver"
   [(label_ref (match_operand 0 "" ""))]
   "!TARGET_64BIT && flag_pic"
@@ -21466,7 +21488,9 @@ (define_peephole2
                      [(match_dup 0)
                       (match_operand:SI 1 "nonmemory_operand" "")]))
               (clobber (reg:CC FLAGS_REG))])]
-  "optimize_insn_for_speed_p () && ! TARGET_READ_MODIFY_WRITE"
+  "optimize_insn_for_speed_p () && ! TARGET_READ_MODIFY_WRITE
+   /* Do not split stack checking probes.  */
+   && GET_CODE (operands[3]) != IOR && operands[1] != const0_rtx"
   [(set (match_dup 2) (match_dup 0))
    (parallel [(set (match_dup 2)
                    (match_op_dup 3 [(match_dup 2) (match_dup 1)]))
@@ -21481,7 +21505,9 @@ (define_peephole2
                      [(match_operand:SI 1 "nonmemory_operand" "")
                       (match_dup 0)]))
               (clobber (reg:CC FLAGS_REG))])]
-  "optimize_insn_for_speed_p () && ! TARGET_READ_MODIFY_WRITE"
+  "optimize_insn_for_speed_p () && ! TARGET_READ_MODIFY_WRITE
+   /* Do not split stack checking probes.  */
+   && GET_CODE (operands[3]) != IOR && operands[1] != const0_rtx"
   [(set (match_dup 2) (match_dup 0))
    (parallel [(set (match_dup 2)
                    (match_op_dup 3 [(match_dup 1) (match_dup 2)]))
@@ -22249,6 +22275,14 @@ (define_insn "trap"
   { return ASM_SHORT "0x0b0f"; }
   [(set_attr "length" "2")])
 
+(define_insn "*cond_trap"
+  [(trap_if (match_operator 0 "comparison_operator"
+             [(reg:CC FLAGS_REG) (const_int 0)])
+	    (const_int 6))]
+  ""
+  "* return output_cond_trap (operands[0]);"
+  [(set_attr "length" "4")])
+
 (define_expand "sse_prologue_save"
   [(parallel [(set (match_operand:BLK 0 "" "")
 		   (unspec:BLK [(reg:DI 21)
Index: gcc/config/i386/sol2.h
===================================================================
--- gcc/config/i386/sol2.h	(revision 150351)
+++ gcc/config/i386/sol2.h	(working copy)
@@ -113,6 +113,9 @@ along with GCC; see the file COPYING3.
 #undef X86_FILE_START_VERSION_DIRECTIVE
 #define X86_FILE_START_VERSION_DIRECTIVE false
 
+/* Define this to be nonzero if static stack checking is supported.  */
+#define STACK_CHECK_STATIC_BUILTIN 1
+
 /* Only recent versions of Solaris 11 ld properly support hidden .gnu.linkonce
    sections, so don't use them.  */
 #ifndef TARGET_GNU_LD
Index: gcc/config/i386/vxworksae.h
===================================================================
--- gcc/config/i386/vxworksae.h	(revision 150351)
+++ gcc/config/i386/vxworksae.h	(working copy)
@@ -24,3 +24,8 @@ along with GCC; see the file COPYING3.
   do						\
     builtin_define ("CPU=SIMNT");		\
   while (0)
+
+/* This platform supports the probing method of stack checking and
+   requires 4K of space for executing a possible last-chance handler.  */
+#undef STACK_CHECK_PROTECT
+#define STACK_CHECK_PROTECT 4096
Index: gcc/config/i386/lynx.h
===================================================================
--- gcc/config/i386/lynx.h	(revision 150351)
+++ gcc/config/i386/lynx.h	(working copy)
@@ -88,3 +88,6 @@ along with GCC; see the file COPYING3.
    TLS is detected by configure.  We undefine it here.  */
 
 #undef HAVE_AS_TLS
+
+/* Define this to be nonzero if static stack checking is supported.  */
+#define STACK_CHECK_STATIC_BUILTIN 1
Index: gcc/config/i386/linux64.h
===================================================================
--- gcc/config/i386/linux64.h	(revision 150351)
+++ gcc/config/i386/linux64.h	(working copy)
@@ -110,6 +110,12 @@ see the files COPYING3 and COPYING.RUNTI
 
 #define MD_UNWIND_SUPPORT "config/i386/linux-unwind.h"
 
+/* Define this to be nonzero if static stack checking is supported.  */
+#define STACK_CHECK_STATIC_BUILTIN 1
+
+/* Define this to be nonzero if the stack pointer needs to be moved.  */
+#define STACK_CHECK_MOVING_SP 1
+
 /* This macro may be overridden in i386/k*bsd-gnu.h.  */
 #define REG_NAME(reg) reg
 
Index: gcc/config/i386/linux-unwind.h
===================================================================
--- gcc/config/i386/linux-unwind.h	(revision 150351)
+++ gcc/config/i386/linux-unwind.h	(working copy)
@@ -172,6 +172,25 @@ x86_fallback_frame_state (struct _Unwind
   fs->signal_frame = 1;
   return _URC_NO_REASON;
 }
+
+#define MD_FROB_UPDATE_CONTEXT x86_frob_update_context
+
+/* Fix up for kernels that have vDSO, but don't have S flag in it.  */
+
+static void
+x86_frob_update_context (struct _Unwind_Context *context,
+			 _Unwind_FrameState *fs ATTRIBUTE_UNUSED)
+{
+  unsigned char *pc = context->ra;
+
+  /* movl $__NR_rt_sigreturn,%eax ; {int $0x80 | syscall}  */
+  if (*(unsigned char *)(pc+0) == 0xb8
+      && *(unsigned int *)(pc+1) == 173
+      && (*(unsigned short *)(pc+5) == 0x80cd
+	  || *(unsigned short *)(pc+5) == 0x050f))
+    _Unwind_SetSignalFrame (context, 1);
+}
+
 #endif /* not glibc 2.0 */
 #endif /* ifdef __x86_64__  */
 #endif /* ifdef inhibit_libc  */
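
The casted byte test above can be hard to read, so here is a hedged
standalone C model of what x86_frob_update_context accepts (assumptions: a
little-endian host, as on x86, and memcpy in place of the patch's unaligned
dereferences; the function name is mine, not the patch's).

```c
#include <string.h>

/* Model of the pattern matched by x86_frob_update_context: the 32-bit
   signal-return trampoline "movl $__NR_rt_sigreturn, %eax", i.e. bytes
   b8 ad 00 00 00 (__NR_rt_sigreturn is 173 on 32-bit Linux), followed by
   either "int $0x80" (cd 80) or "syscall" (0f 05).  Little-endian host
   assumed when reassembling the immediate and the two-byte opcode.  */
int
is_sigreturn_trampoline (const unsigned char *pc)
{
  unsigned int imm;
  unsigned short tail;

  if (pc[0] != 0xb8)            /* movl $imm32, %eax */
    return 0;
  memcpy (&imm, pc + 1, 4);     /* the 32-bit immediate */
  memcpy (&tail, pc + 5, 2);    /* the two opcode bytes that follow */
  return imm == 173 && (tail == 0x80cd || tail == 0x050f);
}
```
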
Index: gcc/config/i386/i386-protos.h
===================================================================
--- gcc/config/i386/i386-protos.h	(revision 150351)
+++ gcc/config/i386/i386-protos.h	(working copy)
@@ -24,6 +24,7 @@ extern void override_options (bool);
 extern void optimization_options (int, int);
 extern void ix86_conditional_register_usage (void);
 
+extern bool ix86_target_stack_probe (void);
 extern int ix86_can_use_return_insn_p (void);
 extern void ix86_setup_frame_addresses (void);
 
@@ -70,6 +71,9 @@ extern const char *output_387_binary_op
 extern const char *output_387_reg_move (rtx, rtx*);
 extern const char *output_fix_trunc (rtx, rtx*, int);
 extern const char *output_fp_compare (rtx, rtx*, int, int);
+extern const char *output_adjust_stack_and_probe (rtx, rtx);
+extern const char *output_cond_trap (rtx);
+extern const char *output_probe_stack_range (rtx, rtx, rtx);
 
 extern void ix86_expand_clear (rtx);
 extern void ix86_expand_move (enum machine_mode, rtx[]);
Index: gcc/config/i386/freebsd.h
===================================================================
--- gcc/config/i386/freebsd.h	(revision 150351)
+++ gcc/config/i386/freebsd.h	(working copy)
@@ -138,3 +138,6 @@ along with GCC; see the file COPYING3.
    compiler get the contents of <float.h> and std::numeric_limits correct.  */
 #undef TARGET_96_ROUND_53_LONG_DOUBLE
 #define TARGET_96_ROUND_53_LONG_DOUBLE (!TARGET_64BIT)
+
+/* Define this to be nonzero if static stack checking is supported.  */
+#define STACK_CHECK_STATIC_BUILTIN 1
Index: gcc/config/i386/vxworks.h
===================================================================
--- gcc/config/i386/vxworks.h	(revision 150351)
+++ gcc/config/i386/vxworks.h	(working copy)
@@ -76,3 +76,11 @@ along with GCC; see the file COPYING3.
 /* We cannot use PC-relative accesses for VxWorks PIC because there is no
    fixed gap between segments.  */
 #undef ASM_PREFERRED_EH_DATA_FORMAT
+
+/* Define this to be nonzero if static stack checking is supported.  */
+#define STACK_CHECK_STATIC_BUILTIN 1
+
+/* This platform supports the probing method of stack checking (RTP mode)
+   and the ZCX mechanism.  8K is reserved on the stack to propagate
+   exceptions reliably in case of stack overflow.  */
+#define STACK_CHECK_PROTECT 8192
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c	(revision 150351)
+++ gcc/config/i386/i386.c	(working copy)
@@ -1881,6 +1881,7 @@ static struct machine_function * ix86_in
 static rtx ix86_function_value (const_tree, const_tree, bool);
 static int ix86_function_regparm (const_tree, const_tree);
 static void ix86_compute_frame_layout (struct ix86_frame *);
+static rtx ix86_expand_int_compare (enum rtx_code, rtx, rtx);
 static bool ix86_expand_vector_init_one_nonzero (bool, enum machine_mode,
 						 rtx, rtx, int);
 static void ix86_add_new_builtins (int);
@@ -7936,6 +7937,10 @@ ix86_compute_frame_layout (struct ix86_f
   else
     frame->save_regs_using_mov = false;
 
+  /* If static stack checking is enabled and done with probes, the registers
+     need to be saved before allocating the frame.  */
+  if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK && !stack_check_symbol)
+    frame->save_regs_using_mov = false;
 
   /* Skip return address and saved base pointer.  */
   offset = frame_pointer_needed ? UNITS_PER_WORD * 2 : UNITS_PER_WORD;
@@ -8331,6 +8336,442 @@ ix86_internal_arg_pointer (void)
   return virtual_incoming_args_rtx;
 }
 
+struct scratch_reg {
+  rtx reg;
+  bool saved;
+};
+
+/* Return a short-lived scratch register for use on function entry.
+   In 32-bit mode, it is valid only after the registers are saved
+   in the prologue.  This register must be released by means of
+   release_scratch_register_on_entry once it is dead.  */
+
+static void
+get_scratch_register_on_entry (struct scratch_reg *sr)
+{
+  int regno;
+
+  sr->saved = false;
+
+  if (TARGET_64BIT)
+    regno = FIRST_REX_INT_REG + 3; /* r11 */
+  else
+    {
+      tree decl = current_function_decl, fntype = TREE_TYPE (decl);
+      bool fastcall_p
+	= lookup_attribute ("fastcall", TYPE_ATTRIBUTES (fntype)) != NULL_TREE;
+      int regparm = ix86_function_regparm (fntype, decl);
+      int drap_regno
+	= crtl->drap_reg ? REGNO (crtl->drap_reg) : INVALID_REGNUM;
+
+      /* 'fastcall' sets regparm to 2 and uses ecx+edx.  */
+      if ((regparm < 1 || fastcall_p) && drap_regno != 0)
+	regno = 0;
+      else if (regparm < 2 && drap_regno != 1)
+	regno = 1;
+      else if (regparm < 3 && !fastcall_p && drap_regno != 2
+	       /* ecx is the static chain register.  */
+	       && (!decl_function_context (decl)
+		   || DECL_NO_STATIC_CHAIN (decl)))
+	regno = 2;
+      else if (ix86_save_reg (3, true))
+	regno = 3;
+      else if (ix86_save_reg (4, true))
+	regno = 4;
+      else if (ix86_save_reg (5, true))
+	regno = 5;
+      else
+	{
+	  regno = (drap_regno == 0 ? 1 : 0);
+	  sr->saved = true;
+	}
+    }
+
+  sr->reg = gen_rtx_REG (Pmode, regno);
+  if (sr->saved)
+    {
+      rtx insn = emit_insn (gen_push (sr->reg));
+      RTX_FRAME_RELATED_P (insn) = 1;
+    }
+}
+
+/* Release a scratch register obtained from the preceding function.  */
+
+static void
+release_scratch_register_on_entry (struct scratch_reg *sr)
+{
+  if (sr->saved)
+    {
+      rtx insn, x;
+
+      if (TARGET_64BIT)
+	insn = emit_insn (gen_popdi1 (sr->reg));
+      else
+	insn = emit_insn (gen_popsi1 (sr->reg));
+      RTX_FRAME_RELATED_P (insn) = 1;
+
+      /* The RTX_FRAME_RELATED_P mechanism doesn't know about pop.  */
+      x = plus_constant (stack_pointer_rtx, UNITS_PER_WORD);
+      x = gen_rtx_SET (VOIDmode, stack_pointer_rtx, x);
+      add_reg_note (insn, REG_CFA_ADJUST_CFA, x);
+    }
+}
+
+/* The run-time loop is made up of 8 insns in the generic case while the
+   compile-time unrolled sequence is made up of N insns for N intervals.  */
+#define PROBE_INTERVAL (1 << STACK_CHECK_PROBE_INTERVAL_EXP)
+#define SMALL_INTERVAL(size) ((size) <= 8 * PROBE_INTERVAL)
+
+/* Output one probe.  */
+
+static inline void
+output_probe_op (void)
+{
+  fputs (TARGET_64BIT ? "\torq\t$0, " : "\torl\t$0, ", asm_out_file);
+}
+
+/* Adjust the stack by SIZE bytes and output one probe.  */
+
+static void
+output_adjust_stack_and_probe_op (HOST_WIDE_INT size)
+{
+  fprintf (asm_out_file, "\tsub\t$"HOST_WIDE_INT_PRINT_DEC",", size);
+  print_reg (stack_pointer_rtx, 0, asm_out_file);
+  fputc ('\n', asm_out_file);
+  output_probe_op ();
+  fputc ('(', asm_out_file);
+  print_reg (stack_pointer_rtx, 0, asm_out_file);
+  fputs (")\n", asm_out_file);
+}
+
+/* Adjust the stack by SIZE bytes while probing it.  Note that we skip
+   the probe for the first interval and instead probe one interval past
+   the specified size in order to maintain a protection area.  */
+
+const char *
+output_adjust_stack_and_probe (rtx size_rtx, rtx reg)
+{
+  static int labelno = 0;
+  HOST_WIDE_INT size = INTVAL (size_rtx);
+  HOST_WIDE_INT rounded_size;
+  char loop_lab[32], end_lab[32];
+
+  /* See if we have a constant small number of probes to generate.  If so,
+     that's the easy case.  */
+  if (SMALL_INTERVAL (size))
+    {
+      HOST_WIDE_INT i;
+      bool first_probe = true;
+
+      /* Adjust SP and probe to PROBE_INTERVAL + N * PROBE_INTERVAL for
+	 values of N from 1 until it exceeds SIZE.  If only one probe is
+	 needed, this will not generate any code.  Then adjust and probe
+	 to PROBE_INTERVAL + SIZE.  */
+      for (i = PROBE_INTERVAL; i < size; i += PROBE_INTERVAL)
+	{
+	  if (first_probe)
+	    {
+	      output_adjust_stack_and_probe_op (2 * PROBE_INTERVAL);
+	      first_probe = false;
+	    }
+	  else
+	    output_adjust_stack_and_probe_op (PROBE_INTERVAL);
+	}
+
+      if (first_probe)
+	output_adjust_stack_and_probe_op (size + PROBE_INTERVAL);
+      else
+	output_adjust_stack_and_probe_op (size + PROBE_INTERVAL - i);
+    }
+
+  /* In the variable case, do the same as above, but in a loop.  Note that we
+     must be extra careful with variables wrapping around because we might be
+     at the very top (or the very bottom) of the address space and we have to
+     be able to handle this case properly; in particular, we use an equality
+     test for the loop condition.  */
+  else
+    {
+      /* Step 1: round SIZE to the previous multiple of the interval.  */
+      rounded_size = size & -PROBE_INTERVAL;
+
+
+      /* Step 2: compute initial and final value of the loop counter.  */
+
+      /* SP = SP_0 + PROBE_INTERVAL.  */
+      fprintf (asm_out_file, "\tsub\t$%d, ", PROBE_INTERVAL);
+      print_reg (stack_pointer_rtx, 0, asm_out_file);
+
+      /* LAST_ADDR = SP_0 + PROBE_INTERVAL + ROUNDED_SIZE.  */
+      fprintf (asm_out_file, "\n\tmov\t$-"HOST_WIDE_INT_PRINT_DEC", ",
+	       rounded_size);
+      print_reg (reg, 0, asm_out_file);
+      fputs ("\n\tadd\t", asm_out_file);
+      print_reg (stack_pointer_rtx, 0, asm_out_file);
+      fputs (", ", asm_out_file);
+      print_reg (reg, 0, asm_out_file);
+      fputc ('\n', asm_out_file);
+
+
+      /* Step 3: the loop
+
+	  while (SP != LAST_ADDR)
+	    {
+	      SP = SP + PROBE_INTERVAL
+	      probe at SP
+	    }
+
+	 adjusts SP and probes to PROBE_INTERVAL + N * PROBE_INTERVAL for
+	 values of N from 1 until it is equal to ROUNDED_SIZE.  */
+
+      ASM_GENERATE_INTERNAL_LABEL (loop_lab, "LPSRL", labelno);
+      ASM_OUTPUT_LABEL (asm_out_file, loop_lab);
+
+      /* Jump to END_LAB if SP == LAST_ADDR.  */
+      fputs ("\tcmp\t", asm_out_file);
+      print_reg (stack_pointer_rtx, 0, asm_out_file);
+      fputs (", ", asm_out_file);
+      print_reg (reg, 0, asm_out_file);
+      fputc ('\n', asm_out_file);
+      ASM_GENERATE_INTERNAL_LABEL (end_lab, "LPSRE", labelno++);
+      fputs ("\tje\t", asm_out_file); assemble_name (asm_out_file, end_lab);
+      fputc ('\n', asm_out_file);
+
+      /* SP = SP + PROBE_INTERVAL and probe at SP.  */
+      output_adjust_stack_and_probe_op (PROBE_INTERVAL);
+
+      fprintf (asm_out_file, "\tjmp\t"); assemble_name (asm_out_file, loop_lab);
+      fputc ('\n', asm_out_file);
+
+      ASM_OUTPUT_LABEL (asm_out_file, end_lab);
+
+
+      /* Step 4: adjust SP and probe to PROBE_INTERVAL + SIZE if we cannot
+	 assert at compile-time that SIZE is equal to ROUNDED_SIZE.  */
+      if (size != rounded_size)
+	output_adjust_stack_and_probe_op (size - rounded_size);
+    }
+
+  /* Adjust back to account for the additional first interval.  */
+  fprintf (asm_out_file, "\tadd\t$%d, ", PROBE_INTERVAL);
+  print_reg (stack_pointer_rtx, 0, asm_out_file);
+  fputc ('\n', asm_out_file);
+
+  return "";
+}
+
+/* Wrapper around gen_adjust_stack_and_probe.  */
+
+static rtx
+ix86_gen_adjust_stack_and_probe (rtx op0, rtx op1)
+{
+  if (TARGET_64BIT)
+    return gen_adjust_stack_and_probedi (op0, op1);
+  else
+    return gen_adjust_stack_and_probesi (op0, op1);
+}
+
+/* Emit code to adjust the stack by SIZE bytes while probing it.  */
+
+static void
+ix86_adjust_stack_and_probe (HOST_WIDE_INT size)
+{
+  rtx size_rtx = GEN_INT (size);
+
+  if (SMALL_INTERVAL (size))
+    emit_insn (ix86_gen_adjust_stack_and_probe (size_rtx, const0_rtx));
+  else
+    {
+      struct scratch_reg sr;
+      get_scratch_register_on_entry (&sr);
+      emit_insn (ix86_gen_adjust_stack_and_probe (size_rtx, sr.reg));
+      release_scratch_register_on_entry (&sr);
+    }
+
+  gcc_assert (ix86_cfa_state->reg != stack_pointer_rtx);
+
+  /* Make sure nothing is scheduled before we are done.  */
+  emit_insn (gen_blockage ());
+}
+
+/* Output a conditional trap.  COND is the condition code.  */
+
+const char *
+output_cond_trap (rtx cond)
+{
+  static int labelno = 0;
+  char ok_lab[32];
+
+  ASM_GENERATE_INTERNAL_LABEL (ok_lab, "LOCT", labelno++);
+
+  fputs ("\tj", asm_out_file); print_operand (asm_out_file, cond, 'c');
+  fputs ("\t", asm_out_file); assemble_name (asm_out_file, ok_lab);
+  fputs ("\n" ASM_SHORT "0x0b0f\n", asm_out_file);
+  ASM_OUTPUT_LABEL (asm_out_file, ok_lab);
+
+  return "";
+}
+
+/* Output one probe at OFFSET + INDEX from the current stack pointer.  */
+
+static void
+output_probe_stack_range_op (HOST_WIDE_INT offset, rtx index)
+{
+  output_probe_op ();
+  if (offset)
+    fprintf (asm_out_file, "-"HOST_WIDE_INT_PRINT_DEC, offset);
+  fputc ('(', asm_out_file);
+  print_reg (stack_pointer_rtx, 0, asm_out_file);
+  if (index)
+    {
+      fputc (',', asm_out_file);
+      print_reg (index, 0, asm_out_file);
+      fputs (",1", asm_out_file);
+    }
+  fputs (")\n", asm_out_file);
+}
+
+/* Probe a range of stack addresses from FIRST to FIRST+SIZE, inclusive.
+   These are offsets from the current stack pointer.  */
+
+const char *
+output_probe_stack_range (rtx first_rtx, rtx size_rtx, rtx reg)
+{
+  static int labelno = 0;
+  HOST_WIDE_INT first = INTVAL (first_rtx);
+  HOST_WIDE_INT size = INTVAL (size_rtx);
+  HOST_WIDE_INT rounded_size;
+  char loop_lab[32], end_lab[32];
+
+  /* See if we have a constant small number of probes to generate.  If so,
+     that's the easy case.  */
+  if (SMALL_INTERVAL (size))
+    {
+      HOST_WIDE_INT i;
+
+      /* Probe at FIRST + N * PROBE_INTERVAL for values of N from 1 until
+	 it exceeds SIZE.  If only one probe is needed, this will not
+	 generate any code.  Then probe at FIRST + SIZE.  */
+      for (i = PROBE_INTERVAL; i < size; i += PROBE_INTERVAL)
+	output_probe_stack_range_op (first + i, NULL_RTX);
+
+      output_probe_stack_range_op (first + size, NULL_RTX);
+    }
+
+  /* Otherwise, do the same as above, but in a loop.  Note that we must be
+     extra careful with variables wrapping around because we might be at
+     the very top (or the very bottom) of the address space and we have
+     to be able to handle this case properly; in particular, we use an
+     equality test for the loop condition.  */
+  else
+    {
+      /* Step 1: round SIZE to the previous multiple of the interval.  */
+      rounded_size = size & -PROBE_INTERVAL;
+
+
+      /* Step 2: compute initial and final value of the loop counter.  */
+
+      /* TEST_OFFSET = FIRST.  */
+      fprintf (asm_out_file, "\tmov\t$-"HOST_WIDE_INT_PRINT_DEC", ", first);
+      print_reg (reg, 0, asm_out_file);
+      fputc ('\n', asm_out_file);
+
+      /* LAST_OFFSET = FIRST + ROUNDED_SIZE.  */
+
+
+      /* Step 3: the loop
+
+	 while (TEST_ADDR != LAST_ADDR)
+	   {
+	     TEST_ADDR = TEST_ADDR + PROBE_INTERVAL
+	     probe at TEST_ADDR
+	   }
+
+	 probes at FIRST + N * PROBE_INTERVAL for values of N from 1
+	 until it is equal to ROUNDED_SIZE.  */
+
+      ASM_GENERATE_INTERNAL_LABEL (loop_lab, "LPSRL", labelno);
+      ASM_OUTPUT_LABEL (asm_out_file, loop_lab);
+
+       /* Jump to END_LAB if TEST_ADDR == LAST_ADDR.  */
+      fprintf (asm_out_file, "\tcmp\t$-"HOST_WIDE_INT_PRINT_DEC", ",
+	       first + rounded_size);
+      print_reg (reg, 0, asm_out_file);
+      fputc ('\n', asm_out_file);
+      ASM_GENERATE_INTERNAL_LABEL (end_lab, "LPSRE", labelno++);
+      fputs ("\tje\t", asm_out_file); assemble_name (asm_out_file, end_lab);
+      fputc ('\n', asm_out_file);
+
+      /* TEST_ADDR = TEST_ADDR + PROBE_INTERVAL.  */
+      fprintf (asm_out_file, "\tsub\t$%d, ", PROBE_INTERVAL);
+      print_reg (reg, 0, asm_out_file);
+      fputc ('\n', asm_out_file);
+
+      /* Probe at TEST_ADDR.  */
+      output_probe_stack_range_op (0, reg);
+
+      fprintf (asm_out_file, "\tjmp\t"); assemble_name (asm_out_file, loop_lab);
+      fputc ('\n', asm_out_file);
+
+      ASM_OUTPUT_LABEL (asm_out_file, end_lab);
+
+
+      /* Step 4: probe at FIRST + SIZE if we cannot assert at compile-time
+	 that SIZE is equal to ROUNDED_SIZE.  */
+      if (size != rounded_size)
+	output_probe_stack_range_op (size - rounded_size, reg);
+    }
+
+  return "";
+}
+
+/* Wrapper around gen_probe_stack_range.  */
+
+static rtx
+ix86_gen_probe_stack_range (rtx op0, rtx op1, rtx op2)
+{
+  if (TARGET_64BIT)
+    return gen_probe_stack_rangedi (op0, op1, op2);
+  else
+    return gen_probe_stack_rangesi (op0, op1, op2);
+}
+
+/* Emit code to probe a range of stack addresses from FIRST to FIRST+SIZE,
+   inclusive.  These are offsets from the current stack pointer.  */
+
+static void
+ix86_emit_probe_stack_range (HOST_WIDE_INT first, HOST_WIDE_INT size)
+{
+  gcc_assert (!stack_check_libfunc); /* Not implemented.  */
+
+  if (stack_check_symbol)
+    {
+      struct scratch_reg sr;
+      rtx res;
+
+      get_scratch_register_on_entry (&sr);
+      emit_move_insn (sr.reg,
+		      plus_constant (stack_pointer_rtx, -(first + size)));
+      res = ix86_expand_int_compare (LTU, sr.reg,
+				     gen_rtx_MEM (Pmode, stack_check_symbol));
+      emit_insn (gen_rtx_TRAP_IF (VOIDmode, res, GEN_INT (6)));
+      release_scratch_register_on_entry (&sr);
+    }
+  else if (SMALL_INTERVAL (size))
+     emit_insn (ix86_gen_probe_stack_range (GEN_INT (first), GEN_INT (size),
+					    const0_rtx));
+  else
+    {
+      struct scratch_reg sr;
+      get_scratch_register_on_entry (&sr);
+      emit_insn (ix86_gen_probe_stack_range (GEN_INT (first), GEN_INT (size),
+					     sr.reg));
+      release_scratch_register_on_entry (&sr);
+    }
+
+  /* Make sure nothing is scheduled before we are done.  */
+  emit_insn (gen_blockage ());
+}
+
 /* Finalize stack_realign_needed flag, which will guide prologue/epilogue
    to be generated in correct form.  */
 static void 
@@ -8460,6 +8901,31 @@ ix86_expand_prologue (void)
   else
     allocate += frame.nregs * UNITS_PER_WORD;
 
+  /* The stack has already been decremented by the instruction calling us
+     so we need to probe unconditionally to preserve the protection area.  */
+  if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK)
+    {
+      /* We expect the registers to be saved when probes are used.  */
+      gcc_assert (!frame.save_regs_using_mov || stack_check_symbol);
+
+      if (STACK_CHECK_MOVING_SP)
+	{
+	  ix86_adjust_stack_and_probe (allocate);
+	  allocate = 0;
+	}
+      else
+	{
+	  const HOST_WIDE_INT max_size = 0x7fffffff - STACK_CHECK_PROTECT;
+	  HOST_WIDE_INT size = allocate;
+
+	  /* Don't bother probing more than 2 GB, this is easier.  */
+	  if (size > max_size)
+	    size = max_size;
+
+	  ix86_emit_probe_stack_range (STACK_CHECK_PROTECT, size);
+	}
+    }
+
   /* When using red zone we may start register saving before allocating
      the stack frame saving one cycle of the prologue. However I will
      avoid doing this if I am going to have to probe the stack since
Index: libada/Makefile.in
===================================================================
--- libada/Makefile.in	(revision 150351)
+++ libada/Makefile.in	(working copy)
@@ -56,6 +56,8 @@ WARN_CFLAGS = @warn_cflags@
 
 TARGET_LIBGCC2_CFLAGS=
 GNATLIBCFLAGS= -g -O2
+GNATLIBCFLAGS_FOR_C = $(GNATLIBCFLAGS) $(TARGET_LIBGCC2_CFLAGS) -fexceptions \
+	-DIN_RTS @have_getipinfo@
 
 # Get target-specific overrides for TARGET_LIBGCC2_CFLAGS.
 host_subdir = @host_subdir@
@@ -78,6 +80,7 @@ LIBADA_FLAGS_TO_PASS = \
         "SHELL=$(SHELL)" \
         "GNATLIBFLAGS=$(GNATLIBFLAGS) $(MULTIFLAGS)" \
         "GNATLIBCFLAGS=$(GNATLIBCFLAGS) $(MULTIFLAGS)" \
+        "GNATLIBCFLAGS_FOR_C=$(GNATLIBCFLAGS_FOR_C) $(MULTIFLAGS)" \
         "TARGET_LIBGCC2_CFLAGS=$(TARGET_LIBGCC2_CFLAGS)" \
         "THREAD_KIND=$(THREAD_KIND)" \
         "TRACE=$(TRACE)" \
Index: libada/configure.ac
===================================================================
--- libada/configure.ac	(revision 150351)
+++ libada/configure.ac	(working copy)
@@ -18,6 +18,7 @@
 sinclude(../config/acx.m4)
 sinclude(../config/multi.m4)
 sinclude(../config/override.m4)
+sinclude(../config/unwind_ipinfo.m4)
 
 AC_INIT
 AC_PREREQ([2.59])
@@ -130,6 +131,14 @@ else
 fi
 AC_SUBST([default_gnatlib_target])
 
+# Check for _Unwind_GetIPInfo.
+GCC_CHECK_UNWIND_GETIPINFO
+have_getipinfo=
+if test x$have_unwind_getipinfo = xyes; then
+  have_getipinfo=-DHAVE_GETIPINFO
+fi
+AC_SUBST(have_getipinfo)
+
 AC_PROG_CC
 warn_cflags=
 if test "x$GCC" = "xyes"; then

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Patch] New -fstack-check implementation (2/n)
  2009-08-04 11:38 [Patch] New -fstack-check implementation (2/n) Eric Botcazou
@ 2009-09-02  0:46 ` Ian Lance Taylor
  2009-09-02  7:46   ` Eric Botcazou
  2009-09-29 23:25   ` Eric Botcazou
  0 siblings, 2 replies; 11+ messages in thread
From: Ian Lance Taylor @ 2009-09-02  0:46 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches

Eric Botcazou <ebotcazou@adacore.com> writes:

+@defmac STACK_CHECK_PROBE_IOR
+An integer which is nonzero if GCC should perform the stack probe
+as an inclusive OR instruction.  The default is zero.
 @end defmac

I know this follows the existing pattern of STACK_CHECK_PROBE_LOAD, but
I think it would be better to use an insn.  Why not introduce a
check_stack_probe insn?  Or, since your patch is eliminating the
check_stack insn, why not reuse that?


> +@defmac STACK_CHECK_MOVING_SP
> +An integer which is nonzero if GCC should move the stack pointer during
> +stack checking.  This can be necessary on systems where the stack pointer
> +contains the bottom address of the memory area accessible to the executing
> +thread at any point in time.  In this situation an alternate signal stack
> +is required in order to be able to recover from a stack overflow.
> +The default value of this macro is zero.
> +@end defmac

I find this documentation to be somewhat cryptic.  If I understand
correctly, when this macro is defined, gcc will adjust the stack pointer
page by page when doing probes.  Please say that in the doc.  It's not
clear to me why the Linux kernel requires this--can you expand on that?
And why a target macro rather than a target hook for something new?


> -  /* Enable GNAT stack checking method if needed */
> -  if (!Stack_Check_Probes_On_Target)
> -    set_stack_check_libfunc (gen_rtx_SYMBOL_REF (Pmode, "_gnat_stack_check"));
> +  /* Enable appropriate stack checking method.  */
> +  if (Stack_Check_Probes_On_Target)
> +    ;
> +  else if (Stack_Check_Limits_On_Target)
> +    stack_check_symbol = gen_rtx_SYMBOL_REF (Pmode, "__gnat_stack_limit");
> +  else
> +    stack_check_libfunc = gen_rtx_SYMBOL_REF (Pmode, "_gnat_stack_check");

Letting the frontend set a global variable in the middle end seems to me
to be an ugly interface.  Why did you change away from the
set_stack_check_libfunc function?

Much more seriously, I don't see how this can work when using LTO.  In
LTO, by the time we expand stack checking, the frontend is gone.  This
needs to work differently.


>  void
>  probe_stack_range (HOST_WIDE_INT first, rtx size)
>  {
> +#ifdef SPARC_STACK_BIAS
> +  /* The probe offsets are counted negatively whereas the stack bias is
> +     counted positively.  */
> +  first -= SPARC_STACK_BIAS;
> +#endif

This does not look good in target independent code.  This needs to
become an officially named and documented target hook.


> +#if defined(HAVE_conditional_trap)
> +      emit_insn (gen_cond_trap (LTU, avail, req, const0_rtx));
> +#elif defined(HAVE_trap)

It's not enough to check the #ifdef; you also have to check available
with if ().


> +/* Define this to be nonzero to use an inclusive OR.  */
> +#define STACK_CHECK_PROBE_IOR 1

Don't write this style of comment in a tm.h file.  That's what the docs
are for.  Instead, write a comment saying why this should be defined to
1 here.


> +/* Define this to be nonzero if static stack checking is supported.  */
> +#define STACK_CHECK_STATIC_BUILTIN 1
> +
> +/* Define this to be nonzero if the stack pointer needs to be moved.  */
> +#define STACK_CHECK_MOVING_SP 1

Same here.


Ian


* Re: [Patch] New -fstack-check implementation (2/n)
  2009-09-02  0:46 ` Ian Lance Taylor
@ 2009-09-02  7:46   ` Eric Botcazou
  2009-09-29 23:25   ` Eric Botcazou
  1 sibling, 0 replies; 11+ messages in thread
From: Eric Botcazou @ 2009-09-02  7:46 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc-patches

> I know this follows the existing pattern of STACK_CHECK_PROBE_LOAD, but
> I think it would be better to use an insn.  Why not introduce a
> check_stack_probe insn?  Or, since your patch is eliminating the
> check_stack insn, why not reuse that?

Yes, I agree that the STACK_CHECK_PROBE_* interface could be changed.

> I find this documentation to be somewhat cryptic.  If I understand
> correctly, when this macro is defined, gcc will adjust the stack pointer
> page by page when doing probes.  Please say that in the doc.  It's not
> clear to me why the Linux kernel requires this--can you expand on that?

On x86 and x86-64 Linux, you cannot probe below the page that contains the 
current stack pointer, you get a SIGSEGV.  See the discussion with one of 
your coworkers last year:
  http://gcc.gnu.org/ml/gcc-patches/2008-07/msg00869.html

That seems to be unique to these two kernels; there is no problem on other 
OSes or on IA-64.

> And why a target macro rather than a target hook for something new?

For consistency, all the stack checking is parameterized with macros.  I agree 
that this could be changed.

> Letting the frontend set a global variable in the middle end seems to me
> to be an ugly interface.  Why did you change away from the
> set_stack_check_libfunc function?

Because back-ends need to have access to the variable as well.

> Much more seriously, I don't see how this can work when using LTO.  In
> LTO, by the time we expand stack checking, the frontend is gone.  This
> needs to work differently.

OK, let's drop stack checking with run-time support for the time being; this 
only affects VxWorks in practice.

> >  void
> >  probe_stack_range (HOST_WIDE_INT first, rtx size)
> >  {
> > +#ifdef SPARC_STACK_BIAS
> > +  /* The probe offsets are counted negatively whereas the stack bias is
> > +     counted positively.  */
> > +  first -= SPARC_STACK_BIAS;
> > +#endif
>
> This does not look good in target independent code.  This needs to
> become an officially named and documented target hook.

OK, let's drop this hack for now.

> > +#if defined(HAVE_conditional_trap)
> > +      emit_insn (gen_cond_trap (LTU, avail, req, const0_rtx));
> > +#elif defined(HAVE_trap)
>
> It's not enough to check the #ifdef; you also have to check available
> with if ().

OK.  Let's drop it anyway; this is for VxWorks only.

-- 
Eric Botcazou


* Re: [Patch] New -fstack-check implementation (2/n)
  2009-09-02  0:46 ` Ian Lance Taylor
  2009-09-02  7:46   ` Eric Botcazou
@ 2009-09-29 23:25   ` Eric Botcazou
  2009-10-29 17:18     ` Eric Botcazou
  1 sibling, 1 reply; 11+ messages in thread
From: Eric Botcazou @ 2009-09-29 23:25 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 3886 bytes --]

> Much more seriously, I don't see how this can work when using LTO.  In
> LTO, by the time we expand stack checking, the frontend is gone.  This
> needs to work differently.

OK, I've removed all the non-essential bits from the patch and left some 
broken stuff in the compiler as-is.  This implements working stack checking 
for x86/x86-64 on Linux and Solaris only.  Full ACATS passes with -fstack-check 
as well as with -O2 -fstack-check.

Tested on i586-suse-linux and x86_64-suse-linux.


2009-09-29  Eric Botcazou  <ebotcazou@adacore.com>

	PR target/10127
	PR ada/20548
	* expr.h (STACK_CHECK_PROBE_INTERVAL): Delete.
	(STACK_CHECK_PROBE_INTERVAL_EXP): New macro.
	(STACK_CHECK_MOVING_SP): Likewise.
	* system.h (STACK_CHECK_PROBE_INTERVAL): Poison it.
	* doc/tm.texi (Stack Checking): Delete STACK_CHECK_PROBE_INTERVAL.
	Document STACK_CHECK_PROBE_INTERVAL_EXP and STACK_CHECK_MOVING_SP.
	* explow.c (anti_adjust_stack_and_probe): New function.
	(allocate_dynamic_stack_space): Do not directly allocate space if
	STACK_CHECK_MOVING_SP, instead invoke above function.
	(emit_stack_probe): Handle probe_stack insn.
	(PROBE_INTERVAL): New macro.
	(STACK_GROW_OPTAB): Likewise.
	(STACK_HIGH, STACK_LOW): Likewise.
	(probe_stack_range): Remove support code for dedicated pattern.  Fix
	loop condition in the small constant case.  Rewrite in the general
	case to be immune to wrap-around.  Make sure the address of probes
	is valid.  Try to use [base + disp] addressing mode if possible.
	* ira.c (setup_eliminable_regset): Set frame_pointer_needed if stack
	checking is enabled and STACK_CHECK_MOVING_SP.
	* rtlanal.c (may_trap_p_1) <MEM>: If stack checking is enabled,
	return 1 for volatile references to the stack pointer.
	* tree.c (build_common_builtin_nodes): Do not set ECF_NOTHROW on
	__builtin_alloca if stack checking is enabled.
	* unwind-dw2.c (uw_identify_context): Take into account whether the
	context is that of a signal frame or not.
	* config/i386/linux-unwind.h (x86_frob_update_context): New function.
	(MD_FROB_UPDATE_CONTEXT): Define.
	* config/i386/linux.h (STACK_CHECK_STATIC_BUILTIN): Likewise.
	(STACK_CHECK_MOVING_SP): Likewise.
	* config/i386/linux64.h (STACK_CHECK_STATIC_BUILTIN): Likewise.
	(STACK_CHECK_MOVING_SP): Likewise.
	* config/i386/sol2.h (STACK_CHECK_STATIC_BUILTIN): Likewise.
	* config/i386/i386.c (ix86_compute_frame_layout): Force use of push
	instructions to save registers if stack checking with probes is on.
	(get_scratch_register_on_entry): New function.
	(release_scratch_register_on_entry): Likewise.
	(output_probe_op): Likewise.
	(output_adjust_stack_and_probe_op): Likewise.
	(output_adjust_stack_and_probe): Likewise.
	(ix86_gen_adjust_stack_and_probe): Likewise.
	(ix86_adjust_stack_and_probe): Likewise.
	(output_probe_stack_range_op): Likewise.
	(ix86_gen_probe_stack_range): Likewise.
	(ix86_emit_probe_stack_range): Likewise.
	(ix86_expand_prologue): Emit stack checking code if static builtin
	stack checking is enabled.
	* config/i386/i386-protos.h (output_adjust_stack_and_probe): Declare.
	(output_probe_stack_range): Likewise.
	* config/i386/i386.md (UNSPECV_STACK_PROBE_INLINE): New constant.
	(probe_stack): New expander.
	(adjust_stack_and_probe): New insn.
	(probe_stack_range): Likewise.
	(logical operation peepholes): Do not split stack checking probes.
ada/
        * gcc-interface/Makefile.in: Pass GNATLIBCFLAGS_FOR_C to recursive
        invocations.


2009-09-29  Eric Botcazou  <ebotcazou@adacore.com>

        * Makefile.in (GNATLIBCFLAGS_FOR_C): New variable.
        (LIBADA_FLAGS_TO_PASS): Add GNATLIBCFLAGS_FOR_C.
        * configure.ac: Include config/unwind_ipinfo.m4.
        Check for _Unwind_GetIPInfo.
        * configure: Regenerate.


2009-09-29  Eric Botcazou  <ebotcazou@adacore.com>

        * gnat.dg/stack_check1.adb: New test.
        * gnat.dg/stack_check2.adb: Likewise.


-- 
Eric Botcazou

[-- Attachment #2: gcc-45_stack-check-2b.diff --]
[-- Type: text/x-diff, Size: 46348 bytes --]

Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi	(revision 152265)
+++ gcc/doc/tm.texi	(working copy)
@@ -3536,11 +3536,12 @@ like to do static stack checking in some
 approach.  The default value of this macro is zero.
 @end defmac
 
-@defmac STACK_CHECK_PROBE_INTERVAL
-An integer representing the interval at which GCC must generate stack
-probe instructions.  You will normally define this macro to be no larger
-than the size of the ``guard pages'' at the end of a stack area.  The
-default value of 4096 is suitable for most systems.
+@defmac STACK_CHECK_PROBE_INTERVAL_EXP
+An integer specifying the interval at which GCC must generate stack probe
+instructions, defined as 2 raised to this integer.  You will normally
+define this macro so that the interval be no larger than the size of
+the ``guard pages'' at the end of a stack area.  The default value
+of 12 (4096-byte interval) is suitable for most systems.
 @end defmac
 
 @defmac STACK_CHECK_PROBE_LOAD
@@ -3549,6 +3550,15 @@ as a load instruction and zero if GCC sh
 The default is zero, which is the most efficient choice on most systems.
 @end defmac
 
+@defmac STACK_CHECK_MOVING_SP
+An integer which is nonzero if GCC should move the stack pointer page by page
+when doing probes.  This can be necessary on systems where the stack pointer
+contains the bottom address of the memory area accessible to the executing
+thread at any point in time.  In this situation an alternate signal stack
+is required in order to be able to recover from a stack overflow.  The
+default value of this macro is zero.
+@end defmac
+
 @defmac STACK_CHECK_PROTECT
 The number of bytes of stack needed to recover from a stack overflow,
 for languages where such a recovery is supported.  The default value of
Index: gcc/tree.c
===================================================================
--- gcc/tree.c	(revision 152265)
+++ gcc/tree.c	(working copy)
@@ -8938,7 +8938,8 @@ build_common_builtin_nodes (void)
       tmp = tree_cons (NULL_TREE, size_type_node, void_list_node);
       ftype = build_function_type (ptr_type_node, tmp);
       local_define_builtin ("__builtin_alloca", ftype, BUILT_IN_ALLOCA,
-			    "alloca", ECF_NOTHROW | ECF_MALLOC);
+			    "alloca",
+			    ECF_MALLOC | (flag_stack_check ? 0 : ECF_NOTHROW));
     }
 
   tmp = tree_cons (NULL_TREE, ptr_type_node, void_list_node);
Index: gcc/rtlanal.c
===================================================================
--- gcc/rtlanal.c	(revision 152265)
+++ gcc/rtlanal.c	(working copy)
@@ -2252,6 +2252,11 @@ may_trap_p_1 (const_rtx x, unsigned flag
 
       /* Memory ref can trap unless it's a static var or a stack slot.  */
     case MEM:
+      /* Recognize specific pattern of stack checking probes.  */
+      if (flag_stack_check
+	  && MEM_VOLATILE_P (x)
+	  && XEXP (x, 0) == stack_pointer_rtx)
+	return 1;
       if (/* MEM_NOTRAP_P only relates to the actual position of the memory
 	     reference; moving it out of context such as when moving code
 	     when optimizing, might cause its address to become invalid.  */
Index: gcc/expr.h
===================================================================
--- gcc/expr.h	(revision 152265)
+++ gcc/expr.h	(working copy)
@@ -218,9 +218,9 @@ do {								\
 #define STACK_CHECK_STATIC_BUILTIN 0
 #endif
 
-/* The default interval is one page.  */
-#ifndef STACK_CHECK_PROBE_INTERVAL
-#define STACK_CHECK_PROBE_INTERVAL 4096
+/* The default interval is one page (4096 bytes).  */
+#ifndef STACK_CHECK_PROBE_INTERVAL_EXP
+#define STACK_CHECK_PROBE_INTERVAL_EXP 12
 #endif
 
 /* The default is to do a store into the stack.  */
@@ -228,6 +228,11 @@ do {								\
 #define STACK_CHECK_PROBE_LOAD 0
 #endif
 
+/* The default is not to move the stack pointer.  */
+#ifndef STACK_CHECK_MOVING_SP
+#define STACK_CHECK_MOVING_SP 0
+#endif
+
 /* This is a kludge to try to capture the discrepancy between the old
    mechanism (generic stack checking) and the new mechanism (static
    builtin stack checking).  STACK_CHECK_PROTECT needs to be bumped
@@ -252,7 +257,7 @@ do {								\
    one probe per function.  */
 #ifndef STACK_CHECK_MAX_FRAME_SIZE
 #define STACK_CHECK_MAX_FRAME_SIZE \
-  (STACK_CHECK_PROBE_INTERVAL - UNITS_PER_WORD)
+  ((1 << STACK_CHECK_PROBE_INTERVAL_EXP) - UNITS_PER_WORD)
 #endif
 
 /* This is arbitrary, but should be large enough everywhere.  */
Index: gcc/unwind-dw2.c
===================================================================
--- gcc/unwind-dw2.c	(revision 152265)
+++ gcc/unwind-dw2.c	(working copy)
@@ -1559,7 +1559,13 @@ uw_install_context_1 (struct _Unwind_Con
 static inline _Unwind_Ptr
 uw_identify_context (struct _Unwind_Context *context)
 {
-  return _Unwind_GetCFA (context);
+  /* The CFA is not sufficient to disambiguate the context of a function
+     interrupted by a signal before establishing its frame and the context
+     of the signal itself.  */
+  if (STACK_GROWS_DOWNWARD)
+    return _Unwind_GetCFA (context) - _Unwind_IsSignalFrame (context);
+  else
+    return _Unwind_GetCFA (context) + _Unwind_IsSignalFrame (context);
 }
 
 
Index: gcc/ada/gcc-interface/Makefile.in
===================================================================
--- gcc/ada/gcc-interface/Makefile.in	(revision 152265)
+++ gcc/ada/gcc-interface/Makefile.in	(working copy)
@@ -2422,6 +2422,7 @@ gnatlib-shared-default:
 	$(MAKE) $(FLAGS_TO_PASS) \
              GNATLIBFLAGS="$(GNATLIBFLAGS)" \
 	     GNATLIBCFLAGS="$(GNATLIBCFLAGS) $(TARGET_LIBGCC2_CFLAGS)" \
+	     GNATLIBCFLAGS_FOR_C="$(GNATLIBCFLAGS_FOR_C)" \
 	     MULTISUBDIR="$(MULTISUBDIR)" \
 	     THREAD_KIND="$(THREAD_KIND)" \
              gnatlib
@@ -2447,6 +2448,7 @@ gnatlib-shared-dual:
 	$(MAKE) $(FLAGS_TO_PASS) \
              GNATLIBFLAGS="$(GNATLIBFLAGS)" \
 	     GNATLIBCFLAGS="$(GNATLIBCFLAGS)" \
+	     GNATLIBCFLAGS_FOR_C="$(GNATLIBCFLAGS_FOR_C)" \
 	     MULTISUBDIR="$(MULTISUBDIR)" \
 	     THREAD_KIND="$(THREAD_KIND)" \
              gnatlib-shared-default
@@ -2455,6 +2457,7 @@ gnatlib-shared-dual:
 	$(MAKE) $(FLAGS_TO_PASS) \
              GNATLIBFLAGS="$(GNATLIBFLAGS)" \
 	     GNATLIBCFLAGS="$(GNATLIBCFLAGS)" \
+	     GNATLIBCFLAGS_FOR_C="$(GNATLIBCFLAGS_FOR_C)" \
 	     MULTISUBDIR="$(MULTISUBDIR)" \
 	     THREAD_KIND="$(THREAD_KIND)" \
              gnatlib
@@ -2464,6 +2467,7 @@ gnatlib-shared-dual-win32:
 	$(MAKE) $(FLAGS_TO_PASS) \
              GNATLIBFLAGS="$(GNATLIBFLAGS)" \
 	     GNATLIBCFLAGS="$(GNATLIBCFLAGS) $(TARGET_LIBGCC2_CFLAGS)" \
+	     GNATLIBCFLAGS_FOR_C="$(GNATLIBCFLAGS_FOR_C)" \
 	     MULTISUBDIR="$(MULTISUBDIR)" \
 	     THREAD_KIND="$(THREAD_KIND)" \
              gnatlib-shared-win32
@@ -2472,6 +2476,7 @@ gnatlib-shared-dual-win32:
 	$(MAKE) $(FLAGS_TO_PASS) \
              GNATLIBFLAGS="$(GNATLIBFLAGS)" \
 	     GNATLIBCFLAGS="$(GNATLIBCFLAGS)" \
+	     GNATLIBCFLAGS_FOR_C="$(GNATLIBCFLAGS_FOR_C)" \
 	     MULTISUBDIR="$(MULTISUBDIR)" \
 	     THREAD_KIND="$(THREAD_KIND)" \
              gnatlib
@@ -2485,6 +2490,7 @@ gnatlib-shared-win32:
 	$(MAKE) $(FLAGS_TO_PASS) \
              GNATLIBFLAGS="$(GNATLIBFLAGS)" \
 	     GNATLIBCFLAGS="$(GNATLIBCFLAGS) $(TARGET_LIBGCC2_CFLAGS)" \
+	     GNATLIBCFLAGS_FOR_C="$(GNATLIBCFLAGS_FOR_C)" \
 	     MULTISUBDIR="$(MULTISUBDIR)" \
 	     THREAD_KIND="$(THREAD_KIND)" \
              gnatlib
@@ -2503,7 +2509,7 @@ gnatlib-shared-darwin:
 	$(MAKE) $(FLAGS_TO_PASS) \
 	     GNATLIBFLAGS="$(GNATLIBFLAGS)" \
 	     GNATLIBCFLAGS="$(GNATLIBCFLAGS) $(TARGET_LIBGCC2_CFLAGS) \
-	                    -fno-common" \
+	     GNATLIBCFLAGS_FOR_C="$(GNATLIBCFLAGS_FOR_C) -fno-common" \
 	     MULTISUBDIR="$(MULTISUBDIR)" \
 	     THREAD_KIND="$(THREAD_KIND)" \
 	     gnatlib
@@ -2531,6 +2537,7 @@ gnatlib-shared-vms:
 	$(MAKE) $(FLAGS_TO_PASS) \
              GNATLIBFLAGS="$(GNATLIBFLAGS)" \
 	     GNATLIBCFLAGS="$(GNATLIBCFLAGS)" \
+	     GNATLIBCFLAGS_FOR_C="$(GNATLIBCFLAGS_FOR_C)" \
 	     MULTISUBDIR="$(MULTISUBDIR)" \
 	     THREAD_KIND="$(THREAD_KIND)" \
              gnatlib
@@ -2559,6 +2566,7 @@ gnatlib-shared:
 	$(MAKE) $(FLAGS_TO_PASS) \
              GNATLIBFLAGS="$(GNATLIBFLAGS)" \
 	     GNATLIBCFLAGS="$(GNATLIBCFLAGS)" \
+	     GNATLIBCFLAGS_FOR_C="$(GNATLIBCFLAGS_FOR_C)" \
 	     MULTISUBDIR="$(MULTISUBDIR)" \
 	     THREAD_KIND="$(THREAD_KIND)" \
 	     TARGET_LIBGCC2_CFLAGS="$(TARGET_LIBGCC2_CFLAGS)" \
@@ -2572,6 +2580,7 @@ gnatlib-sjlj:
 	     EH_MECHANISM="" \
 	     GNATLIBFLAGS="$(GNATLIBFLAGS)" \
 	     GNATLIBCFLAGS="$(GNATLIBCFLAGS)" \
+	     GNATLIBCFLAGS_FOR_C="$(GNATLIBCFLAGS_FOR_C)" \
 	     MULTISUBDIR="$(MULTISUBDIR)" \
 	     THREAD_KIND="$(THREAD_KIND)" \
 	     TARGET_LIBGCC2_CFLAGS="$(TARGET_LIBGCC2_CFLAGS)" gnatlib
@@ -2584,6 +2593,7 @@ gnatlib-zcx:
 	     EH_MECHANISM="-gcc" \
 	     GNATLIBFLAGS="$(GNATLIBFLAGS)" \
 	     GNATLIBCFLAGS="$(GNATLIBCFLAGS)" \
+	     GNATLIBCFLAGS_FOR_C="$(GNATLIBCFLAGS_FOR_C)" \
 	     MULTISUBDIR="$(MULTISUBDIR)" \
 	     THREAD_KIND="$(THREAD_KIND)" \
 	     TARGET_LIBGCC2_CFLAGS="$(TARGET_LIBGCC2_CFLAGS)" gnatlib
Index: gcc/explow.c
===================================================================
--- gcc/explow.c	(revision 152265)
+++ gcc/explow.c	(working copy)
@@ -43,6 +43,7 @@ along with GCC; see the file COPYING3.
 
 static rtx break_out_memory_refs (rtx);
 static void emit_stack_probe (rtx);
+static void anti_adjust_stack_and_probe (rtx);
 
 
 /* Truncate and perhaps sign-extend C as appropriate for MODE.  */
@@ -1227,7 +1228,9 @@ allocate_dynamic_stack_space (rtx size,
 
   /* If needed, check that we have the required amount of stack.
      Take into account what has already been checked.  */
-  if (flag_stack_check == GENERIC_STACK_CHECK)
+  if (STACK_CHECK_MOVING_SP)
+    ;
+  else if (flag_stack_check == GENERIC_STACK_CHECK)
     probe_stack_range (STACK_OLD_CHECK_PROTECT + STACK_CHECK_MAX_FRAME_SIZE,
 		       size);
   else if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK)
@@ -1296,7 +1299,10 @@ allocate_dynamic_stack_space (rtx size,
 	  emit_label (space_available);
 	}
 
-      anti_adjust_stack (size);
+      if (flag_stack_check && STACK_CHECK_MOVING_SP)
+	anti_adjust_stack_and_probe (size);
+      else
+	anti_adjust_stack (size);
 
 #ifdef STACK_GROWS_DOWNWARD
       emit_move_insn (target, virtual_stack_dynamic_rtx);
@@ -1347,6 +1353,12 @@ emit_stack_probe (rtx address)
 
   MEM_VOLATILE_P (memref) = 1;
 
+  /* See if we have an insn to probe the stack.  */
+#ifdef HAVE_probe_stack
+  if (HAVE_probe_stack)
+    emit_insn (gen_probe_stack (memref));
+  else
+#endif
   if (STACK_CHECK_PROBE_LOAD)
     emit_move_insn (gen_reg_rtx (word_mode), memref);
   else
@@ -1359,10 +1371,18 @@ emit_stack_probe (rtx address)
    subtract from the stack.  If SIZE is constant, this is done
    with a fixed number of probes.  Otherwise, we must make a loop.  */
 
+#define PROBE_INTERVAL (1 << STACK_CHECK_PROBE_INTERVAL_EXP)
+
 #ifdef STACK_GROWS_DOWNWARD
-#define STACK_GROW_OP MINUS
+#define STACK_GROW_OP     MINUS
+#define STACK_GROW_OPTAB  sub_optab
+#define STACK_HIGH(high,low)  low
+#define STACK_LOW(high,low)   high
 #else
-#define STACK_GROW_OP PLUS
+#define STACK_GROW_OP     PLUS
+#define STACK_GROW_OPTAB  add_optab
+#define STACK_HIGH(high,low)  high
+#define STACK_LOW(high,low)   low
 #endif
 
 void
@@ -1386,99 +1406,252 @@ probe_stack_range (HOST_WIDE_INT first,
 			 ptr_mode);
     }
 
-  /* Next see if we have an insn to check the stack.  Use it if so.  */
-#ifdef HAVE_check_stack
-  else if (HAVE_check_stack)
+  /* Otherwise we have to generate explicit probes.  If we have a constant
+     small number of them to generate, that's the easy case.  */
+  else if (CONST_INT_P (size) && INTVAL (size) < 7 * PROBE_INTERVAL)
+    {
+      HOST_WIDE_INT i, offset, size_int = INTVAL (size);
+      rtx addr;
+
+      /* Probe at FIRST + N * PROBE_INTERVAL for values of N from 1 until
+	 it exceeds SIZE.  If only one probe is needed, this will not
+	 generate any code.  Then probe at FIRST + SIZE.  */
+      for (i = PROBE_INTERVAL; i < size_int; i += PROBE_INTERVAL)
+	{
+	  offset = first + i;
+#ifdef STACK_GROWS_DOWNWARD
+	  offset = -offset;
+#endif
+	  addr = memory_address (Pmode,
+				 plus_constant (stack_pointer_rtx, offset));
+	  emit_stack_probe (addr);
+	}
+
+      offset = first + size_int;
+#ifdef STACK_GROWS_DOWNWARD
+      offset = -offset;
+#endif
+      addr = memory_address (Pmode, plus_constant (stack_pointer_rtx, offset));
+      emit_stack_probe (addr);
+    }
+
+  /* In the variable case, do the same as above, but in a loop.  Note that we
+     must be extra careful with variables wrapping around because we might be
+     at the very top (or the very bottom) of the address space and we have to
+     be able to handle this case properly; in particular, we use an equality
+     test for the loop condition.  */
+  else
     {
-      insn_operand_predicate_fn pred;
-      rtx last_addr
-	= force_operand (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode,
-					 stack_pointer_rtx,
-					 plus_constant (size, first)),
-			 NULL_RTX);
+      rtx rounded_size, rounded_size_op, test_addr, last_addr, temp;
+      rtx loop_lab = gen_label_rtx ();
+      rtx end_lab = gen_label_rtx ();
 
-      pred = insn_data[(int) CODE_FOR_check_stack].operand[0].predicate;
-      if (pred && ! ((*pred) (last_addr, Pmode)))
-	last_addr = copy_to_mode_reg (Pmode, last_addr);
+      /* Step 1: round SIZE to the previous multiple of the interval.  */
 
-      emit_insn (gen_check_stack (last_addr));
-    }
+      /* ROUNDED_SIZE = SIZE & -PROBE_INTERVAL  */
+      rounded_size = simplify_gen_binary (AND, Pmode,
+					  size,
+					  GEN_INT (-PROBE_INTERVAL));
+      rounded_size_op = force_operand (rounded_size, NULL_RTX);
+
+
+      /* Step 2: compute initial and final value of the loop counter.  */
+
+      /* TEST_ADDR = SP + FIRST.  */
+      test_addr = force_operand (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode,
+					 	 stack_pointer_rtx,
+					 	 GEN_INT (first)),
+				 NULL_RTX);
+
+      /* LAST_ADDR = SP + FIRST + ROUNDED_SIZE.  */
+      last_addr = force_operand (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode,
+						 test_addr,
+						 rounded_size_op),
+				 NULL_RTX);
+
+
+      /* Step 3: the loop
+
+	 while (TEST_ADDR != LAST_ADDR)
+	   {
+	     TEST_ADDR = TEST_ADDR + PROBE_INTERVAL
+	     probe at TEST_ADDR
+	   }
+
+	 probes at FIRST + N * PROBE_INTERVAL for values of N from 1
+	 until it is equal to ROUNDED_SIZE.  */
+
+      emit_label (loop_lab);
+
+      /* Jump to END_LAB if TEST_ADDR == LAST_ADDR.  */
+      emit_cmp_and_jump_insns (test_addr, last_addr, EQ,
+			       NULL_RTX, Pmode, 1, end_lab);
+
+      /* TEST_ADDR = TEST_ADDR + PROBE_INTERVAL.  */
+      temp = expand_binop (Pmode, STACK_GROW_OPTAB, test_addr,
+			   GEN_INT (PROBE_INTERVAL), test_addr,
+			   1, OPTAB_WIDEN);
+
+      gcc_assert (temp == test_addr);
+
+      /* Probe at TEST_ADDR.  */
+      emit_stack_probe (test_addr);
+
+      emit_jump (loop_lab);
+
+      emit_label (end_lab);
+
+      /* Step 4: probe at FIRST + SIZE if we cannot assert at compile-time
+	 that SIZE is equal to ROUNDED_SIZE.  */
+
+      /* TEMP = SIZE - ROUNDED_SIZE.  */
+      temp = simplify_gen_binary (MINUS, Pmode, size, rounded_size);
+      if (temp != const0_rtx)
+	{
+	  rtx addr;
+
+	  if (GET_CODE (temp) == CONST_INT)
+	    {
+	      /* Use [base + disp] addressing mode if supported.  */
+	      HOST_WIDE_INT offset = INTVAL (temp);
+#ifdef STACK_GROWS_DOWNWARD
+	      offset = -offset;
 #endif
+	      addr = memory_address (Pmode, plus_constant (last_addr, offset));
+	    }
+	  else
+	    {
+	      /* Manual CSE if the difference is not known at compile-time.  */
+	      temp = gen_rtx_MINUS (Pmode, size, rounded_size_op);
+	      addr = memory_address (Pmode,
+				     gen_rtx_fmt_ee (STACK_GROW_OP, Pmode,
+						     last_addr, temp));
+	    }
 
-  /* If we have to generate explicit probes, see if we have a constant
-     small number of them to generate.  If so, that's the easy case.  */
-  else if (CONST_INT_P (size)
-	   && INTVAL (size) < 10 * STACK_CHECK_PROBE_INTERVAL)
-    {
-      HOST_WIDE_INT offset;
+	  emit_stack_probe (addr);
+	}
+    }
+}
+
+/* Adjust the stack by SIZE bytes while probing it.  Note that we skip
+   the probe for the first interval and instead probe one interval past
+   the specified size in order to maintain a protection area.  */
+
+static void
+anti_adjust_stack_and_probe (rtx size)
+{
+  rtx probe_interval = GEN_INT (PROBE_INTERVAL);
 
-      /* Start probing at FIRST + N * STACK_CHECK_PROBE_INTERVAL
-	 for values of N from 1 until it exceeds LAST.  If only one
-	 probe is needed, this will not generate any code.  Then probe
-	 at LAST.  */
-      for (offset = first + STACK_CHECK_PROBE_INTERVAL;
-	   offset < INTVAL (size);
-	   offset = offset + STACK_CHECK_PROBE_INTERVAL)
-	emit_stack_probe (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode,
-					  stack_pointer_rtx,
-					  GEN_INT (offset)));
+  /* First ensure SIZE is Pmode.  */
+  if (GET_MODE (size) != VOIDmode && GET_MODE (size) != Pmode)
+    size = convert_to_mode (Pmode, size, 1);
 
-      emit_stack_probe (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode,
-					stack_pointer_rtx,
-					plus_constant (size, first)));
+  /* If we have a constant small number of probes to generate, that's the
+     easy case.  */
+  if (GET_CODE (size) == CONST_INT && INTVAL (size) < 7 * PROBE_INTERVAL)
+    {
+      HOST_WIDE_INT i, int_size = INTVAL (size);
+      bool first_probe = true;
+
+      /* Adjust SP and probe to PROBE_INTERVAL + N * PROBE_INTERVAL for
+	 values of N from 1 until it exceeds SIZE.  If only one probe is
+	 needed, this will not generate any code.  Then adjust and probe
+	 to PROBE_INTERVAL + SIZE.  */
+      for (i = PROBE_INTERVAL; i < int_size; i += PROBE_INTERVAL)
+	{
+	  if (first_probe)
+	    {
+	      anti_adjust_stack (GEN_INT (2 * PROBE_INTERVAL));
+	      first_probe = false;
+	    }
+	  else
+	    anti_adjust_stack (probe_interval);
+	  emit_stack_probe (stack_pointer_rtx);
+	}
+
+      if (first_probe)
+	anti_adjust_stack (plus_constant (size, PROBE_INTERVAL));
+      else
+	anti_adjust_stack (plus_constant (size, PROBE_INTERVAL - i));
+      emit_stack_probe (stack_pointer_rtx);
     }
 
-  /* In the variable case, do the same as above, but in a loop.  We emit loop
-     notes so that loop optimization can be done.  */
+  /* In the variable case, do the same as above, but in a loop.  Note that we
+     must be extra careful with variables wrapping around because we might be
+     at the very top (or the very bottom) of the address space and we have to
+     be able to handle this case properly; in particular, we use an equality
+     test for the loop condition.  */
   else
     {
-      rtx test_addr
-	= force_operand (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode,
-					 stack_pointer_rtx,
-					 GEN_INT (first + STACK_CHECK_PROBE_INTERVAL)),
-			 NULL_RTX);
-      rtx last_addr
-	= force_operand (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode,
-					 stack_pointer_rtx,
-					 plus_constant (size, first)),
-			 NULL_RTX);
-      rtx incr = GEN_INT (STACK_CHECK_PROBE_INTERVAL);
+      rtx rounded_size, rounded_size_op, last_addr, temp;
       rtx loop_lab = gen_label_rtx ();
-      rtx test_lab = gen_label_rtx ();
       rtx end_lab = gen_label_rtx ();
-      rtx temp;
 
-      if (!REG_P (test_addr)
-	  || REGNO (test_addr) < FIRST_PSEUDO_REGISTER)
-	test_addr = force_reg (Pmode, test_addr);
+      /* Step 1: round SIZE to the previous multiple of the interval.  */
+
+      /* ROUNDED_SIZE = SIZE & -PROBE_INTERVAL  */
+      rounded_size = simplify_gen_binary (AND, Pmode,
+					  size,
+					  GEN_INT (-PROBE_INTERVAL));
+      rounded_size_op = force_operand (rounded_size, NULL_RTX);
 
-      emit_jump (test_lab);
+
+      /* Step 2: compute initial and final value of the loop counter.  */
+
+      /* SP = SP_0 + PROBE_INTERVAL.  */
+      anti_adjust_stack (probe_interval);
+
+      /* LAST_ADDR = SP_0 + PROBE_INTERVAL + ROUNDED_SIZE.  */
+      last_addr = force_operand (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode,
+						 stack_pointer_rtx,
+						 rounded_size_op),
+				 NULL_RTX);
+
+
+      /* Step 3: the loop
+
+	  while (SP != LAST_ADDR)
+	    {
+	      SP = SP + PROBE_INTERVAL
+	      probe at SP
+	    }
+
+	 adjusts SP and probes to PROBE_INTERVAL + N * PROBE_INTERVAL for
+	 values of N from 1 until it is equal to ROUNDED_SIZE.  */
 
       emit_label (loop_lab);
-      emit_stack_probe (test_addr);
 
-#ifdef STACK_GROWS_DOWNWARD
-#define CMP_OPCODE GTU
-      temp = expand_binop (Pmode, sub_optab, test_addr, incr, test_addr,
-			   1, OPTAB_WIDEN);
-#else
-#define CMP_OPCODE LTU
-      temp = expand_binop (Pmode, add_optab, test_addr, incr, test_addr,
-			   1, OPTAB_WIDEN);
-#endif
+      /* Jump to END_LAB if SP == LAST_ADDR.  */
+      emit_cmp_and_jump_insns (stack_pointer_rtx, last_addr, EQ,
+			       NULL_RTX, Pmode, 1, end_lab);
+
+      /* SP = SP + PROBE_INTERVAL and probe at SP.  */
+      anti_adjust_stack (probe_interval);
+      emit_stack_probe (stack_pointer_rtx);
 
-      gcc_assert (temp == test_addr);
+      emit_jump (loop_lab);
 
-      emit_label (test_lab);
-      emit_cmp_and_jump_insns (test_addr, last_addr, CMP_OPCODE,
-			       NULL_RTX, Pmode, 1, loop_lab);
-      emit_jump (end_lab);
       emit_label (end_lab);
 
-      emit_stack_probe (last_addr);
+      /* Step 4: adjust SP and probe to PROBE_INTERVAL + SIZE if we cannot
+	 assert at compile-time that SIZE is equal to ROUNDED_SIZE.  */
+
+      /* TEMP = SIZE - ROUNDED_SIZE.  */
+      temp = simplify_gen_binary (MINUS, Pmode, size, rounded_size);
+      if (temp != const0_rtx)
+	{
+	  /* Manual CSE if the difference is not known at compile-time.  */
+	  if (GET_CODE (temp) != CONST_INT)
+	    temp = gen_rtx_MINUS (Pmode, size, rounded_size_op);
+	  anti_adjust_stack (temp);
+	  emit_stack_probe (stack_pointer_rtx);
+	}
     }
+
+  /* Adjust back to account for the additional first interval.  */
+  adjust_stack (probe_interval);
 }
-\f
+
 /* Return an rtx representing the register or memory location
    in which a scalar value of data type VALTYPE
    was returned by a function call to function FUNC.
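[Editorial note: the wraparound-safe loop in the variable case of probe_stack_range can be modeled in plain C.  This is a sketch only — PROBE_INTERVAL is hard-coded to the 4096-byte default and the probe itself is elided; what it demonstrates is the equality-based loop condition and the residual probe of Step 4.]

```c
#include <assert.h>
#include <stdint.h>

#define PROBE_INTERVAL 4096  /* 1 << STACK_CHECK_PROBE_INTERVAL_EXP */

/* Count the probes the variable-size loop emits for SIZE bytes.  The
   loop condition is an equality test, not an ordered comparison, so
   the logic stays correct even when the probed addresses wrap around
   the top or bottom of the address space.  */
static unsigned int
count_probes (uintptr_t size)
{
  /* Step 1: round SIZE down to a multiple of the interval.  */
  uintptr_t rounded_size = size & -(uintptr_t) PROBE_INTERVAL;
  uintptr_t done = 0;
  unsigned int probes = 0;

  /* Step 3: probe every PROBE_INTERVAL bytes up to ROUNDED_SIZE.  */
  while (done != rounded_size)
    {
      done += PROBE_INTERVAL;
      probes++;
    }

  /* Step 4: one residual probe if SIZE was not a multiple.  */
  if (size != rounded_size)
    probes++;

  return probes;
}
```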
Index: gcc/ira.c
===================================================================
--- gcc/ira.c	(revision 152265)
+++ gcc/ira.c	(working copy)
@@ -1442,6 +1442,9 @@ ira_setup_eliminable_regset (void)
   int need_fp
     = (! flag_omit_frame_pointer
        || (cfun->calls_alloca && EXIT_IGNORE_STACK)
+       /* We need the frame pointer to catch stack overflow exceptions
+	  if the stack pointer is moving.  */
+       || (flag_stack_check && STACK_CHECK_MOVING_SP)
        || crtl->accesses_prior_frames
        || crtl->stack_realign_needed
        || targetm.frame_pointer_required ());
Index: gcc/system.h
===================================================================
--- gcc/system.h	(revision 152265)
+++ gcc/system.h	(working copy)
@@ -761,7 +761,7 @@ extern void fancy_abort (const char *, i
 	TARGET_ASM_EXCEPTION_SECTION TARGET_ASM_EH_FRAME_SECTION	   \
 	SMALL_ARG_MAX ASM_OUTPUT_SHARED_BSS ASM_OUTPUT_SHARED_COMMON	   \
 	ASM_OUTPUT_SHARED_LOCAL UNALIGNED_WORD_ASM_OP			   \
-	ASM_MAKE_LABEL_LINKONCE
+	ASM_MAKE_LABEL_LINKONCE STACK_CHECK_PROBE_INTERVAL
 
 /* Hooks that are no longer used.  */
  #pragma GCC poison LANG_HOOKS_FUNCTION_MARK LANG_HOOKS_FUNCTION_FREE	\
Index: gcc/config/i386/linux.h
===================================================================
--- gcc/config/i386/linux.h	(revision 152265)
+++ gcc/config/i386/linux.h	(working copy)
@@ -207,6 +207,12 @@ along with GCC; see the file COPYING3.
 
 #define MD_UNWIND_SUPPORT "config/i386/linux-unwind.h"
 
+/* Static stack checking is supported by means of probes.  */
+#define STACK_CHECK_STATIC_BUILTIN 1
+
+/* The stack pointer needs to be moved while checking the stack.  */
+#define STACK_CHECK_MOVING_SP 1
+
 /* This macro may be overridden in i386/k*bsd-gnu.h.  */
 #define REG_NAME(reg) reg
 
Index: gcc/config/i386/i386.md
===================================================================
--- gcc/config/i386/i386.md	(revision 152265)
+++ gcc/config/i386/i386.md	(working copy)
@@ -219,15 +219,16 @@ (define_constants
 (define_constants
   [(UNSPECV_BLOCKAGE		0)
    (UNSPECV_STACK_PROBE		1)
-   (UNSPECV_EMMS		2)
-   (UNSPECV_LDMXCSR		3)
-   (UNSPECV_STMXCSR		4)
-   (UNSPECV_FEMMS		5)
-   (UNSPECV_CLFLUSH		6)
-   (UNSPECV_ALIGN		7)
-   (UNSPECV_MONITOR		8)
-   (UNSPECV_MWAIT		9)
-   (UNSPECV_CMPXCHG		10)
+   (UNSPECV_STACK_PROBE_INLINE  2)
+   (UNSPECV_EMMS		3)
+   (UNSPECV_LDMXCSR		4)
+   (UNSPECV_STMXCSR		5)
+   (UNSPECV_FEMMS		6)
+   (UNSPECV_CLFLUSH		7)
+   (UNSPECV_ALIGN		8)
+   (UNSPECV_MONITOR		9)
+   (UNSPECV_MWAIT		10)
+   (UNSPECV_CMPXCHG		11)
    (UNSPECV_XCHG		12)
    (UNSPECV_LOCK		13)
    (UNSPECV_PROLOGUE_USE	14)
@@ -20914,6 +20915,38 @@ (define_expand "allocate_stack"
   DONE;
 })
 
+(define_expand "probe_stack"
+  [(match_operand 0 "memory_operand" "")]
+  ""
+{
+  if (GET_MODE (operands[0]) == DImode)
+    emit_insn (gen_iordi3 (operands[0], operands[0], const0_rtx));
+  else
+    emit_insn (gen_iorsi3 (operands[0], operands[0], const0_rtx));
+  DONE;
+})
+
+(define_insn "adjust_stack_and_probe<P:mode>"
+  [(unspec_volatile:P [(match_operand:P 0 "const_int_operand" "n")]
+    UNSPECV_STACK_PROBE_INLINE)
+   (set (reg:P SP_REG) (minus:P (reg:P SP_REG) (match_dup 0)))
+   (clobber (match_operand:P 1 "general_operand" "=rn"))
+   (clobber (reg:CC FLAGS_REG))
+   (clobber (mem:BLK (scratch)))]
+  ""
+  "* return output_adjust_stack_and_probe (operands[0], operands[1]);"
+  [(set_attr "type" "multi")])
+
+(define_insn "probe_stack_range<P:mode>"
+  [(unspec_volatile:P [(match_operand:P 0 "const_int_operand" "n")
+		       (match_operand:P 1 "const_int_operand" "n")]
+    UNSPECV_STACK_PROBE_INLINE)
+   (clobber (match_operand:P 2 "general_operand" "=rn"))
+   (clobber (reg:CC FLAGS_REG))]
+  ""
+  "* return output_probe_stack_range (operands[0], operands[1], operands[2]);"
+  [(set_attr "type" "multi")])
+
 (define_expand "builtin_setjmp_receiver"
   [(label_ref (match_operand 0 "" ""))]
   "!TARGET_64BIT && flag_pic"
@@ -21417,7 +21450,9 @@ (define_peephole2
                      [(match_dup 0)
                       (match_operand:SI 1 "nonmemory_operand" "")]))
               (clobber (reg:CC FLAGS_REG))])]
-  "optimize_insn_for_speed_p () && ! TARGET_READ_MODIFY_WRITE"
+  "optimize_insn_for_speed_p () && ! TARGET_READ_MODIFY_WRITE
+   /* Do not split stack checking probes.  */
+   && GET_CODE (operands[3]) != IOR && operands[1] != const0_rtx"
   [(set (match_dup 2) (match_dup 0))
    (parallel [(set (match_dup 2)
                    (match_op_dup 3 [(match_dup 2) (match_dup 1)]))
@@ -21432,7 +21467,9 @@ (define_peephole2
                      [(match_operand:SI 1 "nonmemory_operand" "")
                       (match_dup 0)]))
               (clobber (reg:CC FLAGS_REG))])]
-  "optimize_insn_for_speed_p () && ! TARGET_READ_MODIFY_WRITE"
+  "optimize_insn_for_speed_p () && ! TARGET_READ_MODIFY_WRITE
+   /* Do not split stack checking probes.  */
+   && GET_CODE (operands[3]) != IOR && operands[1] != const0_rtx"
   [(set (match_dup 2) (match_dup 0))
    (parallel [(set (match_dup 2)
                    (match_op_dup 3 [(match_dup 1) (match_dup 2)]))
Index: gcc/config/i386/sol2.h
===================================================================
--- gcc/config/i386/sol2.h	(revision 152265)
+++ gcc/config/i386/sol2.h	(working copy)
@@ -113,6 +113,9 @@ along with GCC; see the file COPYING3.
 #undef X86_FILE_START_VERSION_DIRECTIVE
 #define X86_FILE_START_VERSION_DIRECTIVE false
 
+/* Static stack checking is supported by means of probes.  */
+#define STACK_CHECK_STATIC_BUILTIN 1
+
 /* Only recent versions of Solaris 11 ld properly support hidden .gnu.linkonce
    sections, so don't use them.  */
 #ifndef TARGET_GNU_LD
Index: gcc/config/i386/linux64.h
===================================================================
--- gcc/config/i386/linux64.h	(revision 152265)
+++ gcc/config/i386/linux64.h	(working copy)
@@ -110,6 +110,12 @@ see the files COPYING3 and COPYING.RUNTI
 
 #define MD_UNWIND_SUPPORT "config/i386/linux-unwind.h"
 
+/* Static stack checking is supported by means of probes.  */
+#define STACK_CHECK_STATIC_BUILTIN 1
+
+/* The stack pointer needs to be moved while checking the stack.  */
+#define STACK_CHECK_MOVING_SP 1
+
 /* This macro may be overridden in i386/k*bsd-gnu.h.  */
 #define REG_NAME(reg) reg
 
Index: gcc/config/i386/linux-unwind.h
===================================================================
--- gcc/config/i386/linux-unwind.h	(revision 152265)
+++ gcc/config/i386/linux-unwind.h	(working copy)
@@ -172,6 +172,25 @@ x86_fallback_frame_state (struct _Unwind
   fs->signal_frame = 1;
   return _URC_NO_REASON;
 }
+
+#define MD_FROB_UPDATE_CONTEXT x86_frob_update_context
+
+/* Fix up for kernels that have vDSO, but don't have S flag in it.  */
+
+static void
+x86_frob_update_context (struct _Unwind_Context *context,
+			 _Unwind_FrameState *fs ATTRIBUTE_UNUSED)
+{
+  unsigned char *pc = context->ra;
+
+  /* movl $__NR_rt_sigreturn,%eax ; {int $0x80 | syscall}  */
+  if (*(unsigned char *)(pc+0) == 0xb8
+      && *(unsigned int *)(pc+1) == 173
+      && (*(unsigned short *)(pc+5) == 0x80cd
+	  || *(unsigned short *)(pc+5) == 0x050f))
+    _Unwind_SetSignalFrame (context, 1);
+}
+
 #endif /* not glibc 2.0 */
 #endif /* ifdef __x86_64__  */
 #endif /* ifdef inhibit_libc  */
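[Editorial note: the magic numbers in x86_frob_update_context decode as `movl $__NR_rt_sigreturn, %eax` — opcode 0xb8 followed by the 32-bit immediate 173, the i386 rt_sigreturn syscall number — and then either `int $0x80` (cd 80) or `syscall` (0f 05).  A sketch of the same matcher over a byte buffer, assuming a little-endian host as the code above does implicitly via its unaligned int/short loads:]

```c
#include <assert.h>
#include <string.h>

/* Return nonzero if PC points at the sigreturn trampoline pattern
   recognized by x86_frob_update_context: 0xb8 (movl $imm32, %eax),
   immediate 173 (__NR_rt_sigreturn on i386), then "int $0x80" or
   "syscall".  Little-endian host assumed.  */
static int
is_sigreturn_trampoline (const unsigned char *pc)
{
  unsigned int imm;
  unsigned short tail;

  if (pc[0] != 0xb8)
    return 0;
  memcpy (&imm, pc + 1, sizeof imm);
  memcpy (&tail, pc + 5, sizeof tail);
  return imm == 173 && (tail == 0x80cd || tail == 0x050f);
}
```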
Index: gcc/config/i386/i386-protos.h
===================================================================
--- gcc/config/i386/i386-protos.h	(revision 152265)
+++ gcc/config/i386/i386-protos.h	(working copy)
@@ -69,6 +69,8 @@ extern const char *output_387_binary_op
 extern const char *output_387_reg_move (rtx, rtx*);
 extern const char *output_fix_trunc (rtx, rtx*, int);
 extern const char *output_fp_compare (rtx, rtx*, int, int);
+extern const char *output_adjust_stack_and_probe (rtx, rtx);
+extern const char *output_probe_stack_range (rtx, rtx, rtx);
 
 extern void ix86_expand_clear (rtx);
 extern void ix86_expand_move (enum machine_mode, rtx[]);
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c	(revision 152265)
+++ gcc/config/i386/i386.c	(working copy)
@@ -7854,6 +7854,11 @@ ix86_compute_frame_layout (struct ix86_f
   else
     frame->save_regs_using_mov = false;
 
+  /* If static stack checking is enabled and done with probes, the registers
+     need to be saved before allocating the frame.  */
+  if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK)
+    frame->save_regs_using_mov = false;
+
   /* Skip return address.  */
   offset = UNITS_PER_WORD;
 
@@ -8232,6 +8237,408 @@ ix86_internal_arg_pointer (void)
   return virtual_incoming_args_rtx;
 }
 
+struct scratch_reg {
+  rtx reg;
+  bool saved;
+};
+
+/* Return a short-lived scratch register for use on function entry.
+   In 32-bit mode, it is valid only after the registers are saved
+   in the prologue.  This register must be released by means of
+   release_scratch_register_on_entry once it is dead.  */
+
+static void
+get_scratch_register_on_entry (struct scratch_reg *sr)
+{
+  int regno;
+
+  sr->saved = false;
+
+  if (TARGET_64BIT)
+    regno = FIRST_REX_INT_REG + 3; /* r11 */
+  else
+    {
+      tree decl = current_function_decl, fntype = TREE_TYPE (decl);
+      bool fastcall_p
+	= lookup_attribute ("fastcall", TYPE_ATTRIBUTES (fntype)) != NULL_TREE;
+      int regparm = ix86_function_regparm (fntype, decl);
+      int drap_regno
+	= crtl->drap_reg ? REGNO (crtl->drap_reg) : INVALID_REGNUM;
+
+      /* 'fastcall' sets regparm to 2 and uses ecx+edx.  */
+      if ((regparm < 1 || fastcall_p) && drap_regno != 0)
+	regno = 0;
+      else if (regparm < 2 && drap_regno != 1)
+	regno = 1;
+      else if (regparm < 3 && !fastcall_p && drap_regno != 2
+	       /* ecx is the static chain register.  */
+	       && !DECL_STATIC_CHAIN (decl))
+	regno = 2;
+      else if (ix86_save_reg (3, true))
+	regno = 3;
+      else if (ix86_save_reg (4, true))
+	regno = 4;
+      else if (ix86_save_reg (5, true))
+	regno = 5;
+      else
+	{
+	  regno = (drap_regno == 0 ? 1 : 0);
+	  sr->saved = true;
+	}
+    }
+
+  sr->reg = gen_rtx_REG (Pmode, regno);
+  if (sr->saved)
+    {
+      rtx insn = emit_insn (gen_push (sr->reg));
+      RTX_FRAME_RELATED_P (insn) = 1;
+    }
+}
+
+/* Release a scratch register obtained from the preceding function.  */
+
+static void
+release_scratch_register_on_entry (struct scratch_reg *sr)
+{
+  if (sr->saved)
+    {
+      rtx insn, x;
+
+      if (TARGET_64BIT)
+	insn = emit_insn (gen_popdi1 (sr->reg));
+      else
+	insn = emit_insn (gen_popsi1 (sr->reg));
+      RTX_FRAME_RELATED_P (insn) = 1;
+
+      /* The RTX_FRAME_RELATED_P mechanism doesn't know about pop.  */
+      x = plus_constant (stack_pointer_rtx, UNITS_PER_WORD);
+      x = gen_rtx_SET (VOIDmode, stack_pointer_rtx, x);
+      add_reg_note (insn, REG_CFA_ADJUST_CFA, x);
+    }
+}
+
+/* The run-time loop is made up of 8 insns in the generic case while this
+   compile-time sequence generates one insn per interval.  */
+#define PROBE_INTERVAL (1 << STACK_CHECK_PROBE_INTERVAL_EXP)
+#define SMALL_INTERVAL(size) ((size) <= 8 * PROBE_INTERVAL)
+
+/* Output one probe.  */
+
+static inline void
+output_probe_op (void)
+{
+  fputs (TARGET_64BIT ? "\torq\t$0, " : "\torl\t$0, ", asm_out_file);
+}
+
+/* Adjust the stack by SIZE bytes and output one probe.  */
+
+static void
+output_adjust_stack_and_probe_op (HOST_WIDE_INT size)
+{
+  fprintf (asm_out_file, "\tsub\t$"HOST_WIDE_INT_PRINT_DEC",", size);
+  print_reg (stack_pointer_rtx, 0, asm_out_file);
+  fputc ('\n', asm_out_file);
+  output_probe_op ();
+  fputc ('(', asm_out_file);
+  print_reg (stack_pointer_rtx, 0, asm_out_file);
+  fputs (")\n", asm_out_file);
+}
+
+/* Adjust the stack by SIZE bytes while probing it.  Note that we skip
+   the probe for the first interval and instead probe one interval past
+   the specified size in order to maintain a protection area.  */
+
+const char *
+output_adjust_stack_and_probe (rtx size_rtx, rtx reg)
+{
+  static int labelno = 0;
+  HOST_WIDE_INT size = INTVAL (size_rtx);
+  HOST_WIDE_INT rounded_size;
+  char loop_lab[32], end_lab[32];
+
+  /* See if we have a constant small number of probes to generate.  If so,
+     that's the easy case.  */
+  if (SMALL_INTERVAL (size))
+    {
+      HOST_WIDE_INT i;
+      bool first_probe = true;
+
+      /* Adjust SP and probe to PROBE_INTERVAL + N * PROBE_INTERVAL for
+	 values of N from 1 until it exceeds SIZE.  If only one probe is
+	 needed, this will not generate any code.  Then adjust and probe
+	 to PROBE_INTERVAL + SIZE.  */
+      for (i = PROBE_INTERVAL; i < size; i += PROBE_INTERVAL)
+	{
+	  if (first_probe)
+	    {
+	      output_adjust_stack_and_probe_op (2 * PROBE_INTERVAL);
+	      first_probe = false;
+	    }
+	  else
+	    output_adjust_stack_and_probe_op (PROBE_INTERVAL);
+	}
+
+      if (first_probe)
+	output_adjust_stack_and_probe_op (size + PROBE_INTERVAL);
+      else
+	output_adjust_stack_and_probe_op (size + PROBE_INTERVAL - i);
+    }
+
+  /* In the variable case, do the same as above, but in a loop.  Note that we
+     must be extra careful with variables wrapping around because we might be
+     at the very top (or the very bottom) of the address space and we have to
+     be able to handle this case properly; in particular, we use an equality
+     test for the loop condition.  */
+  else
+    {
+      /* Step 1: round SIZE to the previous multiple of the interval.  */
+      rounded_size = size & -PROBE_INTERVAL;
+
+
+      /* Step 2: compute initial and final value of the loop counter.  */
+
+      /* SP = SP_0 + PROBE_INTERVAL.  */
+      fprintf (asm_out_file, "\tsub\t$%d, ", PROBE_INTERVAL);
+      print_reg (stack_pointer_rtx, 0, asm_out_file);
+
+      /* LAST_ADDR = SP_0 + PROBE_INTERVAL + ROUNDED_SIZE.  */
+      fprintf (asm_out_file, "\n\tmov\t$-"HOST_WIDE_INT_PRINT_DEC", ",
+	       rounded_size);
+      print_reg (reg, 0, asm_out_file);
+      fputs ("\n\tadd\t", asm_out_file);
+      print_reg (stack_pointer_rtx, 0, asm_out_file);
+      fputs (", ", asm_out_file);
+      print_reg (reg, 0, asm_out_file);
+      fputc ('\n', asm_out_file);
+
+
+      /* Step 3: the loop
+
+	  while (SP != LAST_ADDR)
+	    {
+	      SP = SP + PROBE_INTERVAL
+	      probe at SP
+	    }
+
+	 adjusts SP and probes to PROBE_INTERVAL + N * PROBE_INTERVAL for
+	 values of N from 1 until it is equal to ROUNDED_SIZE.  */
+
+      ASM_GENERATE_INTERNAL_LABEL (loop_lab, "LPSRL", labelno);
+      ASM_OUTPUT_LABEL (asm_out_file, loop_lab);
+
+      /* Jump to END_LAB if SP == LAST_ADDR.  */
+      fputs ("\tcmp\t", asm_out_file);
+      print_reg (stack_pointer_rtx, 0, asm_out_file);
+      fputs (", ", asm_out_file);
+      print_reg (reg, 0, asm_out_file);
+      fputc ('\n', asm_out_file);
+      ASM_GENERATE_INTERNAL_LABEL (end_lab, "LPSRE", labelno++);
+      fputs ("\tje\t", asm_out_file); assemble_name (asm_out_file, end_lab);
+      fputc ('\n', asm_out_file);
+
+      /* SP = SP + PROBE_INTERVAL and probe at SP.  */
+      output_adjust_stack_and_probe_op (PROBE_INTERVAL);
+
+      fprintf (asm_out_file, "\tjmp\t"); assemble_name (asm_out_file, loop_lab);
+      fputc ('\n', asm_out_file);
+
+      ASM_OUTPUT_LABEL (asm_out_file, end_lab);
+
+
+      /* Step 4: adjust SP and probe to PROBE_INTERVAL + SIZE if we cannot
+	 assert at compile-time that SIZE is equal to ROUNDED_SIZE.  */
+      if (size != rounded_size)
+	output_adjust_stack_and_probe_op (size - rounded_size);
+    }
+
+  /* Adjust back to account for the additional first interval.  */
+  fprintf (asm_out_file, "\tadd\t$%d, ", PROBE_INTERVAL);
+  print_reg (stack_pointer_rtx, 0, asm_out_file);
+  fputc ('\n', asm_out_file);
+
+  return "";
+}
+
+/* Wrapper around gen_adjust_stack_and_probe.  */
+
+static rtx
+ix86_gen_adjust_stack_and_probe (rtx op0, rtx op1)
+{
+  if (TARGET_64BIT)
+    return gen_adjust_stack_and_probedi (op0, op1);
+  else
+    return gen_adjust_stack_and_probesi (op0, op1);
+}
+
+/* Emit code to adjust the stack by SIZE bytes while probing it.  */
+
+static void
+ix86_adjust_stack_and_probe (HOST_WIDE_INT size)
+{
+  rtx size_rtx = GEN_INT (size);
+
+  if (SMALL_INTERVAL (size))
+    emit_insn (ix86_gen_adjust_stack_and_probe (size_rtx, const0_rtx));
+  else
+    {
+      struct scratch_reg sr;
+      get_scratch_register_on_entry (&sr);
+      emit_insn (ix86_gen_adjust_stack_and_probe (size_rtx, sr.reg));
+      release_scratch_register_on_entry (&sr);
+    }
+
+  gcc_assert (ix86_cfa_state->reg != stack_pointer_rtx);
+
+  /* Make sure nothing is scheduled before we are done.  */
+  emit_insn (gen_blockage ());
+}
+
+/* Output one probe at OFFSET + INDEX from the current stack pointer.  */
+
+static void
+output_probe_stack_range_op (HOST_WIDE_INT offset, rtx index)
+{
+  output_probe_op ();
+  if (offset)
+    fprintf (asm_out_file, "-"HOST_WIDE_INT_PRINT_DEC, offset);
+  fputc ('(', asm_out_file);
+  print_reg (stack_pointer_rtx, 0, asm_out_file);
+  if (index)
+    {
+      fputc (',', asm_out_file);
+      print_reg (index, 0, asm_out_file);
+      fputs (",1", asm_out_file);
+    }
+  fputs (")\n", asm_out_file);
+}
+
+/* Probe a range of stack addresses from FIRST to FIRST+SIZE, inclusive.
+   These are offsets from the current stack pointer.  */
+
+const char *
+output_probe_stack_range (rtx first_rtx, rtx size_rtx, rtx reg)
+{
+  static int labelno = 0;
+  HOST_WIDE_INT first = INTVAL (first_rtx);
+  HOST_WIDE_INT size = INTVAL (size_rtx);
+  HOST_WIDE_INT rounded_size;
+  char loop_lab[32], end_lab[32];
+
+  /* See if we have a constant small number of probes to generate.  If so,
+     that's the easy case.  */
+  if (SMALL_INTERVAL (size))
+    {
+      HOST_WIDE_INT i;
+
+      /* Probe at FIRST + N * PROBE_INTERVAL for values of N from 1 until
+	 it exceeds SIZE.  If only one probe is needed, this will not
+	 generate any code.  Then probe at FIRST + SIZE.  */
+      for (i = PROBE_INTERVAL; i < size; i += PROBE_INTERVAL)
+	output_probe_stack_range_op (first + i, NULL_RTX);
+
+      output_probe_stack_range_op (first + size, NULL_RTX);
+    }
+
+  /* Otherwise, do the same as above, but in a loop.  Note that we must be
+     extra careful with variables wrapping around because we might be at
+     the very top (or the very bottom) of the address space and we have
+     to be able to handle this case properly; in particular, we use an
+     equality test for the loop condition.  */
+  else
+    {
+      /* Step 1: round SIZE to the previous multiple of the interval.  */
+      rounded_size = size & -PROBE_INTERVAL;
+
+
+      /* Step 2: compute initial and final value of the loop counter.  */
+
+      /* TEST_OFFSET = FIRST.  */
+      fprintf (asm_out_file, "\tmov\t$-"HOST_WIDE_INT_PRINT_DEC", ", first);
+      print_reg (reg, 0, asm_out_file);
+      fputc ('\n', asm_out_file);
+
+      /* LAST_OFFSET = FIRST + ROUNDED_SIZE.  */
+
+
+      /* Step 3: the loop
+
+	 while (TEST_ADDR != LAST_ADDR)
+	   {
+	     TEST_ADDR = TEST_ADDR + PROBE_INTERVAL
+	     probe at TEST_ADDR
+	   }
+
+	 probes at FIRST + N * PROBE_INTERVAL for values of N from 1
+	 until it is equal to ROUNDED_SIZE.  */
+
+      ASM_GENERATE_INTERNAL_LABEL (loop_lab, "LPSRL", labelno);
+      ASM_OUTPUT_LABEL (asm_out_file, loop_lab);
+
+      /* Jump to END_LAB if TEST_ADDR == LAST_ADDR.  */
+      fprintf (asm_out_file, "\tcmp\t$-"HOST_WIDE_INT_PRINT_DEC", ",
+	       first + rounded_size);
+      print_reg (reg, 0, asm_out_file);
+      fputc ('\n', asm_out_file);
+      ASM_GENERATE_INTERNAL_LABEL (end_lab, "LPSRE", labelno++);
+      fputs ("\tje\t", asm_out_file); assemble_name (asm_out_file, end_lab);
+      fputc ('\n', asm_out_file);
+
+      /* TEST_ADDR = TEST_ADDR + PROBE_INTERVAL.  */
+      fprintf (asm_out_file, "\tsub\t$%d, ", PROBE_INTERVAL);
+      print_reg (reg, 0, asm_out_file);
+      fputc ('\n', asm_out_file);
+
+      /* Probe at TEST_ADDR.  */
+      output_probe_stack_range_op (0, reg);
+
+      fprintf (asm_out_file, "\tjmp\t"); assemble_name (asm_out_file, loop_lab);
+      fputc ('\n', asm_out_file);
+
+      ASM_OUTPUT_LABEL (asm_out_file, end_lab);
+
+
+      /* Step 4: probe at FIRST + SIZE if we cannot assert at compile-time
+	 that SIZE is equal to ROUNDED_SIZE.  */
+      if (size != rounded_size)
+	output_probe_stack_range_op (size - rounded_size, reg);
+    }
+
+  return "";
+}
+
+/* Wrapper around gen_probe_stack_range.  */
+
+static rtx
+ix86_gen_probe_stack_range (rtx op0, rtx op1, rtx op2)
+{
+  if (TARGET_64BIT)
+    return gen_probe_stack_rangedi (op0, op1, op2);
+  else
+    return gen_probe_stack_rangesi (op0, op1, op2);
+}
+
+/* Emit code to probe a range of stack addresses from FIRST to FIRST+SIZE,
+   inclusive.  These are offsets from the current stack pointer.  */
+
+static void
+ix86_emit_probe_stack_range (HOST_WIDE_INT first, HOST_WIDE_INT size)
+{
+  if (SMALL_INTERVAL (size))
+     emit_insn (ix86_gen_probe_stack_range (GEN_INT (first), GEN_INT (size),
+					    const0_rtx));
+  else
+    {
+      struct scratch_reg sr;
+      get_scratch_register_on_entry (&sr);
+      emit_insn (ix86_gen_probe_stack_range (GEN_INT (first), GEN_INT (size),
+					     sr.reg));
+      release_scratch_register_on_entry (&sr);
+    }
+
+  /* Make sure nothing is scheduled before we are done.  */
+  emit_insn (gen_blockage ());
+}
+
 /* Finalize stack_realign_needed flag, which will guide prologue/epilogue
    to be generated in correct form.  */
 static void 
@@ -8383,6 +8790,31 @@ ix86_expand_prologue (void)
   else
     allocate += frame.nregs * UNITS_PER_WORD;
 
+  /* The stack has already been decremented by the instruction calling us
+     so we need to probe unconditionally to preserve the protection area.  */
+  if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK)
+    {
+      /* We expect the registers to be saved when probes are used.  */
+      gcc_assert (!frame.save_regs_using_mov);
+
+      if (STACK_CHECK_MOVING_SP)
+	{
+	  ix86_adjust_stack_and_probe (allocate);
+	  allocate = 0;
+	}
+      else
+	{
+	  const HOST_WIDE_INT max_size = 0x7fffffff - STACK_CHECK_PROTECT;
+	  HOST_WIDE_INT size = allocate;
+
+	  /* Don't bother probing more than 2 GB, this is easier.  */
+	  if (size > max_size)
+	    size = max_size;
+
+	  ix86_emit_probe_stack_range (STACK_CHECK_PROTECT, size);
+	}
+    }
+
   /* When using red zone we may start register saving before allocating
      the stack frame saving one cycle of the prologue. However I will
      avoid doing this if I am going to have to probe the stack since
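[Editorial note: a sanity check on the small-constant path of output_adjust_stack_and_probe.  The first adjustment covers two intervals — so that the probes land one interval past the allocation, preserving the protection area — and the trailing `add` restores that extra interval, so the net stack adjustment is exactly SIZE.  A plain-C sketch of the arithmetic, with PROBE_INTERVAL hard-coded to the 4096-byte default:]

```c
#include <assert.h>

#define PROBE_INTERVAL 4096

/* Sum the stack adjustments emitted by the small-constant path and
   subtract the final restoring "add": the result must equal SIZE.  */
static long
net_adjustment (long size)
{
  long i, adjusted = 0;
  int first_probe = 1;

  for (i = PROBE_INTERVAL; i < size; i += PROBE_INTERVAL)
    {
      /* First adjustment skips an extra interval.  */
      adjusted += first_probe ? 2 * PROBE_INTERVAL : PROBE_INTERVAL;
      first_probe = 0;
    }

  /* Final adjust-and-probe to PROBE_INTERVAL + SIZE.  */
  adjusted += first_probe ? size + PROBE_INTERVAL
                          : size + PROBE_INTERVAL - i;

  /* The trailing "add $PROBE_INTERVAL, %sp" gives back the extra
     interval, leaving a net adjustment of exactly SIZE.  */
  return adjusted - PROBE_INTERVAL;
}
```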
Index: libada/Makefile.in
===================================================================
--- libada/Makefile.in	(revision 152265)
+++ libada/Makefile.in	(working copy)
@@ -58,6 +58,8 @@ WARN_CFLAGS = @warn_cflags@
 
 TARGET_LIBGCC2_CFLAGS=
 GNATLIBCFLAGS= -g -O2
+GNATLIBCFLAGS_FOR_C = $(GNATLIBCFLAGS) $(TARGET_LIBGCC2_CFLAGS) -fexceptions \
+	-DIN_RTS @have_getipinfo@
 
 # Get target-specific overrides for TARGET_LIBGCC2_CFLAGS.
 host_subdir = @host_subdir@
@@ -80,6 +82,7 @@ LIBADA_FLAGS_TO_PASS = \
         "SHELL=$(SHELL)" \
         "GNATLIBFLAGS=$(GNATLIBFLAGS) $(MULTIFLAGS)" \
         "GNATLIBCFLAGS=$(GNATLIBCFLAGS) $(MULTIFLAGS)" \
+        "GNATLIBCFLAGS_FOR_C=$(GNATLIBCFLAGS_FOR_C) $(MULTIFLAGS)" \
         "TARGET_LIBGCC2_CFLAGS=$(TARGET_LIBGCC2_CFLAGS)" \
         "THREAD_KIND=$(THREAD_KIND)" \
         "TRACE=$(TRACE)" \
Index: libada/configure.ac
===================================================================
--- libada/configure.ac	(revision 152265)
+++ libada/configure.ac	(working copy)
@@ -18,6 +18,7 @@
 sinclude(../config/acx.m4)
 sinclude(../config/multi.m4)
 sinclude(../config/override.m4)
+sinclude(../config/unwind_ipinfo.m4)
 
 AC_INIT
 AC_PREREQ([2.64])
@@ -131,6 +132,14 @@ else
 fi
 AC_SUBST([default_gnatlib_target])
 
+# Check for _Unwind_GetIPInfo
+GCC_CHECK_UNWIND_GETIPINFO
+have_getipinfo=
+if test x$have_unwind_getipinfo = xyes; then
+  have_getipinfo=-DHAVE_GETIPINFO
+fi
+AC_SUBST(have_getipinfo)
+
 warn_cflags=
 if test "x$GCC" = "xyes"; then
   warn_cflags='$(GCC_WARN_CFLAGS)'

[-- Attachment #3: stack_check1.adb --]
[-- Type: text/x-adasrc, Size: 511 bytes --]

-- { dg-do run { target i?86-*-linux* x86_64-*-linux* i?86-*-solaris2.* } }
-- { dg-options "-fstack-check" }

procedure Stack_Check1 is

  type A is Array (1..2048) of Integer;

  procedure Consume_Stack (N : Integer) is
    My_A : A; -- 8 KB static
  begin
    My_A (1) := 0;
    if N <= 0 then
      return;
    end if;
    Consume_Stack (N-1);
  end;

begin

  begin
    Consume_Stack (Integer'Last);
    raise Program_Error;
  exception
    when Storage_Error => null;
  end;

  Consume_Stack (128);

end;

[-- Attachment #4: stack_check2.adb --]
[-- Type: text/x-adasrc, Size: 595 bytes --]

-- { dg-do run { target i?86-*-linux* x86_64-*-linux* i?86-*-solaris2.* } }
-- { dg-options "-fstack-check" }

procedure Stack_Check2 is

  function UB return Integer is
  begin
    return 2048;
  end;

  type A is Array (Positive range <>) of Integer;

  procedure Consume_Stack (N : Integer) is
    My_A : A (1..UB); -- 8 KB dynamic
  begin
    My_A (1) := 0;
    if N <= 0 then
      return;
    end if;
    Consume_Stack (N-1);
  end;

begin

  begin
    Consume_Stack (Integer'Last);
    raise Program_Error;
  exception
    when Storage_Error => null;
  end;

  Consume_Stack (128);

end;


* Re: [Patch] New -fstack-check implementation (2/n)
  2009-09-29 23:25   ` Eric Botcazou
@ 2009-10-29 17:18     ` Eric Botcazou
  2009-10-31  3:07       ` Ian Lance Taylor
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Botcazou @ 2009-10-29 17:18 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 3636 bytes --]

> OK, I've removed all the non-essential bits from the patch and left some
> broken stuff in the compiler as-is.  This implements working stack checking
> for x86/x86-64, Linux and Solaris only.  Full ACATS passes with
> -fstack-check as well as with -O2 -fstack-check.

Updated patch, the Ada bits have been installed independently in-between.

Re-tested on i586-suse-linux, OK for mainline?


2009-10-29  Eric Botcazou  <ebotcazou@adacore.com>

        PR target/10127
        PR ada/20548
        * expr.h (STACK_CHECK_PROBE_INTERVAL): Delete.
        (STACK_CHECK_PROBE_INTERVAL_EXP): New macro.
        (STACK_CHECK_MOVING_SP): Likewise.
        * system.h (STACK_CHECK_PROBE_INTERVAL): Poison it.
        * doc/tm.texi (Stack Checking): Delete STACK_CHECK_PROBE_INTERVAL.
        Document STACK_CHECK_PROBE_INTERVAL_EXP and STACK_CHECK_MOVING_SP.
        * explow.c (anti_adjust_stack_and_probe): New function.
        (allocate_dynamic_stack_space): Do not directly allocate space if
        STACK_CHECK_MOVING_SP, instead invoke above function.
        (emit_stack_probe): Handle probe_stack insn.
        (PROBE_INTERVAL): New macro.
        (STACK_GROW_OPTAB): Likewise.
        (STACK_HIGH, STACK_LOW): Likewise.
        (probe_stack_range): Remove support code for dedicated pattern.  Fix
        loop condition in the small constant case.  Rewrite in the general
        case to be immune to wraparounds.  Make sure the address of probes
        is valid.  Try to use [base + disp] addressing mode if possible.
        * ira.c (setup_eliminable_regset): Set frame_pointer_needed if stack
        checking is enabled and STACK_CHECK_MOVING_SP.
        * rtlanal.c (may_trap_p_1) <MEM>: If stack checking is enabled,
        return 1 for volatile references to the stack pointer.
        * tree.c (build_common_builtin_nodes): Do not set ECF_NOTHROW on
        __builtin_alloca if stack checking is enabled.
        * unwind-dw2.c (uw_identify_context): Take into account whether the
        context is that of a signal frame or not.
        * config/i386/linux-unwind.h (x86_frob_update_context): New function.
        (MD_FROB_UPDATE_CONTEXT): Define.
        * config/i386/linux.h (STACK_CHECK_STATIC_BUILTIN): Likewise.
        (STACK_CHECK_MOVING_SP): Likewise.
        * config/i386/linux64.h (STACK_CHECK_STATIC_BUILTIN): Likewise.
        (STACK_CHECK_MOVING_SP): Likewise.
        * config/i386/sol2.h (STACK_CHECK_STATIC_BUILTIN): Likewise.
        * config/i386/i386.c (ix86_compute_frame_layout): Force use of push
        instructions to save registers if stack checking with probes is on.
        (get_scratch_register_on_entry): New function.
        (release_scratch_register_on_entry): Likewise.
        (output_probe_op): Likewise.
        (output_adjust_stack_and_probe_op): Likewise.
        (output_adjust_stack_and_probe): Likewise.
        (ix86_gen_adjust_stack_and_probe): Likewise.
        (ix86_adjust_stack_and_probe): Likewise.
        (output_probe_stack_range_op): Likewise.
        (ix86_gen_probe_stack_range): Likewise.
        (ix86_emit_probe_stack_range): Likewise.
        (ix86_expand_prologue): Emit stack checking code if static builtin
        stack checking is enabled.
        * config/i386/i386-protos.h (output_adjust_stack_and_probe): Declare.
        (output_probe_stack_range): Likewise.
        * config/i386/i386.md (UNSPECV_STACK_PROBE_INLINE): New constant.
        (probe_stack): New expander.
        (adjust_stack_and_probe): New insn.
        (probe_stack_range): Likewise.
        (logical operation peepholes): Do not split stack checking probes.
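For reference, the constant-size case of the rewritten probe_stack_range boils
down to emitting probes at a fixed list of offsets.  Here is a small host-side
model of that list (not GCC code; the function name and signature are made up
for illustration):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the offsets that the constant-size case of the
   rewritten probe_stack_range emits: probe at FIRST + N * PROBE_INTERVAL
   for N = 1, 2, ... while the offset stays strictly below FIRST + SIZE,
   then probe once at FIRST + SIZE itself.  If only one probe is needed,
   the loop contributes nothing, matching the comment in the patch.  */
static size_t
probe_offsets (long first, long size, long interval, long *out, size_t max)
{
  size_t n = 0;
  long i;

  for (i = interval; i < size; i += interval)
    if (n < max)
      out[n++] = first + i;

  /* Final probe at the full extent.  */
  if (n < max)
    out[n++] = first + size;
  return n;
}
```

With the default 4096-byte interval, a 10000-byte range yields probes at
4096, 8192 and 10000, and a range of exactly one interval yields a single
probe.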


-- 
Eric Botcazou

[-- Attachment #2: gcc-45_stack-check-2c.diff --]
[-- Type: text/x-diff, Size: 40704 bytes --]

Index: doc/tm.texi
===================================================================
--- doc/tm.texi	(revision 153694)
+++ doc/tm.texi	(working copy)
@@ -3542,11 +3542,12 @@ like to do static stack checking in some
 approach.  The default value of this macro is zero.
 @end defmac
 
-@defmac STACK_CHECK_PROBE_INTERVAL
-An integer representing the interval at which GCC must generate stack
-probe instructions.  You will normally define this macro to be no larger
-than the size of the ``guard pages'' at the end of a stack area.  The
-default value of 4096 is suitable for most systems.
+@defmac STACK_CHECK_PROBE_INTERVAL_EXP
+An integer specifying the interval at which GCC must generate stack probe
+instructions, defined as 2 raised to this integer.  You will normally
+define this macro so that the interval is no larger than the size of
+the ``guard pages'' at the end of a stack area.  The default value
+of 12 (4096-byte interval) is suitable for most systems.
 @end defmac
 
 @defmac STACK_CHECK_PROBE_LOAD
@@ -3555,6 +3556,15 @@ as a load instruction and zero if GCC sh
 The default is zero, which is the most efficient choice on most systems.
 @end defmac
 
+@defmac STACK_CHECK_MOVING_SP
+An integer which is nonzero if GCC should move the stack pointer page by page
+when doing probes.  This can be necessary on systems where the stack pointer
+contains the bottom address of the memory area accessible to the executing
+thread at any point in time.  In this situation an alternate signal stack
+is required in order to be able to recover from a stack overflow.  The
+default value of this macro is zero.
+@end defmac
+
 @defmac STACK_CHECK_PROTECT
 The number of bytes of stack needed to recover from a stack overflow,
 for languages where such a recovery is supported.  The default value of
Index: tree.c
===================================================================
--- tree.c	(revision 153694)
+++ tree.c	(working copy)
@@ -9009,7 +9009,8 @@ build_common_builtin_nodes (void)
       tmp = tree_cons (NULL_TREE, size_type_node, void_list_node);
       ftype = build_function_type (ptr_type_node, tmp);
       local_define_builtin ("__builtin_alloca", ftype, BUILT_IN_ALLOCA,
-			    "alloca", ECF_NOTHROW | ECF_MALLOC);
+			    "alloca",
+			    ECF_MALLOC | (flag_stack_check ? 0 : ECF_NOTHROW));
     }
 
   tmp = tree_cons (NULL_TREE, ptr_type_node, void_list_node);
Index: rtlanal.c
===================================================================
--- rtlanal.c	(revision 153694)
+++ rtlanal.c	(working copy)
@@ -2252,6 +2252,11 @@ may_trap_p_1 (const_rtx x, unsigned flag
 
       /* Memory ref can trap unless it's a static var or a stack slot.  */
     case MEM:
+      /* Recognize specific pattern of stack checking probes.  */
+      if (flag_stack_check
+	  && MEM_VOLATILE_P (x)
+	  && XEXP (x, 0) == stack_pointer_rtx)
+	return 1;
       if (/* MEM_NOTRAP_P only relates to the actual position of the memory
 	     reference; moving it out of context such as when moving code
 	     when optimizing, might cause its address to become invalid.  */
Index: expr.h
===================================================================
--- expr.h	(revision 153694)
+++ expr.h	(working copy)
@@ -218,9 +218,9 @@ do {								\
 #define STACK_CHECK_STATIC_BUILTIN 0
 #endif
 
-/* The default interval is one page.  */
-#ifndef STACK_CHECK_PROBE_INTERVAL
-#define STACK_CHECK_PROBE_INTERVAL 4096
+/* The default interval is one page (4096 bytes).  */
+#ifndef STACK_CHECK_PROBE_INTERVAL_EXP
+#define STACK_CHECK_PROBE_INTERVAL_EXP 12
 #endif
 
 /* The default is to do a store into the stack.  */
@@ -228,6 +228,11 @@ do {								\
 #define STACK_CHECK_PROBE_LOAD 0
 #endif
 
+/* The default is not to move the stack pointer.  */
+#ifndef STACK_CHECK_MOVING_SP
+#define STACK_CHECK_MOVING_SP 0
+#endif
+
 /* This is a kludge to try to capture the discrepancy between the old
    mechanism (generic stack checking) and the new mechanism (static
    builtin stack checking).  STACK_CHECK_PROTECT needs to be bumped
@@ -252,7 +257,7 @@ do {								\
    one probe per function.  */
 #ifndef STACK_CHECK_MAX_FRAME_SIZE
 #define STACK_CHECK_MAX_FRAME_SIZE \
-  (STACK_CHECK_PROBE_INTERVAL - UNITS_PER_WORD)
+  ((1 << STACK_CHECK_PROBE_INTERVAL_EXP) - UNITS_PER_WORD)
 #endif
 
 /* This is arbitrary, but should be large enough everywhere.  */
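The expr.h hunk above replaces the byte interval with its base-2 exponent; a
quick sketch of the derived values (macro names taken from the patch, word
size assumed to be a 64-bit target):

```c
#include <assert.h>

/* Mirror of the expr.h defaults after the patch: the probe interval is
   2 raised to STACK_CHECK_PROBE_INTERVAL_EXP, and the largest frame a
   single implicit probe can cover leaves room for one word below the
   interval boundary.  UNITS_PER_WORD is an assumption here.  */
#define STACK_CHECK_PROBE_INTERVAL_EXP 12
#define UNITS_PER_WORD 8
#define PROBE_INTERVAL (1 << STACK_CHECK_PROBE_INTERVAL_EXP)
#define STACK_CHECK_MAX_FRAME_SIZE (PROBE_INTERVAL - UNITS_PER_WORD)
```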
Index: unwind-dw2.c
===================================================================
--- unwind-dw2.c	(revision 153694)
+++ unwind-dw2.c	(working copy)
@@ -1559,7 +1559,13 @@ uw_install_context_1 (struct _Unwind_Con
 static inline _Unwind_Ptr
 uw_identify_context (struct _Unwind_Context *context)
 {
-  return _Unwind_GetCFA (context);
+  /* The CFA is not sufficient to disambiguate the context of a function
+     interrupted by a signal before establishing its frame and the context
+     of the signal itself.  */
+  if (STACK_GROWS_DOWNWARD)
+    return _Unwind_GetCFA (context) - _Unwind_IsSignalFrame (context);
+  else
+    return _Unwind_GetCFA (context) + _Unwind_IsSignalFrame (context);
 }
 
 
Index: explow.c
===================================================================
--- explow.c	(revision 153694)
+++ explow.c	(working copy)
@@ -43,6 +43,7 @@ along with GCC; see the file COPYING3.
 
 static rtx break_out_memory_refs (rtx);
 static void emit_stack_probe (rtx);
+static void anti_adjust_stack_and_probe (rtx);
 
 
 /* Truncate and perhaps sign-extend C as appropriate for MODE.  */
@@ -1235,7 +1236,9 @@ allocate_dynamic_stack_space (rtx size,
 
   /* If needed, check that we have the required amount of stack.
      Take into account what has already been checked.  */
-  if (flag_stack_check == GENERIC_STACK_CHECK)
+  if (STACK_CHECK_MOVING_SP)
+    ;
+  else if (flag_stack_check == GENERIC_STACK_CHECK)
     probe_stack_range (STACK_OLD_CHECK_PROTECT + STACK_CHECK_MAX_FRAME_SIZE,
 		       size);
   else if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK)
@@ -1304,7 +1307,10 @@ allocate_dynamic_stack_space (rtx size,
 	  emit_label (space_available);
 	}
 
-      anti_adjust_stack (size);
+      if (flag_stack_check && STACK_CHECK_MOVING_SP)
+	anti_adjust_stack_and_probe (size);
+      else
+	anti_adjust_stack (size);
 
 #ifdef STACK_GROWS_DOWNWARD
       emit_move_insn (target, virtual_stack_dynamic_rtx);
@@ -1355,6 +1361,12 @@ emit_stack_probe (rtx address)
 
   MEM_VOLATILE_P (memref) = 1;
 
+  /* See if we have an insn to probe the stack.  */
+#ifdef HAVE_probe_stack
+  if (HAVE_probe_stack)
+    emit_insn (gen_probe_stack (memref));
+  else
+#endif
   if (STACK_CHECK_PROBE_LOAD)
     emit_move_insn (gen_reg_rtx (word_mode), memref);
   else
@@ -1367,10 +1379,18 @@ emit_stack_probe (rtx address)
    subtract from the stack.  If SIZE is constant, this is done
    with a fixed number of probes.  Otherwise, we must make a loop.  */
 
+#define PROBE_INTERVAL (1 << STACK_CHECK_PROBE_INTERVAL_EXP)
+
 #ifdef STACK_GROWS_DOWNWARD
-#define STACK_GROW_OP MINUS
+#define STACK_GROW_OP     MINUS
+#define STACK_GROW_OPTAB  sub_optab
+#define STACK_HIGH(high,low)  low
+#define STACK_LOW(high,low)   high
 #else
-#define STACK_GROW_OP PLUS
+#define STACK_GROW_OP     PLUS
+#define STACK_GROW_OPTAB  add_optab
+#define STACK_HIGH(high,low)  high
+#define STACK_LOW(high,low)   low
 #endif
 
 void
@@ -1394,99 +1414,252 @@ probe_stack_range (HOST_WIDE_INT first,
 			 ptr_mode);
     }
 
-  /* Next see if we have an insn to check the stack.  Use it if so.  */
-#ifdef HAVE_check_stack
-  else if (HAVE_check_stack)
+  /* Otherwise we have to generate explicit probes.  If we have a constant
+     small number of them to generate, that's the easy case.  */
+  else if (CONST_INT_P (size) && INTVAL (size) < 7 * PROBE_INTERVAL)
+    {
+      HOST_WIDE_INT i, offset, size_int = INTVAL (size);
+      rtx addr;
+
+      /* Probe at FIRST + N * PROBE_INTERVAL for values of N from 1 until
+	 it exceeds SIZE.  If only one probe is needed, this will not
+	 generate any code.  Then probe at FIRST + SIZE.  */
+      for (i = PROBE_INTERVAL; i < size_int; i += PROBE_INTERVAL)
+	{
+	  offset = first + i;
+#ifdef STACK_GROWS_DOWNWARD
+	  offset = -offset;
+#endif
+	  addr = memory_address (Pmode,
+				 plus_constant (stack_pointer_rtx, offset));
+	  emit_stack_probe (addr);
+	}
+
+      offset = first + size_int;
+#ifdef STACK_GROWS_DOWNWARD
+      offset = -offset;
+#endif
+      addr = memory_address (Pmode, plus_constant (stack_pointer_rtx, offset));
+      emit_stack_probe (addr);
+    }
+
+  /* In the variable case, do the same as above, but in a loop.  Note that we
+     must be extra careful with variables wrapping around because we might be
+     at the very top (or the very bottom) of the address space and we have to
+     be able to handle this case properly; in particular, we use an equality
+     test for the loop condition.  */
+  else
     {
-      insn_operand_predicate_fn pred;
-      rtx last_addr
-	= force_operand (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode,
-					 stack_pointer_rtx,
-					 plus_constant (size, first)),
-			 NULL_RTX);
+      rtx rounded_size, rounded_size_op, test_addr, last_addr, temp;
+      rtx loop_lab = gen_label_rtx ();
+      rtx end_lab = gen_label_rtx ();
 
-      pred = insn_data[(int) CODE_FOR_check_stack].operand[0].predicate;
-      if (pred && ! ((*pred) (last_addr, Pmode)))
-	last_addr = copy_to_mode_reg (Pmode, last_addr);
+      /* Step 1: round SIZE to the previous multiple of the interval.  */
 
-      emit_insn (gen_check_stack (last_addr));
-    }
+      /* ROUNDED_SIZE = SIZE & -PROBE_INTERVAL  */
+      rounded_size = simplify_gen_binary (AND, Pmode,
+					  size,
+					  GEN_INT (-PROBE_INTERVAL));
+      rounded_size_op = force_operand (rounded_size, NULL_RTX);
+
+
+      /* Step 2: compute initial and final value of the loop counter.  */
+
+      /* TEST_ADDR = SP + FIRST.  */
+      test_addr = force_operand (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode,
+					 	 stack_pointer_rtx,
+					 	 GEN_INT (first)),
+				 NULL_RTX);
+
+      /* LAST_ADDR = SP + FIRST + ROUNDED_SIZE.  */
+      last_addr = force_operand (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode,
+						 test_addr,
+						 rounded_size_op),
+				 NULL_RTX);
+
+
+      /* Step 3: the loop
+
+	 while (TEST_ADDR != LAST_ADDR)
+	   {
+	     TEST_ADDR = TEST_ADDR + PROBE_INTERVAL
+	     probe at TEST_ADDR
+	   }
+
+	 probes at FIRST + N * PROBE_INTERVAL for values of N from 1
+	 until it is equal to ROUNDED_SIZE.  */
+
+      emit_label (loop_lab);
+
+      /* Jump to END_LAB if TEST_ADDR == LAST_ADDR.  */
+      emit_cmp_and_jump_insns (test_addr, last_addr, EQ,
+			       NULL_RTX, Pmode, 1, end_lab);
+
+      /* TEST_ADDR = TEST_ADDR + PROBE_INTERVAL.  */
+      temp = expand_binop (Pmode, STACK_GROW_OPTAB, test_addr,
+			   GEN_INT (PROBE_INTERVAL), test_addr,
+			   1, OPTAB_WIDEN);
+
+      gcc_assert (temp == test_addr);
+
+      /* Probe at TEST_ADDR.  */
+      emit_stack_probe (test_addr);
+
+      emit_jump (loop_lab);
+
+      emit_label (end_lab);
+
+      /* Step 4: probe at FIRST + SIZE if we cannot assert at compile-time
+	 that SIZE is equal to ROUNDED_SIZE.  */
+
+      /* TEMP = SIZE - ROUNDED_SIZE.  */
+      temp = simplify_gen_binary (MINUS, Pmode, size, rounded_size);
+      if (temp != const0_rtx)
+	{
+	  rtx addr;
+
+	  if (GET_CODE (temp) == CONST_INT)
+	    {
+	      /* Use [base + disp] addressing mode if supported.  */
+	      HOST_WIDE_INT offset = INTVAL (temp);
+#ifdef STACK_GROWS_DOWNWARD
+	      offset = -offset;
 #endif
+	      addr = memory_address (Pmode, plus_constant (last_addr, offset));
+	    }
+	  else
+	    {
+	      /* Manual CSE if the difference is not known at compile-time.  */
+	      temp = gen_rtx_MINUS (Pmode, size, rounded_size_op);
+	      addr = memory_address (Pmode,
+				     gen_rtx_fmt_ee (STACK_GROW_OP, Pmode,
+						     last_addr, temp));
+	    }
 
-  /* If we have to generate explicit probes, see if we have a constant
-     small number of them to generate.  If so, that's the easy case.  */
-  else if (CONST_INT_P (size)
-	   && INTVAL (size) < 10 * STACK_CHECK_PROBE_INTERVAL)
-    {
-      HOST_WIDE_INT offset;
+	  emit_stack_probe (addr);
+	}
+    }
+}
+
+/* Adjust the stack by SIZE bytes while probing it.  Note that we skip
+   the probe for the first interval and instead probe one interval past
+   the specified size in order to maintain a protection area.  */
+
+static void
+anti_adjust_stack_and_probe (rtx size)
+{
+  rtx probe_interval = GEN_INT (PROBE_INTERVAL);
 
-      /* Start probing at FIRST + N * STACK_CHECK_PROBE_INTERVAL
-	 for values of N from 1 until it exceeds LAST.  If only one
-	 probe is needed, this will not generate any code.  Then probe
-	 at LAST.  */
-      for (offset = first + STACK_CHECK_PROBE_INTERVAL;
-	   offset < INTVAL (size);
-	   offset = offset + STACK_CHECK_PROBE_INTERVAL)
-	emit_stack_probe (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode,
-					  stack_pointer_rtx,
-					  GEN_INT (offset)));
+  /* First ensure SIZE is Pmode.  */
+  if (GET_MODE (size) != VOIDmode && GET_MODE (size) != Pmode)
+    size = convert_to_mode (Pmode, size, 1);
 
-      emit_stack_probe (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode,
-					stack_pointer_rtx,
-					plus_constant (size, first)));
+  /* If we have a constant small number of probes to generate, that's the
+     easy case.  */
+  if (GET_CODE (size) == CONST_INT && INTVAL (size) < 7 * PROBE_INTERVAL)
+    {
+      HOST_WIDE_INT i, int_size = INTVAL (size);
+      bool first_probe = true;
+
+      /* Adjust SP and probe to PROBE_INTERVAL + N * PROBE_INTERVAL for
+	 values of N from 1 until it exceeds SIZE.  If only one probe is
+	 needed, this will not generate any code.  Then adjust and probe
+	 to PROBE_INTERVAL + SIZE.  */
+      for (i = PROBE_INTERVAL; i < int_size; i += PROBE_INTERVAL)
+	{
+	  if (first_probe)
+	    {
+	      anti_adjust_stack (GEN_INT (2 * PROBE_INTERVAL));
+	      first_probe = false;
+	    }
+	  else
+	    anti_adjust_stack (probe_interval);
+	  emit_stack_probe (stack_pointer_rtx);
+	}
+
+      if (first_probe)
+	anti_adjust_stack (plus_constant (size, PROBE_INTERVAL));
+      else
+	anti_adjust_stack (plus_constant (size, PROBE_INTERVAL - i));
+      emit_stack_probe (stack_pointer_rtx);
     }
 
-  /* In the variable case, do the same as above, but in a loop.  We emit loop
-     notes so that loop optimization can be done.  */
+  /* In the variable case, do the same as above, but in a loop.  Note that we
+     must be extra careful with variables wrapping around because we might be
+     at the very top (or the very bottom) of the address space and we have to
+     be able to handle this case properly; in particular, we use an equality
+     test for the loop condition.  */
   else
     {
-      rtx test_addr
-	= force_operand (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode,
-					 stack_pointer_rtx,
-					 GEN_INT (first + STACK_CHECK_PROBE_INTERVAL)),
-			 NULL_RTX);
-      rtx last_addr
-	= force_operand (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode,
-					 stack_pointer_rtx,
-					 plus_constant (size, first)),
-			 NULL_RTX);
-      rtx incr = GEN_INT (STACK_CHECK_PROBE_INTERVAL);
+      rtx rounded_size, rounded_size_op, last_addr, temp;
       rtx loop_lab = gen_label_rtx ();
-      rtx test_lab = gen_label_rtx ();
       rtx end_lab = gen_label_rtx ();
-      rtx temp;
 
-      if (!REG_P (test_addr)
-	  || REGNO (test_addr) < FIRST_PSEUDO_REGISTER)
-	test_addr = force_reg (Pmode, test_addr);
+      /* Step 1: round SIZE to the previous multiple of the interval.  */
+
+      /* ROUNDED_SIZE = SIZE & -PROBE_INTERVAL  */
+      rounded_size = simplify_gen_binary (AND, Pmode,
+					  size,
+					  GEN_INT (-PROBE_INTERVAL));
+      rounded_size_op = force_operand (rounded_size, NULL_RTX);
 
-      emit_jump (test_lab);
+
+      /* Step 2: compute initial and final value of the loop counter.  */
+
+      /* SP = SP_0 + PROBE_INTERVAL.  */
+      anti_adjust_stack (probe_interval);
+
+      /* LAST_ADDR = SP_0 + PROBE_INTERVAL + ROUNDED_SIZE.  */
+      last_addr = force_operand (gen_rtx_fmt_ee (STACK_GROW_OP, Pmode,
+						 stack_pointer_rtx,
+						 rounded_size_op),
+				 NULL_RTX);
+
+
+      /* Step 3: the loop
+
+	  while (SP != LAST_ADDR)
+	    {
+	      SP = SP + PROBE_INTERVAL
+	      probe at SP
+	    }
+
+	 adjusts SP and probes to PROBE_INTERVAL + N * PROBE_INTERVAL for
+	 values of N from 1 until it is equal to ROUNDED_SIZE.  */
 
       emit_label (loop_lab);
-      emit_stack_probe (test_addr);
 
-#ifdef STACK_GROWS_DOWNWARD
-#define CMP_OPCODE GTU
-      temp = expand_binop (Pmode, sub_optab, test_addr, incr, test_addr,
-			   1, OPTAB_WIDEN);
-#else
-#define CMP_OPCODE LTU
-      temp = expand_binop (Pmode, add_optab, test_addr, incr, test_addr,
-			   1, OPTAB_WIDEN);
-#endif
+      /* Jump to END_LAB if SP == LAST_ADDR.  */
+      emit_cmp_and_jump_insns (stack_pointer_rtx, last_addr, EQ,
+			       NULL_RTX, Pmode, 1, end_lab);
+
+      /* SP = SP + PROBE_INTERVAL and probe at SP.  */
+      anti_adjust_stack (probe_interval);
+      emit_stack_probe (stack_pointer_rtx);
 
-      gcc_assert (temp == test_addr);
+      emit_jump (loop_lab);
 
-      emit_label (test_lab);
-      emit_cmp_and_jump_insns (test_addr, last_addr, CMP_OPCODE,
-			       NULL_RTX, Pmode, 1, loop_lab);
-      emit_jump (end_lab);
       emit_label (end_lab);
 
-      emit_stack_probe (last_addr);
+      /* Step 4: adjust SP and probe to PROBE_INTERVAL + SIZE if we cannot
+	 assert at compile-time that SIZE is equal to ROUNDED_SIZE.  */
+
+      /* TEMP = SIZE - ROUNDED_SIZE.  */
+      temp = simplify_gen_binary (MINUS, Pmode, size, rounded_size);
+      if (temp != const0_rtx)
+	{
+	  /* Manual CSE if the difference is not known at compile-time.  */
+	  if (GET_CODE (temp) != CONST_INT)
+	    temp = gen_rtx_MINUS (Pmode, size, rounded_size_op);
+	  anti_adjust_stack (temp);
+	  emit_stack_probe (stack_pointer_rtx);
+	}
     }
+
+  /* Adjust back to account for the additional first interval.  */
+  adjust_stack (probe_interval);
 }
-\f
+
 /* Return an rtx representing the register or memory location
    in which a scalar value of data type VALTYPE
    was returned by a function call to function FUNC.
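The point of the equality test in the new loops is robustness at the edges of
the address space: an ordered comparison on TEST_ADDR can give the wrong
answer once the arithmetic wraps, while stepping by a whole interval and
stopping on equality cannot.  A host-side model using a 16-bit "address
space" so the wrap is easy to trigger (everything below is illustrative, not
GCC code):

```c
#include <assert.h>
#include <stdint.h>

/* Model of step 3 of the rewritten probe_stack_range for a downward-
   growing stack.  ROUNDED_SIZE is SIZE rounded down to a multiple of
   INTERVAL; TEST_ADDR advances one interval per iteration and the loop
   terminates on *equality* with LAST_ADDR, which stays correct even if
   the subtraction wraps past address zero.  */
static unsigned
count_probes (uint16_t sp, uint16_t first, uint16_t size, uint16_t interval)
{
  uint16_t rounded = size & (uint16_t) -interval;
  uint16_t test_addr = sp - first;
  uint16_t last_addr = test_addr - rounded;
  unsigned probes = 0;

  while (test_addr != last_addr)
    {
      test_addr -= interval;   /* may wrap below zero */
      probes++;                /* probe at test_addr */
    }
  return probes;
}
```

With SP near the bottom of the 16-bit space (0x1000) and a 0x3000-byte
rounded size, the walk wraps through 0x0000 to 0xF000 and 0xE000 yet still
performs exactly three probes.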
Index: ira.c
===================================================================
--- ira.c	(revision 153694)
+++ ira.c	(working copy)
@@ -1442,6 +1442,9 @@ ira_setup_eliminable_regset (void)
   int need_fp
     = (! flag_omit_frame_pointer
        || (cfun->calls_alloca && EXIT_IGNORE_STACK)
+       /* We need the frame pointer to catch stack overflow exceptions
+	  if the stack pointer is moving.  */
+       || (flag_stack_check && STACK_CHECK_MOVING_SP)
        || crtl->accesses_prior_frames
        || crtl->stack_realign_needed
        || targetm.frame_pointer_required ());
Index: system.h
===================================================================
--- system.h	(revision 153694)
+++ system.h	(working copy)
@@ -761,7 +761,7 @@ extern void fancy_abort (const char *, i
 	TARGET_ASM_EXCEPTION_SECTION TARGET_ASM_EH_FRAME_SECTION	   \
 	SMALL_ARG_MAX ASM_OUTPUT_SHARED_BSS ASM_OUTPUT_SHARED_COMMON	   \
 	ASM_OUTPUT_SHARED_LOCAL UNALIGNED_WORD_ASM_OP			   \
-	ASM_MAKE_LABEL_LINKONCE
+	ASM_MAKE_LABEL_LINKONCE STACK_CHECK_PROBE_INTERVAL
 
 /* Hooks that are no longer used.  */
  #pragma GCC poison LANG_HOOKS_FUNCTION_MARK LANG_HOOKS_FUNCTION_FREE	\
Index: config/i386/linux.h
===================================================================
--- config/i386/linux.h	(revision 153694)
+++ config/i386/linux.h	(working copy)
@@ -207,6 +207,12 @@ along with GCC; see the file COPYING3.
 
 #define MD_UNWIND_SUPPORT "config/i386/linux-unwind.h"
 
+/* Static stack checking is supported by means of probes.  */
+#define STACK_CHECK_STATIC_BUILTIN 1
+
+/* The stack pointer needs to be moved while checking the stack.  */
+#define STACK_CHECK_MOVING_SP 1
+
 /* This macro may be overridden in i386/k*bsd-gnu.h.  */
 #define REG_NAME(reg) reg
 
Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md	(revision 153694)
+++ config/i386/i386.md	(working copy)
@@ -223,15 +223,16 @@ (define_constants
 (define_constants
   [(UNSPECV_BLOCKAGE		0)
    (UNSPECV_STACK_PROBE		1)
-   (UNSPECV_EMMS		2)
-   (UNSPECV_LDMXCSR		3)
-   (UNSPECV_STMXCSR		4)
-   (UNSPECV_FEMMS		5)
-   (UNSPECV_CLFLUSH		6)
-   (UNSPECV_ALIGN		7)
-   (UNSPECV_MONITOR		8)
-   (UNSPECV_MWAIT		9)
-   (UNSPECV_CMPXCHG		10)
+   (UNSPECV_STACK_PROBE_INLINE  2)
+   (UNSPECV_EMMS		3)
+   (UNSPECV_LDMXCSR		4)
+   (UNSPECV_STMXCSR		5)
+   (UNSPECV_FEMMS		6)
+   (UNSPECV_CLFLUSH		7)
+   (UNSPECV_ALIGN		8)
+   (UNSPECV_MONITOR		9)
+   (UNSPECV_MWAIT		10)
+   (UNSPECV_CMPXCHG		11)
    (UNSPECV_XCHG		12)
    (UNSPECV_LOCK		13)
    (UNSPECV_PROLOGUE_USE	14)
@@ -19961,6 +19962,38 @@ (define_expand "allocate_stack"
   DONE;
 })
 
+(define_expand "probe_stack"
+  [(match_operand 0 "memory_operand" "")]
+  ""
+{
+  if (GET_MODE (operands[0]) == DImode)
+    emit_insn (gen_iordi3 (operands[0], operands[0], const0_rtx));
+  else
+    emit_insn (gen_iorsi3 (operands[0], operands[0], const0_rtx));
+  DONE;
+})
+
+(define_insn "adjust_stack_and_probe<P:mode>"
+  [(unspec_volatile:P [(match_operand:P 0 "const_int_operand" "n")]
+    UNSPECV_STACK_PROBE_INLINE)
+   (set (reg:P SP_REG) (minus:P (reg:P SP_REG) (match_dup 0)))
+   (clobber (match_operand:P 1 "general_operand" "=rn"))
+   (clobber (reg:CC FLAGS_REG))
+   (clobber (mem:BLK (scratch)))]
+  ""
+  "* return output_adjust_stack_and_probe (operands[0], operands[1]);"
+  [(set_attr "type" "multi")])
+
+(define_insn "probe_stack_range<P:mode>"
+  [(unspec_volatile:P [(match_operand:P 0 "const_int_operand" "n")
+		       (match_operand:P 1 "const_int_operand" "n")]
+    UNSPECV_STACK_PROBE_INLINE)
+   (clobber (match_operand:P 2 "general_operand" "=rn"))
+   (clobber (reg:CC FLAGS_REG))]
+  ""
+  "* return output_probe_stack_range (operands[0], operands[1], operands[2]);"
+  [(set_attr "type" "multi")])
+
 (define_expand "builtin_setjmp_receiver"
   [(label_ref (match_operand 0 "" ""))]
   "!TARGET_64BIT && flag_pic"
@@ -20464,7 +20497,9 @@ (define_peephole2
                      [(match_dup 0)
                       (match_operand:SI 1 "nonmemory_operand" "")]))
               (clobber (reg:CC FLAGS_REG))])]
-  "optimize_insn_for_speed_p () && ! TARGET_READ_MODIFY_WRITE"
+  "optimize_insn_for_speed_p () && ! TARGET_READ_MODIFY_WRITE
+   /* Do not split stack checking probes.  */
+   && GET_CODE (operands[3]) != IOR && operands[1] != const0_rtx"
   [(set (match_dup 2) (match_dup 0))
    (parallel [(set (match_dup 2)
                    (match_op_dup 3 [(match_dup 2) (match_dup 1)]))
@@ -20479,7 +20514,9 @@ (define_peephole2
                      [(match_operand:SI 1 "nonmemory_operand" "")
                       (match_dup 0)]))
               (clobber (reg:CC FLAGS_REG))])]
-  "optimize_insn_for_speed_p () && ! TARGET_READ_MODIFY_WRITE"
+  "optimize_insn_for_speed_p () && ! TARGET_READ_MODIFY_WRITE
+   /* Do not split stack checking probes.  */
+   && GET_CODE (operands[3]) != IOR && operands[1] != const0_rtx"
   [(set (match_dup 2) (match_dup 0))
    (parallel [(set (match_dup 2)
                    (match_op_dup 3 [(match_dup 1) (match_dup 2)]))
Index: config/i386/sol2.h
===================================================================
--- config/i386/sol2.h	(revision 153694)
+++ config/i386/sol2.h	(working copy)
@@ -113,6 +113,9 @@ along with GCC; see the file COPYING3.
 #undef X86_FILE_START_VERSION_DIRECTIVE
 #define X86_FILE_START_VERSION_DIRECTIVE false
 
+/* Static stack checking is supported by means of probes.  */
+#define STACK_CHECK_STATIC_BUILTIN 1
+
 /* Only recent versions of Solaris 11 ld properly support hidden .gnu.linkonce
    sections, so don't use them.  */
 #ifndef TARGET_GNU_LD
Index: config/i386/linux64.h
===================================================================
--- config/i386/linux64.h	(revision 153694)
+++ config/i386/linux64.h	(working copy)
@@ -110,6 +110,12 @@ see the files COPYING3 and COPYING.RUNTI
 
 #define MD_UNWIND_SUPPORT "config/i386/linux-unwind.h"
 
+/* Static stack checking is supported by means of probes.  */
+#define STACK_CHECK_STATIC_BUILTIN 1
+
+/* The stack pointer needs to be moved while checking the stack.  */
+#define STACK_CHECK_MOVING_SP 1
+
 /* This macro may be overridden in i386/k*bsd-gnu.h.  */
 #define REG_NAME(reg) reg
 
Index: config/i386/linux-unwind.h
===================================================================
--- config/i386/linux-unwind.h	(revision 153694)
+++ config/i386/linux-unwind.h	(working copy)
@@ -172,6 +172,25 @@ x86_fallback_frame_state (struct _Unwind
   fs->signal_frame = 1;
   return _URC_NO_REASON;
 }
+
+#define MD_FROB_UPDATE_CONTEXT x86_frob_update_context
+
+/* Fix up for kernels that have vDSO, but don't have S flag in it.  */
+
+static void
+x86_frob_update_context (struct _Unwind_Context *context,
+			 _Unwind_FrameState *fs ATTRIBUTE_UNUSED)
+{
+  unsigned char *pc = context->ra;
+
+  /* movl $__NR_rt_sigreturn,%eax ; {int $0x80 | syscall}  */
+  if (*(unsigned char *)(pc+0) == 0xb8
+      && *(unsigned int *)(pc+1) == 173
+      && (*(unsigned short *)(pc+5) == 0x80cd
+	  || *(unsigned short *)(pc+5) == 0x050f))
+    _Unwind_SetSignalFrame (context, 1);
+}
+
 #endif /* not glibc 2.0 */
 #endif /* ifdef __x86_64__  */
 #endif /* ifdef inhibit_libc  */
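x86_frob_update_context above pattern-matches the sigreturn trampoline
("movl $__NR_rt_sigreturn, %eax" followed by "int $0x80" or "syscall") to
mark the frame as a signal frame on kernels whose vDSO lacks the S flag.  A
standalone sketch of the same byte-level check, decoding the operands by
hand instead of the unaligned pointer casts the real code uses:

```c
#include <assert.h>

/* Host-side restatement of the trampoline check: opcode 0xb8 starts
   "movl $imm32, %eax"; the immediate must be 173, the 32-bit Linux
   __NR_rt_sigreturn; the next two bytes must encode "int $0x80"
   (cd 80, i.e. 0x80cd little-endian) or "syscall" (0f 05).  */
static int
is_sigreturn_trampoline (const unsigned char *pc)
{
  unsigned int imm;
  unsigned short insn;

  if (pc[0] != 0xb8)
    return 0;
  imm = pc[1] | (pc[2] << 8) | (pc[3] << 16) | ((unsigned int) pc[4] << 24);
  insn = pc[5] | (pc[6] << 8);
  return imm == 173 && (insn == 0x80cd || insn == 0x050f);
}
```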
Index: config/i386/i386-protos.h
===================================================================
--- config/i386/i386-protos.h	(revision 153694)
+++ config/i386/i386-protos.h	(working copy)
@@ -69,6 +69,8 @@ extern const char *output_387_binary_op
 extern const char *output_387_reg_move (rtx, rtx*);
 extern const char *output_fix_trunc (rtx, rtx*, int);
 extern const char *output_fp_compare (rtx, rtx*, int, int);
+extern const char *output_adjust_stack_and_probe (rtx, rtx);
+extern const char *output_probe_stack_range (rtx, rtx, rtx);
 
 extern void ix86_expand_clear (rtx);
 extern void ix86_expand_move (enum machine_mode, rtx[]);
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 153694)
+++ config/i386/i386.c	(working copy)
@@ -7899,6 +7899,11 @@ ix86_compute_frame_layout (struct ix86_f
   else
     frame->save_regs_using_mov = false;
 
+  /* If static stack checking is enabled and done with probes, the registers
+     need to be saved before allocating the frame.  */
+  if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK)
+    frame->save_regs_using_mov = false;
+
   /* Skip return address.  */
   offset = UNITS_PER_WORD;
 
@@ -8277,6 +8282,408 @@ ix86_internal_arg_pointer (void)
   return virtual_incoming_args_rtx;
 }
 
+struct scratch_reg {
+  rtx reg;
+  bool saved;
+};
+
+/* Return a short-lived scratch register for use on function entry.
+   In 32-bit mode, it is valid only after the registers are saved
+   in the prologue.  This register must be released by means of
+   release_scratch_register_on_entry once it is dead.  */
+
+static void
+get_scratch_register_on_entry (struct scratch_reg *sr)
+{
+  int regno;
+
+  sr->saved = false;
+
+  if (TARGET_64BIT)
+    regno = FIRST_REX_INT_REG + 3; /* r11 */
+  else
+    {
+      tree decl = current_function_decl, fntype = TREE_TYPE (decl);
+      bool fastcall_p
+	= lookup_attribute ("fastcall", TYPE_ATTRIBUTES (fntype)) != NULL_TREE;
+      int regparm = ix86_function_regparm (fntype, decl);
+      int drap_regno
+	= crtl->drap_reg ? REGNO (crtl->drap_reg) : INVALID_REGNUM;
+
+      /* 'fastcall' sets regparm to 2 and uses ecx+edx.  */
+      if ((regparm < 1 || fastcall_p) && drap_regno != 0)
+	regno = 0;
+      else if (regparm < 2 && drap_regno != 1)
+	regno = 1;
+      else if (regparm < 3 && !fastcall_p && drap_regno != 2
+	       /* ecx is the static chain register.  */
+	       && !DECL_STATIC_CHAIN (decl))
+	regno = 2;
+      else if (ix86_save_reg (3, true))
+	regno = 3;
+      else if (ix86_save_reg (4, true))
+	regno = 4;
+      else if (ix86_save_reg (5, true))
+	regno = 5;
+      else
+	{
+	  regno = (drap_regno == 0 ? 1 : 0);
+	  sr->saved = true;
+	}
+    }
+
+  sr->reg = gen_rtx_REG (Pmode, regno);
+  if (sr->saved)
+    {
+      rtx insn = emit_insn (gen_push (sr->reg));
+      RTX_FRAME_RELATED_P (insn) = 1;
+    }
+}
+
+/* Release a scratch register obtained from the preceding function.  */
+
+static void
+release_scratch_register_on_entry (struct scratch_reg *sr)
+{
+  if (sr->saved)
+    {
+      rtx insn, x;
+
+      if (TARGET_64BIT)
+	insn = emit_insn (gen_popdi1 (sr->reg));
+      else
+	insn = emit_insn (gen_popsi1 (sr->reg));
+      RTX_FRAME_RELATED_P (insn) = 1;
+
+      /* The RTX_FRAME_RELATED_P mechanism doesn't know about pop.  */
+      x = plus_constant (stack_pointer_rtx, UNITS_PER_WORD);
+      x = gen_rtx_SET (VOIDmode, stack_pointer_rtx, x);
+      add_reg_note (insn, REG_CFA_ADJUST_CFA, x);
+    }
+}
+
+/* The run-time loop is made up of 8 insns in the generic case while the
+   compile-time unrolled sequence is made up of N insns for N intervals.  */
+#define PROBE_INTERVAL (1 << STACK_CHECK_PROBE_INTERVAL_EXP)
+#define SMALL_INTERVAL(size) ((size) <= 8 * PROBE_INTERVAL)
+
+/* Output one probe.  */
+
+static inline void
+output_probe_op (void)
+{
+  fputs (TARGET_64BIT ? "\torq\t$0, " : "\torl\t$0, ", asm_out_file);
+}
+
+/* Adjust the stack by SIZE bytes and output one probe.  */
+
+static void
+output_adjust_stack_and_probe_op (HOST_WIDE_INT size)
+{
+  fprintf (asm_out_file, "\tsub\t$"HOST_WIDE_INT_PRINT_DEC",", size);
+  print_reg (stack_pointer_rtx, 0, asm_out_file);
+  fputc ('\n', asm_out_file);
+  output_probe_op ();
+  fputc ('(', asm_out_file);
+  print_reg (stack_pointer_rtx, 0, asm_out_file);
+  fputs (")\n", asm_out_file);
+}
+
+/* Adjust the stack by SIZE bytes while probing it.  Note that we skip
+   the probe for the first interval and instead probe one interval past
+   the specified size in order to maintain a protection area.  */
+
+const char *
+output_adjust_stack_and_probe (rtx size_rtx, rtx reg)
+{
+  static int labelno = 0;
+  HOST_WIDE_INT size = INTVAL (size_rtx);
+  HOST_WIDE_INT rounded_size;
+  char loop_lab[32], end_lab[32];
+
+  /* See if we have a constant small number of probes to generate.  If so,
+     that's the easy case.  */
+  if (SMALL_INTERVAL (size))
+    {
+      HOST_WIDE_INT i;
+      bool first_probe = true;
+
+      /* Adjust SP and probe to PROBE_INTERVAL + N * PROBE_INTERVAL for
+	 values of N from 1 until it exceeds SIZE.  If only one probe is
+	 needed, this will not generate any code.  Then adjust and probe
+	 to PROBE_INTERVAL + SIZE.  */
+      for (i = PROBE_INTERVAL; i < size; i += PROBE_INTERVAL)
+	{
+	  if (first_probe)
+	    {
+	      output_adjust_stack_and_probe_op (2 * PROBE_INTERVAL);
+	      first_probe = false;
+	    }
+	  else
+	    output_adjust_stack_and_probe_op (PROBE_INTERVAL);
+	}
+
+      if (first_probe)
+	output_adjust_stack_and_probe_op (size + PROBE_INTERVAL);
+      else
+	output_adjust_stack_and_probe_op (size + PROBE_INTERVAL - i);
+    }
+
+  /* In the variable case, do the same as above, but in a loop.  Note that we
+     must be extra careful with variables wrapping around because we might be
+     at the very top (or the very bottom) of the address space and we have to
+     be able to handle this case properly; in particular, we use an equality
+     test for the loop condition.  */
+  else
+    {
+      /* Step 1: round SIZE to the previous multiple of the interval.  */
+      rounded_size = size & -PROBE_INTERVAL;
+
+
+      /* Step 2: compute initial and final value of the loop counter.  */
+
+      /* SP = SP_0 + PROBE_INTERVAL.  */
+      fprintf (asm_out_file, "\tsub\t$%d, ", PROBE_INTERVAL);
+      print_reg (stack_pointer_rtx, 0, asm_out_file);
+
+      /* LAST_ADDR = SP_0 + PROBE_INTERVAL + ROUNDED_SIZE.  */
+      fprintf (asm_out_file, "\n\tmov\t$-"HOST_WIDE_INT_PRINT_DEC", ",
+	       rounded_size);
+      print_reg (reg, 0, asm_out_file);
+      fputs ("\n\tadd\t", asm_out_file);
+      print_reg (stack_pointer_rtx, 0, asm_out_file);
+      fputs (", ", asm_out_file);
+      print_reg (reg, 0, asm_out_file);
+      fputc ('\n', asm_out_file);
+
+
+      /* Step 3: the loop
+
+	  while (SP != LAST_ADDR)
+	    {
+	      SP = SP + PROBE_INTERVAL
+	      probe at SP
+	    }
+
+	 adjusts SP and probes to PROBE_INTERVAL + N * PROBE_INTERVAL for
+	 values of N from 1 until it is equal to ROUNDED_SIZE.  */
+
+      ASM_GENERATE_INTERNAL_LABEL (loop_lab, "LPSRL", labelno);
+      ASM_OUTPUT_LABEL (asm_out_file, loop_lab);
+
+      /* Jump to END_LAB if SP == LAST_ADDR.  */
+      fputs ("\tcmp\t", asm_out_file);
+      print_reg (stack_pointer_rtx, 0, asm_out_file);
+      fputs (", ", asm_out_file);
+      print_reg (reg, 0, asm_out_file);
+      fputc ('\n', asm_out_file);
+      ASM_GENERATE_INTERNAL_LABEL (end_lab, "LPSRE", labelno++);
+      fputs ("\tje\t", asm_out_file); assemble_name (asm_out_file, end_lab);
+      fputc ('\n', asm_out_file);
+
+      /* SP = SP + PROBE_INTERVAL and probe at SP.  */
+      output_adjust_stack_and_probe_op (PROBE_INTERVAL);
+
+      fprintf (asm_out_file, "\tjmp\t"); assemble_name (asm_out_file, loop_lab);
+      fputc ('\n', asm_out_file);
+
+      ASM_OUTPUT_LABEL (asm_out_file, end_lab);
+
+
+      /* Step 4: adjust SP and probe to PROBE_INTERVAL + SIZE if we cannot
+	 assert at compile-time that SIZE is equal to ROUNDED_SIZE.  */
+      if (size != rounded_size)
+	output_adjust_stack_and_probe_op (size - rounded_size);
+    }
+
+  /* Adjust back to account for the additional first interval.  */
+  fprintf (asm_out_file, "\tadd\t$%d, ", PROBE_INTERVAL);
+  print_reg (stack_pointer_rtx, 0, asm_out_file);
+  fputc ('\n', asm_out_file);
+
+  return "";
+}
+
+/* Wrapper around gen_adjust_stack_and_probe.  */
+
+static rtx
+ix86_gen_adjust_stack_and_probe (rtx op0, rtx op1)
+{
+  if (TARGET_64BIT)
+    return gen_adjust_stack_and_probedi (op0, op1);
+  else
+    return gen_adjust_stack_and_probesi (op0, op1);
+}
+
+/* Emit code to adjust the stack by SIZE bytes while probing it.  */
+
+static void
+ix86_adjust_stack_and_probe (HOST_WIDE_INT size)
+{
+  rtx size_rtx = GEN_INT (size);
+
+  if (SMALL_INTERVAL (size))
+    emit_insn (ix86_gen_adjust_stack_and_probe (size_rtx, const0_rtx));
+  else
+    {
+      struct scratch_reg sr;
+      get_scratch_register_on_entry (&sr);
+      emit_insn (ix86_gen_adjust_stack_and_probe (size_rtx, sr.reg));
+      release_scratch_register_on_entry (&sr);
+    }
+
+  gcc_assert (ix86_cfa_state->reg != stack_pointer_rtx);
+
+  /* Make sure nothing is scheduled before we are done.  */
+  emit_insn (gen_blockage ());
+}
+
+/* Output one probe at OFFSET + INDEX from the current stack pointer.  */
+
+static void
+output_probe_stack_range_op (HOST_WIDE_INT offset, rtx index)
+{
+  output_probe_op ();
+  if (offset)
+    fprintf (asm_out_file, "-"HOST_WIDE_INT_PRINT_DEC, offset);
+  fputc ('(', asm_out_file);
+  print_reg (stack_pointer_rtx, 0, asm_out_file);
+  if (index)
+    {
+      fputc (',', asm_out_file);
+      print_reg (index, 0, asm_out_file);
+      fputs (",1", asm_out_file);
+    }
+  fputs (")\n", asm_out_file);
+}
+
+/* Probe a range of stack addresses from FIRST to FIRST+SIZE, inclusive.
+   These are offsets from the current stack pointer.  */
+
+const char *
+output_probe_stack_range (rtx first_rtx, rtx size_rtx, rtx reg)
+{
+  static int labelno = 0;
+  HOST_WIDE_INT first = INTVAL (first_rtx);
+  HOST_WIDE_INT size = INTVAL (size_rtx);
+  HOST_WIDE_INT rounded_size;
+  char loop_lab[32], end_lab[32];
+
+  /* See if we have a constant small number of probes to generate.  If so,
+     that's the easy case.  */
+  if (SMALL_INTERVAL (size))
+    {
+      HOST_WIDE_INT i;
+
+      /* Probe at FIRST + N * PROBE_INTERVAL for values of N from 1 until
+	 it exceeds SIZE.  If only one probe is needed, this will not
+	 generate any code.  Then probe at FIRST + SIZE.  */
+      for (i = PROBE_INTERVAL; i < size; i += PROBE_INTERVAL)
+	output_probe_stack_range_op (first + i, NULL_RTX);
+
+      output_probe_stack_range_op (first + size, NULL_RTX);
+    }
+
+  /* Otherwise, do the same as above, but in a loop.  Note that we must be
+     extra careful with variables wrapping around because we might be at
+     the very top (or the very bottom) of the address space and we have
+     to be able to handle this case properly; in particular, we use an
+     equality test for the loop condition.  */
+  else
+    {
+      /* Step 1: round SIZE to the previous multiple of the interval.  */
+      rounded_size = size & -PROBE_INTERVAL;
+
+
+      /* Step 2: compute initial and final value of the loop counter.  */
+
+      /* TEST_OFFSET = FIRST.  */
+      fprintf (asm_out_file, "\tmov\t$-"HOST_WIDE_INT_PRINT_DEC", ", first);
+      print_reg (reg, 0, asm_out_file);
+      fputc ('\n', asm_out_file);
+
+      /* LAST_OFFSET = FIRST + ROUNDED_SIZE.  */
+
+
+      /* Step 3: the loop
+
+	 while (TEST_ADDR != LAST_ADDR)
+	   {
+	     TEST_ADDR = TEST_ADDR + PROBE_INTERVAL
+	     probe at TEST_ADDR
+	   }
+
+	 probes at FIRST + N * PROBE_INTERVAL for values of N from 1
+	 until it is equal to ROUNDED_SIZE.  */
+
+      ASM_GENERATE_INTERNAL_LABEL (loop_lab, "LPSRL", labelno);
+      ASM_OUTPUT_LABEL (asm_out_file, loop_lab);
+
+       /* Jump to END_LAB if TEST_ADDR == LAST_ADDR.  */
+      fprintf (asm_out_file, "\tcmp\t$-"HOST_WIDE_INT_PRINT_DEC", ",
+	       first + rounded_size);
+      print_reg (reg, 0, asm_out_file);
+      fputc ('\n', asm_out_file);
+      ASM_GENERATE_INTERNAL_LABEL (end_lab, "LPSRE", labelno++);
+      fputs ("\tje\t", asm_out_file); assemble_name (asm_out_file, end_lab);
+      fputc ('\n', asm_out_file);
+
+      /* TEST_ADDR = TEST_ADDR + PROBE_INTERVAL.  */
+      fprintf (asm_out_file, "\tsub\t$%d, ", PROBE_INTERVAL);
+      print_reg (reg, 0, asm_out_file);
+      fputc ('\n', asm_out_file);
+
+      /* Probe at TEST_ADDR.  */
+      output_probe_stack_range_op (0, reg);
+
+      fprintf (asm_out_file, "\tjmp\t"); assemble_name (asm_out_file, loop_lab);
+      fputc ('\n', asm_out_file);
+
+      ASM_OUTPUT_LABEL (asm_out_file, end_lab);
+
+
+      /* Step 4: probe at FIRST + SIZE if we cannot assert at compile-time
+	 that SIZE is equal to ROUNDED_SIZE.  */
+      if (size != rounded_size)
+	output_probe_stack_range_op (size - rounded_size, reg);
+    }
+
+  return "";
+}
+
+/* Wrapper around gen_probe_stack_range.  */
+
+static rtx
+ix86_gen_probe_stack_range (rtx op0, rtx op1, rtx op2)
+{
+  if (TARGET_64BIT)
+    return gen_probe_stack_rangedi (op0, op1, op2);
+  else
+    return gen_probe_stack_rangesi (op0, op1, op2);
+}
+
+/* Emit code to probe a range of stack addresses from FIRST to FIRST+SIZE,
+   inclusive.  These are offsets from the current stack pointer.  */
+
+static void
+ix86_emit_probe_stack_range (HOST_WIDE_INT first, HOST_WIDE_INT size)
+{
+  if (SMALL_INTERVAL (size))
+     emit_insn (ix86_gen_probe_stack_range (GEN_INT (first), GEN_INT (size),
+					    const0_rtx));
+  else
+    {
+      struct scratch_reg sr;
+      get_scratch_register_on_entry (&sr);
+      emit_insn (ix86_gen_probe_stack_range (GEN_INT (first), GEN_INT (size),
+					     sr.reg));
+      release_scratch_register_on_entry (&sr);
+    }
+
+  /* Make sure nothing is scheduled before we are done.  */
+  emit_insn (gen_blockage ());
+}
+
 /* Finalize stack_realign_needed flag, which will guide prologue/epilogue
    to be generated in correct form.  */
 static void 
@@ -8469,6 +8876,31 @@ ix86_expand_prologue (void)
   else
     allocate += frame.nregs * UNITS_PER_WORD;
 
+  /* The stack has already been decremented by the instruction calling us
+     so we need to probe unconditionally to preserve the protection area.  */
+  if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK)
+    {
+      /* We expect the registers to be saved when probes are used.  */
+      gcc_assert (!frame.save_regs_using_mov);
+
+      if (STACK_CHECK_MOVING_SP)
+	{
+	  ix86_adjust_stack_and_probe (allocate);
+	  allocate = 0;
+	}
+      else
+	{
+	  const HOST_WIDE_INT max_size = 0x7fffffff - STACK_CHECK_PROTECT;
+	  HOST_WIDE_INT size = allocate;
+
+	  /* Don't bother probing more than 2 GB; this is easier.  */
+	  if (size > max_size)
+	    size = max_size;
+
+	  ix86_emit_probe_stack_range (STACK_CHECK_PROTECT, size);
+	}
+    }
+
   /* When using red zone we may start register saving before allocating
      the stack frame saving one cycle of the prologue. However I will
      avoid doing this if I am going to have to probe the stack since

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Patch] New -fstack-check implementation (2/n)
  2009-10-29 17:18     ` Eric Botcazou
@ 2009-10-31  3:07       ` Ian Lance Taylor
  2009-11-04 21:47         ` Eric Botcazou
                           ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Ian Lance Taylor @ 2009-10-31  3:07 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches

Eric Botcazou <ebotcazou@adacore.com> writes:

> 2009-10-29  Eric Botcazou  <ebotcazou@adacore.com>
>
>         PR target/10127
>         PR ada/20548
>         * expr.h (STACK_CHECK_PROBE_INTERVAL): Delete.
>         (STACK_CHECK_PROBE_INTERVAL_EXP): New macro.
>         (STACK_CHECK_MOVING_SP): Likewise.
>         * system.h (STACK_CHECK_PROBE_INTERVAL): Poison it.
>         * doc/tm.texi (Stack Checking): Delete STACK_CHECK_PROBE_INTERVAL.
>         Document STACK_CHECK_PROBE_INTERVAL_EXP and STACK_CHECK_MOVING_SP.
>         * explow.c (anti_adjust_stack_and_probe): New function.
>         (allocate_dynamic_stack_space): Do not directly allocate space if
>         STACK_CHECK_MOVING_SP, instead invoke above function.
>         (emit_stack_probe): Handle probe_stack insn.
>         (PROBE_INTERVAL): New macro.
>         (STACK_GROW_OPTAB): Likewise.
>         (STACK_HIGH, STACK_LOW): Likewise.
>         (probe_stack_range): Remove support code for dedicated pattern.  Fix
>         loop condition in the small constant case.  Rewrite in the general
>         case to be immune to wraparounds.  Make sure the address of probes
>         is valid.  Try to use [base + disp] addressing mode if possible.
>         * ira.c (setup_eliminable_regset): Set frame_pointer_needed if stack
>         checking is enabled and STACK_CHECK_MOVING_SP.
>         * rtlanal.c (may_trap_p_1) <MEM>: If stack checking is enabled,
>         return 1 for volatile references to the stack pointer.
>         * tree.c (build_common_builtin_nodes): Do not set ECF_NOTHROW on
>         __builtin_alloca if stack checking is enabled.
>         * unwind-dw2.c (uw_identify_context): Take into account whether the
>         context is that of a signal frame or not.
>         * config/i386/linux-unwind.h (x86_frob_update_context): New function.
>         (MD_FROB_UPDATE_CONTEXT): Define.
>         * config/i386/linux.h (STACK_CHECK_STATIC_BUILTIN): Likewise.
>         (STACK_CHECK_MOVING_SP): Likewise.
>         * config/i386/linux64.h (STACK_CHECK_STATIC_BUILTIN): Likewise.
>         (STACK_CHECK_MOVING_SP): Likewise.
>         * config/i386/sol2.h (STACK_CHECK_STATIC_BUILTIN): Likewise.
>         * config/i386/i386.c (ix86_compute_frame_layout): Force use of push
>         instructions to save registers if stack checking with probes is on.
>         (get_scratch_register_on_entry): New function.
>         (release_scratch_register_on_entry): Likewise.
>         (output_probe_op): Likewise.
>         (output_adjust_stack_and_probe_op): Likewise.
>         (output_adjust_stack_and_probe): Likewise.
>         (ix86_gen_adjust_stack_and_probe): Likewise.
>         (ix86_adjust_stack_and_probe): Likewise.
>         (output_probe_stack_range_op): Likewise.
>         (ix86_gen_probe_stack_range): Likewise
>         (ix86_emit_probe_stack_range): Likewise.
>         (ix86_expand_prologue): Emit stack checking code if static builtin
>         stack checking is enabled.
>         * config/i386/i386-protos.h (output_adjust_stack_and_probe): Declare.
>         (output_probe_stack_range): Likewise.
>         * config/i386/i386.md (UNSPECV_STACK_PROBE_INLINE): New constant.
>         (probe_stack): New expander.
>         (adjust_stack_and_probe): New insn.
>         (probe_stack_range): Likewise.
>         (logical operation peepholes): Do not split stack checking probes.


The middle-end parts are OK.  This is OK if it is OK with the x86
backend maintainers.

Thanks.

Ian

* Re: [Patch] New -fstack-check implementation (2/n)
  2009-10-31  3:07       ` Ian Lance Taylor
@ 2009-11-04 21:47         ` Eric Botcazou
  2009-11-10 21:11         ` Eric Botcazou
  2009-12-13 23:25         ` Eric Botcazou
  2 siblings, 0 replies; 11+ messages in thread
From: Eric Botcazou @ 2009-11-04 21:47 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1204 bytes --]

> The middle-end parts are OK.  This is OK if it is OK with the x86
> backend maintainers.

Thanks a lot.  Yesterday I installed the middle-end parts as well as the 
couple of testcases.  However, I overlooked that, on x86 and x86-64 Linux, 
the Ada runtime does some pattern matching on the form of probes to make sure 
that the stack pointer points to a valid page by the time execution resumes 
after the stack overflow exception is propagated.

So I've just installed as obvious the small tweak to the x86 back-end that 
changes the form of probes to IOR insns; they are shorter than the current 
probes and have been used in the Windows probing code since the beginning.

I'll resubmit the rest of the patch to the x86 back-end for approval.

Tested on i586-suse-linux and x86_64-suse-linux,


2009-11-04  Eric Botcazou  <ebotcazou@adacore.com>

        PR target/10127
        PR ada/20548
	* config/i386/i386.md (probe_stack_range): New expander.
	(logical operation peepholes): Do not split stack checking probes.


2009-11-04  Eric Botcazou  <ebotcazou@adacore.com>

	* ada/acats/norun.lst: Remove the stack checking tests.
	* ada/acats/run_acats: Limit the stack to 8MB.


-- 
Eric Botcazou

[-- Attachment #2: p.diff --]
[-- Type: text/x-diff, Size: 2686 bytes --]

Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md	(revision 153881)
+++ config/i386/i386.md	(working copy)
@@ -19985,6 +19985,18 @@ (define_expand "allocate_stack"
   DONE;
 })
 
+;; Use IOR for stack probes, this is shorter.
+(define_expand "probe_stack"
+  [(match_operand 0 "memory_operand" "")]
+  ""
+{
+  if (GET_MODE (operands[0]) == DImode)
+    emit_insn (gen_iordi3 (operands[0], operands[0], const0_rtx));
+  else
+    emit_insn (gen_iorsi3 (operands[0], operands[0], const0_rtx));
+  DONE;
+})
+
 (define_expand "builtin_setjmp_receiver"
   [(label_ref (match_operand 0 "" ""))]
   "!TARGET_64BIT && flag_pic"
@@ -20488,7 +20500,9 @@ (define_peephole2
                      [(match_dup 0)
                       (match_operand:SI 1 "nonmemory_operand" "")]))
               (clobber (reg:CC FLAGS_REG))])]
-  "optimize_insn_for_speed_p () && ! TARGET_READ_MODIFY_WRITE"
+  "optimize_insn_for_speed_p () && ! TARGET_READ_MODIFY_WRITE
+   /* Do not split stack checking probes.  */
+   && GET_CODE (operands[3]) != IOR && operands[1] != const0_rtx"
   [(set (match_dup 2) (match_dup 0))
    (parallel [(set (match_dup 2)
                    (match_op_dup 3 [(match_dup 2) (match_dup 1)]))
@@ -20503,7 +20517,9 @@ (define_peephole2
                      [(match_operand:SI 1 "nonmemory_operand" "")
                       (match_dup 0)]))
               (clobber (reg:CC FLAGS_REG))])]
-  "optimize_insn_for_speed_p () && ! TARGET_READ_MODIFY_WRITE"
+  "optimize_insn_for_speed_p () && ! TARGET_READ_MODIFY_WRITE
+   /* Do not split stack checking probes.  */
+   && GET_CODE (operands[3]) != IOR && operands[1] != const0_rtx"
   [(set (match_dup 2) (match_dup 0))
    (parallel [(set (match_dup 2)
                    (match_op_dup 3 [(match_dup 1) (match_dup 2)]))
Index: testsuite/ada/acats/norun.lst
===================================================================
--- testsuite/ada/acats/norun.lst	(revision 153881)
+++ testsuite/ada/acats/norun.lst	(working copy)
@@ -1,10 +1,2 @@
-c52103x
-c52104x
-c52104y
-cb1010a
-cb1010c
-cb1010d
 templat
 # Tests must be sorted in alphabetical order
-# c52103x, c52104x, c52104y: -fstack-check doesn't work, PR middle-end/20548
-# cb1010a, cb1010c, cb1010d: likewise
Index: testsuite/ada/acats/run_acats
===================================================================
--- testsuite/ada/acats/run_acats	(revision 153881)
+++ testsuite/ada/acats/run_acats	(working copy)
@@ -52,4 +52,7 @@ echo exec gnatmake '"$@"' >> host_gnatma
 
 chmod +x host_gnatmake
 
+# Limit the stack to 8MB for stack checking
+ulimit -s 8192
+
 exec $testdir/run_all.sh ${1+"$@"}

* Re: [Patch] New -fstack-check implementation (2/n)
  2009-10-31  3:07       ` Ian Lance Taylor
  2009-11-04 21:47         ` Eric Botcazou
@ 2009-11-10 21:11         ` Eric Botcazou
  2009-11-11  8:17           ` H.J. Lu
  2009-12-13 23:25         ` Eric Botcazou
  2 siblings, 1 reply; 11+ messages in thread
From: Eric Botcazou @ 2009-11-10 21:11 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 920 bytes --]

> The middle-end parts are OK.  This is OK if it is OK with the x86
> backend maintainers.

Another glitch: installing the middle-end part without the x86 part exposed a 
bug in the former, whereby we still emit probes the regular way on entry to 
the function even though STACK_CHECK_MOVING_SP is defined.

Fixed thusly, tested on i586-suse-linux and x86_64-suse-linux, applied as 
obvious on the mainline.  With the other patch you approved today, all stack 
checking tests now pass at -O2 on both platforms.


2009-11-10  Eric Botcazou  <ebotcazou@adacore.com>

        PR target/10127
        PR ada/20548
	* expr.h (anti_adjust_stack_and_probe): Declare.
	* explow.c (anti_adjust_stack_and_probe): Make global, add ADJUST_BACK
	parameter and rewrite head comment.
	(allocate_dynamic_stack_space): Adjust call to above function.
	* function.c (expand_function_end): Handle STACK_CHECK_MOVING_SP.


-- 
Eric Botcazou

[-- Attachment #2: p.diff --]
[-- Type: text/x-diff, Size: 3386 bytes --]

Index: expr.h
===================================================================
--- expr.h	(revision 154059)
+++ expr.h	(working copy)
@@ -767,6 +767,9 @@ extern void adjust_stack (rtx);
 /* Add some bytes to the stack.  An rtx says how many.  */
 extern void anti_adjust_stack (rtx);
 
+/* Add some bytes to the stack while probing it.  An rtx says how many.  */
+extern void anti_adjust_stack_and_probe (rtx, bool);
+
 /* This enum is used for the following two functions.  */
 enum save_level {SAVE_BLOCK, SAVE_FUNCTION, SAVE_NONLOCAL};
 
Index: function.c
===================================================================
--- function.c	(revision 154059)
+++ function.c	(working copy)
@@ -4642,9 +4642,12 @@ expand_function_end (void)
       for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
 	if (CALL_P (insn))
 	  {
+	    rtx max_frame_size = GEN_INT (STACK_CHECK_MAX_FRAME_SIZE);
 	    start_sequence ();
-	    probe_stack_range (STACK_OLD_CHECK_PROTECT,
-			       GEN_INT (STACK_CHECK_MAX_FRAME_SIZE));
+	    if (STACK_CHECK_MOVING_SP)
+	      anti_adjust_stack_and_probe (max_frame_size, true);
+	    else
+	      probe_stack_range (STACK_OLD_CHECK_PROTECT, max_frame_size);
 	    seq = get_insns ();
 	    end_sequence ();
 	    emit_insn_before (seq, stack_check_probe_note);
Index: explow.c
===================================================================
--- explow.c	(revision 154061)
+++ explow.c	(working copy)
@@ -43,7 +43,6 @@ along with GCC; see the file COPYING3.  
 
 static rtx break_out_memory_refs (rtx);
 static void emit_stack_probe (rtx);
-static void anti_adjust_stack_and_probe (rtx);
 
 
 /* Truncate and perhaps sign-extend C as appropriate for MODE.  */
@@ -1308,7 +1307,7 @@ allocate_dynamic_stack_space (rtx size, 
 	}
 
       if (flag_stack_check && STACK_CHECK_MOVING_SP)
-	anti_adjust_stack_and_probe (size);
+	anti_adjust_stack_and_probe (size, false);
       else
 	anti_adjust_stack (size);
 
@@ -1545,13 +1544,17 @@ probe_stack_range (HOST_WIDE_INT first, 
     }
 }
 
-/* Adjust the stack by SIZE bytes while probing it.  Note that we skip the
-   probe for the first interval + a small dope of 4 words and instead probe
-   that many bytes past the specified size to maintain a protection area.  */
+/* Adjust the stack pointer by minus SIZE (an rtx for a number of bytes)
+   while probing it.  This pushes when SIZE is positive.  SIZE need not
+   be constant.  If ADJUST_BACK is true, adjust back the stack pointer
+   by plus SIZE at the end.  */
 
-static void
-anti_adjust_stack_and_probe (rtx size)
+void
+anti_adjust_stack_and_probe (rtx size, bool adjust_back)
 {
+  /* We skip the probe for the first interval + a small dope of 4 words and
+     probe that many bytes past the specified size to maintain a protection
+     area at the bottom of the stack.  */
   const int dope = 4 * UNITS_PER_WORD;
 
   /* First ensure SIZE is Pmode.  */
@@ -1660,8 +1663,11 @@ anti_adjust_stack_and_probe (rtx size)
 	}
     }
 
-  /* Adjust back to account for the additional first interval.  */
-  adjust_stack (GEN_INT (PROBE_INTERVAL + dope));
+  /* Adjust back and account for the additional first interval.  */
+  if (adjust_back)
+    adjust_stack (plus_constant (size, PROBE_INTERVAL + dope));
+  else
+    adjust_stack (GEN_INT (PROBE_INTERVAL + dope));
 }
 
 /* Return an rtx representing the register or memory location

* Re: [Patch] New -fstack-check implementation (2/n)
  2009-11-10 21:11         ` Eric Botcazou
@ 2009-11-11  8:17           ` H.J. Lu
  2009-11-11 13:47             ` Eric Botcazou
  0 siblings, 1 reply; 11+ messages in thread
From: H.J. Lu @ 2009-11-11  8:17 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Ian Lance Taylor, gcc-patches

On Tue, Nov 10, 2009 at 12:48 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> The middle-end parts are OK.  This is OK if it is OK with the x86
>> backend maintainers.
>
> Another glitch: installing the middle-end part without the x86 part exposed a
> bug in the former, whereby we still emit probes the regular way on entry to
> the function even though STACK_CHECK_MOVING_SP is defined.
>
> Fixed thusly, tested on i586-suse-linux and x86_64-suse-linux, applied as
> obvious on the mainline.  With the other patch you approved today, all stack
> checking tests now pass at -O2 on both platforms.
>
>
> 2009-11-10  Eric Botcazou  <ebotcazou@adacore.com>
>
>        PR target/10127
>        PR ada/20548
>        * expr.h (anti_adjust_stack_and_probe): Declare.
>        * explow.c (anti_adjust_stack_and_probe): Make global, add ADJUST_BACK
>        parameter and rewrite head comment.
>        (allocate_dynamic_stack_space): Adjust call to above function.
>        * function.c (expand_function_end): Handle STACK_CHECK_MOVING_SP.
>
>

This patch caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42004


-- 
H.J.

* Re: [Patch] New -fstack-check implementation (2/n)
  2009-11-11  8:17           ` H.J. Lu
@ 2009-11-11 13:47             ` Eric Botcazou
  0 siblings, 0 replies; 11+ messages in thread
From: Eric Botcazou @ 2009-11-11 13:47 UTC (permalink / raw)
  To: H.J. Lu; +Cc: gcc-patches, Ian Lance Taylor

> This patch caused:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42004

No, it didn't; it's the other one.  Will follow up there.

-- 
Eric Botcazou

* Re: [Patch] New -fstack-check implementation (2/n)
  2009-10-31  3:07       ` Ian Lance Taylor
  2009-11-04 21:47         ` Eric Botcazou
  2009-11-10 21:11         ` Eric Botcazou
@ 2009-12-13 23:25         ` Eric Botcazou
  2 siblings, 0 replies; 11+ messages in thread
From: Eric Botcazou @ 2009-12-13 23:25 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 659 bytes --]

> The middle-end parts are OK.  This is OK if it is OK with the x86
> backend maintainers.

It just occurred to me that, with the addition of the probe_stack insn, the 
STACK_CHECK_PROBE_LOAD macro has become superfluous, since you can now define 
your own stack probing insn.  Moreover, no port defines it.  So I've installed 
as obvious the attached patch that gets rid of it.


2009-12-13  Eric Botcazou  <ebotcazou@adacore.com>

	* doc/tm.texi (STACK_CHECK_PROBE_LOAD): Delete.
	* expr.h (STACK_CHECK_PROBE_LOAD): Likewise.
	* explow.c (emit_stack_probe): Do not test STACK_CHECK_PROBE_LOAD.
	* system.h (STACK_CHECK_PROBE_LOAD): Poison.


-- 
Eric Botcazou

[-- Attachment #2: p.diff --]
[-- Type: text/x-diff, Size: 2664 bytes --]

Index: doc/tm.texi
===================================================================
--- doc/tm.texi	(revision 155123)
+++ doc/tm.texi	(working copy)
@@ -3564,12 +3564,6 @@ the ``guard pages'' at the end of a stac
 of 12 (4096-byte interval) is suitable for most systems.
 @end defmac
 
-@defmac STACK_CHECK_PROBE_LOAD
-An integer which is nonzero if GCC should perform the stack probe
-as a load instruction and zero if GCC should use a store instruction.
-The default is zero, which is the most efficient choice on most systems.
-@end defmac
-
 @defmac STACK_CHECK_MOVING_SP
 An integer which is nonzero if GCC should move the stack pointer page by page
 when doing probes.  This can be necessary on systems where the stack pointer
Index: expr.h
===================================================================
--- expr.h	(revision 155123)
+++ expr.h	(working copy)
@@ -223,11 +223,6 @@ do {								\
 #define STACK_CHECK_PROBE_INTERVAL_EXP 12
 #endif
 
-/* The default is to do a store into the stack.  */
-#ifndef STACK_CHECK_PROBE_LOAD
-#define STACK_CHECK_PROBE_LOAD 0
-#endif
-
 /* The default is not to move the stack pointer.  */
 #ifndef STACK_CHECK_MOVING_SP
 #define STACK_CHECK_MOVING_SP 0
Index: explow.c
===================================================================
--- explow.c	(revision 155123)
+++ explow.c	(working copy)
@@ -1366,9 +1366,6 @@ emit_stack_probe (rtx address)
     emit_insn (gen_probe_stack (memref));
   else
 #endif
-  if (STACK_CHECK_PROBE_LOAD)
-    emit_move_insn (gen_reg_rtx (word_mode), memref);
-  else
     emit_move_insn (memref, const0_rtx);
 }
 
Index: system.h
===================================================================
--- system.h	(revision 155123)
+++ system.h	(working copy)
@@ -756,12 +756,12 @@ extern void fancy_abort (const char *, i
         TARGET_ESC TARGET_FF TARGET_NEWLINE TARGET_TAB TARGET_VT	   \
         LINK_LIBGCC_SPECIAL DONT_ACCESS_GBLS_AFTER_EPILOGUE		   \
 	TARGET_OPTIONS TARGET_SWITCHES EXTRA_CC_MODES FINALIZE_PIC	   \
-	PREDICATE_CODES SPECIAL_MODE_PREDICATES				   \
+	PREDICATE_CODES SPECIAL_MODE_PREDICATES	UNALIGNED_WORD_ASM_OP	   \
 	EXTRA_SECTIONS EXTRA_SECTION_FUNCTIONS READONLY_DATA_SECTION	   \
 	TARGET_ASM_EXCEPTION_SECTION TARGET_ASM_EH_FRAME_SECTION	   \
 	SMALL_ARG_MAX ASM_OUTPUT_SHARED_BSS ASM_OUTPUT_SHARED_COMMON	   \
-	ASM_OUTPUT_SHARED_LOCAL UNALIGNED_WORD_ASM_OP			   \
-	ASM_MAKE_LABEL_LINKONCE STACK_CHECK_PROBE_INTERVAL
+	ASM_OUTPUT_SHARED_LOCAL ASM_MAKE_LABEL_LINKONCE			   \
+	STACK_CHECK_PROBE_INTERVAL STACK_CHECK_PROBE_LOAD
 
 /* Hooks that are no longer used.  */
  #pragma GCC poison LANG_HOOKS_FUNCTION_MARK LANG_HOOKS_FUNCTION_FREE	\
