public inbox for gcc-patches@gcc.gnu.org
From: Richard Earnshaw <Richard.Earnshaw@foss.arm.com>
To: Eric Botcazou <ebotcazou@adacore.com>
Cc: gcc-patches@gcc.gnu.org,
	Ramana Radhakrishnan <ramana.radhakrishnan@foss.arm.com>
Subject: Re: [ARM] Fix PR middle-end/65958
Date: Fri, 04 Dec 2015 13:49:00 -0000
Message-ID: <566199F8.1010403@foss.arm.com>
In-Reply-To: <1866500.xH01EXcb36@polaris>

> +	(unspec_volatile:PTR [(match_operand:PTR 1 "register_operand" "0")
> +			      (match_operand:PTR 2 "register_operand" "r")]
> +			       UNSPEC_PROBE_STACK_RANGE))]

Minor nit.  Since this is used in an unspec_volatile, the name should start
with UNSPECV_ and be defined in the unspecv enum.

Otherwise OK once testing is complete.

R.


On 03/12/15 12:17, Eric Botcazou wrote:
>> I can understand this restriction, but...
>>
>>> +  /* See the same assertion on PROBE_INTERVAL above.  */
>>> +  gcc_assert ((first % 4096) == 0);
>>
>> ... why isn't this a test that FIRST is aligned to PROBE_INTERVAL?
> 
> Because that isn't guaranteed: FIRST is related to the size of the protection
> area, while PROBE_INTERVAL is related to the page size.
> 
>> blank line between declarations and code. Also, can we come up with a
>> suitable define for 4096 here that expresses the context and then use
>> that consistently through the remainder of this function?
> 
> OK, let's use ARITH_BASE.
> 
>>> +(define_insn "probe_stack_range"
>>> +  [(set (match_operand:DI 0 "register_operand" "=r")
>>> +	(unspec_volatile:DI [(match_operand:DI 1 "register_operand" "0")
>>> +			     (match_operand:DI 2 "register_operand" "r")]
>>> +			     UNSPEC_PROBE_STACK_RANGE))]
>>
>> I think this should really use PTRmode, so that it's ILP32 ready (I'm
>> not going to ask you to make sure that works though, since I suspect
>> there are still other issues to resolve with ILP32 at this time).
> 
> Done.  Manually tested for now, I'll fully test it if approved.
> 
> 
>         PR middle-end/65958
>         * config/aarch64/aarch64-protos.h (aarch64_output_probe_stack_range):
>         Declare.
>         * config/aarch64/aarch64.md: Declare UNSPECV_BLOCKAGE and
>         UNSPEC_PROBE_STACK_RANGE.
>         (blockage): New instruction.
>         (probe_stack_range_<PTR:mode>): Likewise.
>         * config/aarch64/aarch64.c (aarch64_emit_probe_stack_range): New
>         function.
>         (aarch64_output_probe_stack_range): Likewise.
>         (aarch64_expand_prologue): Invoke aarch64_emit_probe_stack_range if
>         static builtin stack checking is enabled.
>         * config/aarch64/aarch64-linux.h (STACK_CHECK_STATIC_BUILTIN):
>         Define.
> 
> 
> pr65958-2c.diff
> 
> 
> Index: config/aarch64/aarch64-linux.h
> ===================================================================
> --- config/aarch64/aarch64-linux.h	(revision 231206)
> +++ config/aarch64/aarch64-linux.h	(working copy)
> @@ -88,4 +88,7 @@
>  #undef TARGET_BINDS_LOCAL_P
>  #define TARGET_BINDS_LOCAL_P default_binds_local_p_2
>  
> +/* Define this to be nonzero if static stack checking is supported.  */
> +#define STACK_CHECK_STATIC_BUILTIN 1
> +
>  #endif  /* GCC_AARCH64_LINUX_H */
> Index: config/aarch64/aarch64-protos.h
> ===================================================================
> --- config/aarch64/aarch64-protos.h	(revision 231206)
> +++ config/aarch64/aarch64-protos.h	(working copy)
> @@ -340,6 +340,7 @@ void aarch64_asm_output_labelref (FILE *
>  void aarch64_cpu_cpp_builtins (cpp_reader *);
>  void aarch64_elf_asm_named_section (const char *, unsigned, tree);
>  const char * aarch64_gen_far_branch (rtx *, int, const char *, const char *);
> +const char * aarch64_output_probe_stack_range (rtx, rtx);
>  void aarch64_err_no_fpadvsimd (machine_mode, const char *);
>  void aarch64_expand_epilogue (bool);
>  void aarch64_expand_mov_immediate (rtx, rtx);
> Index: config/aarch64/aarch64.c
> ===================================================================
> --- config/aarch64/aarch64.c	(revision 231206)
> +++ config/aarch64/aarch64.c	(working copy)
> @@ -62,6 +62,7 @@
>  #include "sched-int.h"
>  #include "cortex-a57-fma-steering.h"
>  #include "target-globals.h"
> +#include "common/common-target.h"
>  
>  /* This file should be included last.  */
>  #include "target-def.h"
> @@ -2183,6 +2184,179 @@ aarch64_libgcc_cmp_return_mode (void)
>    return SImode;
>  }
>  
> +#define PROBE_INTERVAL (1 << STACK_CHECK_PROBE_INTERVAL_EXP)
> +
> +/* We use the 12-bit shifted immediate arithmetic instructions so values
> +   must be a multiple of (1 << 12), i.e. 4096.  */
> +#define ARITH_BASE 4096
> +
> +#if (PROBE_INTERVAL % ARITH_BASE) != 0
> +#error Cannot use simple address calculation for stack probing
> +#endif
> +
> +/* The pair of scratch registers used for stack probing.  */
> +#define PROBE_STACK_FIRST_REG  9
> +#define PROBE_STACK_SECOND_REG 10
> +
> +/* Emit code to probe a range of stack addresses from FIRST to FIRST+SIZE,
> +   inclusive.  These are offsets from the current stack pointer.  */
> +
> +static void
> +aarch64_emit_probe_stack_range (HOST_WIDE_INT first, HOST_WIDE_INT size)
> +{
> +  rtx reg1 = gen_rtx_REG (ptr_mode, PROBE_STACK_FIRST_REG);
> +
> +  /* See the same assertion on PROBE_INTERVAL above.  */
> +  gcc_assert ((first % ARITH_BASE) == 0);
> +
> +  /* See if we have a constant small number of probes to generate.  If so,
> +     that's the easy case.  */
> +  if (size <= PROBE_INTERVAL)
> +    {
> +      const HOST_WIDE_INT base = ROUND_UP (size, ARITH_BASE);
> +
> +      emit_set_insn (reg1,
> +		     plus_constant (ptr_mode,
> +				    stack_pointer_rtx, -(first + base)));
> +      emit_stack_probe (plus_constant (ptr_mode, reg1, base - size));
> +    }
> +
> +  /* The run-time loop is made up of 8 insns in the generic case while the
> +     compile-time loop is made up of 4+2*(n-2) insns for n # of intervals.  */
> +  else if (size <= 4 * PROBE_INTERVAL)
> +    {
> +      HOST_WIDE_INT i, rem;
> +
> +      emit_set_insn (reg1,
> +		     plus_constant (ptr_mode,
> +				    stack_pointer_rtx,
> +				    -(first + PROBE_INTERVAL)));
> +      emit_stack_probe (reg1);
> +
> +      /* Probe at FIRST + N * PROBE_INTERVAL for values of N from 2 until
> +	 it exceeds SIZE.  If only two probes are needed, this will not
> +	 generate any code.  Then probe at FIRST + SIZE.  */
> +      for (i = 2 * PROBE_INTERVAL; i < size; i += PROBE_INTERVAL)
> +	{
> +	  emit_set_insn (reg1,
> +			 plus_constant (ptr_mode, reg1, -PROBE_INTERVAL));
> +	  emit_stack_probe (reg1);
> +	}
> +
> +      rem = size - (i - PROBE_INTERVAL);
> +      if (rem > 256)
> +	{
> +	  const HOST_WIDE_INT base = ROUND_UP (rem, ARITH_BASE);
> +
> +	  emit_set_insn (reg1, plus_constant (ptr_mode, reg1, -base));
> +	  emit_stack_probe (plus_constant (ptr_mode, reg1, base - rem));
> +	}
> +      else
> +	emit_stack_probe (plus_constant (ptr_mode, reg1, -rem));
> +    }
> +
> +  /* Otherwise, do the same as above, but in a loop.  Note that we must be
> +     extra careful with variables wrapping around because we might be at
> +     the very top (or the very bottom) of the address space and we have
> +     to be able to handle this case properly; in particular, we use an
> +     equality test for the loop condition.  */
> +  else
> +    {
> +      rtx reg2 = gen_rtx_REG (ptr_mode, PROBE_STACK_SECOND_REG);
> +
> +      /* Step 1: round SIZE to the previous multiple of the interval.  */
> +
> +      HOST_WIDE_INT rounded_size = size & -PROBE_INTERVAL;
> +
> +
> +      /* Step 2: compute initial and final value of the loop counter.  */
> +
> +      /* TEST_ADDR = SP + FIRST.  */
> +      emit_set_insn (reg1,
> +		     plus_constant (ptr_mode, stack_pointer_rtx, -first));
> +
> +      /* LAST_ADDR = SP + FIRST + ROUNDED_SIZE.  */
> +      emit_set_insn (reg2,
> +		     plus_constant (ptr_mode, stack_pointer_rtx,
> +				    -(first + rounded_size)));
> +
> +
> +      /* Step 3: the loop
> +
> +	 do
> +	   {
> +	     TEST_ADDR = TEST_ADDR + PROBE_INTERVAL
> +	     probe at TEST_ADDR
> +	   }
> +	 while (TEST_ADDR != LAST_ADDR)
> +
> +	 probes at FIRST + N * PROBE_INTERVAL for values of N from 1
> +	 until it is equal to ROUNDED_SIZE.  */
> +
> +      if (ptr_mode == DImode)
> +	emit_insn (gen_probe_stack_range_di (reg1, reg1, reg2));
> +      else
> +	emit_insn (gen_probe_stack_range_si (reg1, reg1, reg2));
> +
> +
> +      /* Step 4: probe at FIRST + SIZE if we cannot assert at compile-time
> +	 that SIZE is equal to ROUNDED_SIZE.  */
> +
> +      if (size != rounded_size)
> +	{
> +	  HOST_WIDE_INT rem = size - rounded_size;
> +
> +	  if (rem > 256)
> +	    {
> +	      const HOST_WIDE_INT base = ROUND_UP (rem, ARITH_BASE);
> +
> +	      emit_set_insn (reg2, plus_constant (ptr_mode, reg2, -base));
> +	      emit_stack_probe (plus_constant (ptr_mode, reg2, base - rem));
> +	    }
> +	  else
> +	    emit_stack_probe (plus_constant (ptr_mode, reg2, -rem));
> +	}
> +    }
> +
> +  /* Make sure nothing is scheduled before we are done.  */
> +  emit_insn (gen_blockage ());
> +}
> +
> +/* Probe a range of stack addresses from REG1 to REG2 inclusive.  These are
> +   absolute addresses.  */
> +
> +const char *
> +aarch64_output_probe_stack_range (rtx reg1, rtx reg2)
> +{
> +  static int labelno = 0;
> +  char loop_lab[32];
> +  rtx xops[2];
> +
> +  ASM_GENERATE_INTERNAL_LABEL (loop_lab, "LPSRL", labelno++);
> +
> +  /* Loop.  */
> +  ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, loop_lab);
> +
> +  /* TEST_ADDR = TEST_ADDR + PROBE_INTERVAL.  */
> +  xops[0] = reg1;
> +  xops[1] = GEN_INT (PROBE_INTERVAL);
> +  output_asm_insn ("sub\t%0, %0, %1", xops);
> +
> +  /* Probe at TEST_ADDR.  */
> +  output_asm_insn ("str\txzr, [%0]", xops);
> +
> +  /* Test if TEST_ADDR == LAST_ADDR.  */
> +  xops[1] = reg2;
> +  output_asm_insn ("cmp\t%0, %1", xops);
> +
> +  /* Branch.  */
> +  fputs ("\tb.ne\t", asm_out_file);
> +  assemble_name_raw (asm_out_file, loop_lab);
> +  fputc ('\n', asm_out_file);
> +
> +  return "";
> +}
> +
>  static bool
>  aarch64_frame_pointer_required (void)
>  {
> @@ -2583,6 +2757,18 @@ aarch64_expand_prologue (void)
>    if (flag_stack_usage_info)
>      current_function_static_stack_size = frame_size;
>  
> +  if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK)
> +    {
> +      if (crtl->is_leaf && !cfun->calls_alloca)
> +	{
> +	  if (frame_size > PROBE_INTERVAL && frame_size > STACK_CHECK_PROTECT)
> +	    aarch64_emit_probe_stack_range (STACK_CHECK_PROTECT,
> +					    frame_size - STACK_CHECK_PROTECT);
> +	}
> +      else if (frame_size > 0)
> +	aarch64_emit_probe_stack_range (STACK_CHECK_PROTECT, frame_size);
> +    }
> +
>    /* Store pairs and load pairs have a range only -512 to 504.  */
>    if (offset >= 512)
>      {
> Index: config/aarch64/aarch64.md
> ===================================================================
> --- config/aarch64/aarch64.md	(revision 231206)
> +++ config/aarch64/aarch64.md	(working copy)
> @@ -104,6 +104,7 @@ (define_c_enum "unspec" [
>      UNSPEC_MB
>      UNSPEC_NOP
>      UNSPEC_PRLG_STK
> +    UNSPEC_PROBE_STACK_RANGE
>      UNSPEC_RBIT
>      UNSPEC_SISD_NEG
>      UNSPEC_SISD_SSHL
> @@ -137,6 +138,7 @@ (define_c_enum "unspecv" [
>      UNSPECV_SET_FPCR		; Represent assign of FPCR content.
>      UNSPECV_GET_FPSR		; Represent fetch of FPSR content.
>      UNSPECV_SET_FPSR		; Represent assign of FPSR content.
> +    UNSPECV_BLOCKAGE		; Represent a blockage
>    ]
>  )
>  
> @@ -4951,6 +4953,29 @@ (define_insn "stack_tie"
>    [(set_attr "length" "0")]
>  )
>  
> +;; UNSPEC_VOLATILE is considered to use and clobber all hard registers and
> +;; all of memory.  This blocks insns from being moved across this point.
> +
> +(define_insn "blockage"
> +  [(unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)]
> +  ""
> +  ""
> +  [(set_attr "length" "0")
> +   (set_attr "type" "block")]
> +)
> +
> +(define_insn "probe_stack_range_<PTR:mode>"
> +  [(set (match_operand:PTR 0 "register_operand" "=r")
> +	(unspec_volatile:PTR [(match_operand:PTR 1 "register_operand" "0")
> +			      (match_operand:PTR 2 "register_operand" "r")]
> +			       UNSPEC_PROBE_STACK_RANGE))]
> +  ""
> +{
> +  return aarch64_output_probe_stack_range (operands[0], operands[2]);
> +}
> +  [(set_attr "length" "32")]
> +)
> +
>  ;; Named pattern for expanding thread pointer reference.
>  (define_expand "get_thread_pointerdi"
>    [(match_operand:DI 0 "register_operand" "=r")]
> 
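
For readers following the three cases in aarch64_emit_probe_stack_range
above, here is a minimal stand-alone C sketch, not part of the patch, that
computes where the probes would land, as distances below the stack pointer.
It assumes a 4096-byte PROBE_INTERVAL (the patch derives the real value from
STACK_CHECK_PROBE_INTERVAL_EXP); probe_offsets and its parameters are
hypothetical names used only for illustration.

```c
#include <assert.h>

/* Assumed values for illustration only.  */
#define PROBE_INTERVAL 4096L
#define ARITH_BASE 4096L
#define ROUND_UP(x, a) (((x) + (a) - 1) / (a) * (a))

/* Store in OUT the distances below the stack pointer at which probes
   would land for the range FIRST .. FIRST+SIZE, mirroring the three
   cases of the patch.  Return the number of probes, or -1 if OUT
   (capacity MAX) is too small.  */
static int
probe_offsets (long first, long size, long *out, int max)
{
  int n = 0;
  long reg1;  /* Distance of the scratch register below SP.  */
  long rem;

  /* Same precondition as the gcc_assert in the patch.  */
  assert (first % ARITH_BASE == 0);
  assert (size > 0);

  if (size <= PROBE_INTERVAL)
    {
      /* Case 1: move down by a multiple of ARITH_BASE, then probe
         back up at FIRST + SIZE.  */
      long base = ROUND_UP (size, ARITH_BASE);

      if (max < 1)
        return -1;
      reg1 = first + base;
      out[n++] = reg1 - (base - size);
      return n;
    }

  if (size <= 4 * PROBE_INTERVAL)
    {
      /* Case 2: unrolled probes, one per interval.  */
      long i;

      reg1 = first + PROBE_INTERVAL;
      if (n >= max)
        return -1;
      out[n++] = reg1;
      for (i = 2 * PROBE_INTERVAL; i < size; i += PROBE_INTERVAL)
        {
          reg1 += PROBE_INTERVAL;
          if (n >= max)
            return -1;
          out[n++] = reg1;
        }
      rem = size - (i - PROBE_INTERVAL);
    }
  else
    {
      /* Case 3: run-time loop with an equality test on the final address.  */
      long rounded_size = size & -PROBE_INTERVAL;

      reg1 = first;
      do
        {
          reg1 += PROBE_INTERVAL;
          if (n >= max)
            return -1;
          out[n++] = reg1;
        }
      while (reg1 != first + rounded_size);
      rem = size - rounded_size;
    }

  /* Final probe at FIRST + SIZE.  The patch picks between two
     instruction sequences here (rem > 256 or not); both reach the
     same address, so the sketch does not distinguish them.  */
  if (rem > 0)
    {
      if (n >= max)
        return -1;
      out[n++] = reg1 + rem;
    }
  return n;
}
```

With the sample values first = 16384 and size = 10000, the sketch records
probes at 20480, 24576 and 26384 bytes below the stack pointer.  In every
case the last probe lands at first + size.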

