public inbox for gcc-patches@gcc.gnu.org
* [PATCH 03/13] Initial asan cleanups
  2012-11-01 19:53 [PATCH 00/13] Request to merge Address Sanitizer in dodji
                   ` (5 preceding siblings ...)
  2012-11-01 19:53 ` [PATCH 01/13] Initial import of asan from the Google branch dodji
@ 2012-11-01 19:53 ` dodji
  2012-11-01 19:53 ` [PATCH 11/13] Factorize condition insertion code out of build_check_stmt dodji
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 80+ messages in thread
From: dodji @ 2012-11-01 19:53 UTC
  To: gcc-patches; +Cc: dnovillo, jakub, wmi, davidxl, konstantin.s.serebryany

From: jakub <jakub@138bc75d-0d04-0410-961f-82ee72b054a4>

This patch defines a new asan_shadow_offset target hook
(TARGET_ASAN_SHADOW_OFFSET), instead of hard-coding the shadow offset
in the asan.c file.  It thus becomes cleaner to define the hook for
targets that support asan, namely x86 for now.  The ASAN_SHADOW_SHIFT
constant (which, along with the value returned by asan_shadow_offset,
is used to compute the address of the shadow memory byte for a given
memory address) is defined in asan.h.
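
For illustration only, here is a minimal standalone sketch of that
computation.  It is not code from the patch: shadow_offset and
shadow_address are hypothetical helper names, and the lp64 flag is an
assumption standing in for TARGET_LP64.  The shift and the two offsets
are the values used in asan.h and in ix86_asan_shadow_offset.

```c
#include <stdint.h>

/* Same value as the ASAN_SHADOW_SHIFT macro this patch adds to asan.h.  */
#define ASAN_SHADOW_SHIFT 3

/* Hypothetical stand-in for ix86_asan_shadow_offset: 1 << 44 for LP64,
   1 << 29 otherwise.  LP64 replaces TARGET_LP64 for this sketch.  */
static uint64_t
shadow_offset (int lp64)
{
  return (uint64_t) 1 << (lp64 ? 44 : 29);
}

/* Address of the shadow byte that describes ADDR, computed the way
   build_check_stmt does it: a right shift followed by an addition.  */
static uint64_t
shadow_address (uint64_t addr, int lp64)
{
  return (addr >> ASAN_SHADOW_SHIFT) + shadow_offset (lp64);
}
```

With the LP64 offset, the constant added is 17592186044416 (1 << 44).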

	* toplev.c (process_options): Warn and turn off -fasan
	if not supported by target.
	* asan.c: Include target.h.
	(asan_scale, asan_offset_log_32, asan_offset_log_64,
	asan_offset_log): Removed.
	(build_check_stmt): Use ASAN_SHADOW_SHIFT and
	targetm.asan_shadow_offset ().
	(asan_instrument): Don't initialize asan_offset_log.
	* asan.h (ASAN_SHADOW_SHIFT): Define.
	* target.def (TARGET_ASAN_SHADOW_OFFSET): New hook.
	* doc/tm.texi.in (TARGET_ASAN_SHADOW_OFFSET): Add it.
	* doc/tm.texi: Regenerated.
	* Makefile.in (asan.o): Depend on $(TARGET_H).
	* config/i386/i386.c (ix86_asan_shadow_offset): New function.
	(TARGET_ASAN_SHADOW_OFFSET): Define.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192372 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.asan     | 18 ++++++++++++++++++
 gcc/Makefile.in        |  2 +-
 gcc/asan.c             | 25 ++++++-------------------
 gcc/asan.h             |  6 +++++-
 gcc/config/i386/i386.c | 11 +++++++++++
 gcc/doc/tm.texi        |  6 ++++++
 gcc/doc/tm.texi.in     |  2 ++
 gcc/target.def         | 11 +++++++++++
 gcc/toplev.c           |  7 +++++++
 9 files changed, 67 insertions(+), 21 deletions(-)

diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan
index c196bfe..0bc9420 100644
--- a/gcc/ChangeLog.asan
+++ b/gcc/ChangeLog.asan
@@ -1,3 +1,21 @@
+2012-10-11  Jakub Jelinek  <jakub@redhat.com>
+
+	* toplev.c (process_options): Warn and turn off -fasan
+	if not supported by target.
+	* asan.c: Include target.h.
+	(asan_scale, asan_offset_log_32, asan_offset_log_64,
+	asan_offset_log): Removed.
+	(build_check_stmt): Use ASAN_SHADOW_SHIFT and
+	targetm.asan_shadow_offset ().
+	(asan_instrument): Don't initialize asan_offset_log.
+	* asan.h (ASAN_SHADOW_SHIFT): Define.
+	* target.def (TARGET_ASAN_SHADOW_OFFSET): New hook.
+	* doc/tm.texi.in (TARGET_ASAN_SHADOW_OFFSET): Add it.
+	* doc/tm.texi: Regenerated.
+	* Makefile.in (asan.o): Depend on $(TARGET_H).
+	* config/i386/i386.c (ix86_asan_shadow_offset): New function.
+	(TARGET_ASAN_SHADOW_OFFSET): Define.
+
 2012-10-10  Diego Novillo  <dnovillo@google.com>
 
 	* asan.c: Rename from tree-asan.c.
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index a9da161..bdc5afb 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2213,7 +2213,7 @@ stor-layout.o : stor-layout.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
 asan.o : asan.c asan.h $(CONFIG_H) pointer-set.h \
    $(SYSTEM_H) $(TREE_H) $(GIMPLE_H) \
    output.h $(DIAGNOSTIC_H) coretypes.h $(TREE_DUMP_H) $(FLAGS_H) \
-   tree-pretty-print.h
+   tree-pretty-print.h $(TARGET_H)
 tree-ssa-tail-merge.o: tree-ssa-tail-merge.c \
    $(SYSTEM_H) $(CONFIG_H) coretypes.h $(TM_H) $(BITMAP_H) \
    $(FLAGS_H) $(TM_P_H) $(BASIC_BLOCK_H) \
diff --git a/gcc/asan.c b/gcc/asan.c
index a6ceb57..e95be47 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -1,5 +1,5 @@
 /* AddressSanitizer, a fast memory error detector.
-   Copyright (C) 2011 Free Software Foundation, Inc.
+   Copyright (C) 2011, 2012 Free Software Foundation, Inc.
    Contributed by Kostya Serebryany <kcc@google.com>
 
 This file is part of GCC.
@@ -42,6 +42,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple.h"
 #include "asan.h"
 #include "gimple-pretty-print.h"
+#include "target.h"
 
 /*
  AddressSanitizer finds out-of-bounds and use-after-free bugs 
@@ -78,15 +79,6 @@ along with GCC; see the file COPYING3.  If not see
  to create redzones for stack and global object and poison them.
 */
 
-/* The shadow address is computed as (X>>asan_scale) + (1<<asan_offset_log).
- We may want to add command line flags to change these values.  */
-
-static const int asan_scale = 3;
-static const int asan_offset_log_32 = 29;
-static const int asan_offset_log_64 = 44;
-static int asan_offset_log;
-
-
 /* Construct a function tree for __asan_report_{load,store}{1,2,4,8,16}.
    IS_STORE is either 1 (for a store) or 0 (for a load).
    SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
@@ -202,15 +194,13 @@ build_check_stmt (tree base,
   gimple_set_location (g, location);
   gimple_seq_add_stmt (&seq, g);
 
-  /* Build (base_addr >> asan_scale) + (1 << asan_offset_log).  */
+  /* Build
+     (base_addr >> ASAN_SHADOW_SHIFT) | targetm.asan_shadow_offset ().  */
 
   t = build2 (RSHIFT_EXPR, uintptr_type, base_addr,
-              build_int_cst (uintptr_type, asan_scale));
+	      build_int_cst (uintptr_type, ASAN_SHADOW_SHIFT));
   t = build2 (PLUS_EXPR, uintptr_type, t,
-              build2 (LSHIFT_EXPR, uintptr_type,
-                      build_int_cst (uintptr_type, 1),
-                      build_int_cst (uintptr_type, asan_offset_log)
-                     ));
+	      build_int_cst (uintptr_type, targetm.asan_shadow_offset ()));
   t = build1 (INDIRECT_REF, shadow_type,
               build1 (VIEW_CONVERT_EXPR, shadow_ptr_type, t));
   t = force_gimple_operand (t, &stmts, false, NULL_TREE);
@@ -367,9 +357,6 @@ static unsigned int
 asan_instrument (void)
 {
   struct gimplify_ctx gctx;
-  tree uintptr_type = lang_hooks.types.type_for_mode (ptr_mode, true);
-  int is_64 = tree_low_cst (TYPE_SIZE (uintptr_type), 0) == 64;
-  asan_offset_log = is_64 ? asan_offset_log_64 : asan_offset_log_32;
   push_gimplify_context (&gctx);
   transform_statements ();
   pop_gimplify_context (NULL);
diff --git a/gcc/asan.h b/gcc/asan.h
index 590cf35..699820b 100644
--- a/gcc/asan.h
+++ b/gcc/asan.h
@@ -1,5 +1,5 @@
 /* AddressSanitizer, a fast memory error detector.
-   Copyright (C) 2011 Free Software Foundation, Inc.
+   Copyright (C) 2011, 2012 Free Software Foundation, Inc.
    Contributed by Kostya Serebryany <kcc@google.com>
 
 This file is part of GCC.
@@ -23,4 +23,8 @@ along with GCC; see the file COPYING3.  If not see
 
 extern void asan_finish_file(void);
 
+/* Shadow memory is found at
+   (address >> ASAN_SHADOW_SHIFT) | targetm.asan_shadow_offset ().  */
+#define ASAN_SHADOW_SHIFT	3
+
 #endif /* TREE_ASAN */
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 1c34bb2..bf84b65 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5647,6 +5647,14 @@ ix86_legitimate_combined_insn (rtx insn)
   return true;
 }
 \f
+/* Implement the TARGET_ASAN_SHADOW_OFFSET hook.  */
+
+static unsigned HOST_WIDE_INT
+ix86_asan_shadow_offset (void)
+{
+  return (unsigned HOST_WIDE_INT) 1 << (TARGET_LP64 ? 44 : 29);
+}
+\f
 /* Argument support functions.  */
 
 /* Return true when register may be used to pass function parameters.  */
@@ -41379,6 +41387,9 @@ ix86_memmodel_check (unsigned HOST_WIDE_INT val)
 #undef TARGET_LEGITIMATE_COMBINED_INSN
 #define TARGET_LEGITIMATE_COMBINED_INSN ix86_legitimate_combined_insn
 
+#undef TARGET_ASAN_SHADOW_OFFSET
+#define TARGET_ASAN_SHADOW_OFFSET ix86_asan_shadow_offset
+
 #undef TARGET_GIMPLIFY_VA_ARG_EXPR
 #define TARGET_GIMPLIFY_VA_ARG_EXPR ix86_gimplify_va_arg
 
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 665c5b1..908ddbf 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -11326,6 +11326,12 @@ MIPS, where add-immediate takes a 16-bit signed value,
 is zero, which disables this optimization.
 @end deftypevr
 
+@deftypefn {Target Hook} {unsigned HOST_WIDE_INT} TARGET_ASAN_SHADOW_OFFSET (void)
+Return the offset bitwise ored into shifted address to get corresponding
+Address Sanitizer shadow memory address.  NULL if Address Sanitizer is not
+supported by the target.
+@end deftypefn
+
 @deftypefn {Target Hook} {unsigned HOST_WIDE_INT} TARGET_MEMMODEL_CHECK (unsigned HOST_WIDE_INT @var{val})
 Validate target specific memory model mask bits. When NULL no target specific
 memory model bits are allowed.
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 289934b..0786691 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -11168,6 +11168,8 @@ MIPS, where add-immediate takes a 16-bit signed value,
 is zero, which disables this optimization.
 @end deftypevr
 
+@hook TARGET_ASAN_SHADOW_OFFSET
+
 @hook TARGET_MEMMODEL_CHECK
 Validate target specific memory model mask bits. When NULL no target specific
 memory model bits are allowed.
diff --git a/gcc/target.def b/gcc/target.def
index 5865224..f8781a8 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -2025,6 +2025,17 @@ DEFHOOK
  "",
  unsigned HOST_WIDE_INT, (unsigned HOST_WIDE_INT val), NULL)
 
+/* Defines an offset bitwise ored into shifted address to get corresponding
+   Address Sanitizer shadow address, or -1 if Address Sanitizer is not
+   supported by the target.  */
+DEFHOOK
+(asan_shadow_offset,
+ "Return the offset bitwise ored into shifted address to get corresponding\n\
+Address Sanitizer shadow memory address.  NULL if Address Sanitizer is not\n\
+supported by the target.",
+ unsigned HOST_WIDE_INT, (void),
+ NULL)
+
 /* Functions relating to calls - argument passing, returns, etc.  */
 /* Members of struct call have no special macro prefix.  */
 HOOK_VECTOR (TARGET_CALLS, calls)
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 3ca0736..68849f5 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1541,6 +1541,13 @@ process_options (void)
       flag_omit_frame_pointer = 0;
     }
 
+  /* Address Sanitizer needs porting to each target architecture.  */
+  if (flag_asan && targetm.asan_shadow_offset == NULL)
+    {
+      warning (0, "-fasan not supported for this target");
+      flag_asan = 0;
+    }
+
   /* Enable -Werror=coverage-mismatch when -Werror and -Wno-error
      have not been set.  */
   if (!global_options_set.x_warnings_are_errors
-- 
1.7.11.7


* [PATCH 12/13] Instrument built-in memory access function calls
  2012-11-01 19:53 [PATCH 00/13] Request to merge Address Sanitizer in dodji
                   ` (9 preceding siblings ...)
  2012-11-01 19:53 ` [PATCH 05/13] Allow asan at -O0 dodji
@ 2012-11-01 19:53 ` dodji
  2012-11-01 19:54 ` [PATCH 04/13] Emit GIMPLE directly instead of gimplifying GENERIC dodji
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 80+ messages in thread
From: dodji @ 2012-11-01 19:53 UTC
  To: gcc-patches; +Cc: dnovillo, jakub, wmi, davidxl, konstantin.s.serebryany

From: dodji <dodji@138bc75d-0d04-0410-961f-82ee72b054a4>

This patch instruments many memory access patterns through builtins.

Basically, for a call like:

     __builtin_memset (from, 0, n_bytes);

the patch only instruments the accesses at the beginning and at the
end of the memory region [from, from + n_bytes - 1], that is, the
first and the last byte accessed.  This is the strategy used by the
llvm implementation of asan.
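
As a rough sketch of which addresses end up being checked (this is a
plain C model, not the GIMPLE the patch emits, and
region_check_points is a hypothetical name): the first byte and the
byte at base + len - 1 are checked, and nothing is checked when the
length is zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the region instrumentation: store in OUT the
   addresses that get a 1-byte check for an access to N_BYTES bytes
   starting at BASE, and return how many checks there are.  */
static int
region_check_points (uintptr_t base, size_t n_bytes, uintptr_t out[2])
{
  /* A zero-length access touches no memory, so nothing is checked.  */
  if (n_bytes == 0)
    return 0;

  out[0] = base;                /* First byte of the region.  */
  out[1] = base + n_bytes - 1;  /* Last byte of the region.  */
  return 2;
}
```

Bytes strictly between the two end points are not checked in this
scheme, which keeps the instrumentation cost independent of the
length.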

This instrumentation is done for all the memory access builtin
functions that expose a well specified memory region -- one that
explicitly states the number of bytes accessed in the region.

A special treatment is used for __builtin_strlen.  The patch
instruments the access to the first byte of its argument, as well as
the access to the byte (of the argument) at the offset returned by
strlen.
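
In plain C terms, and only as a sketch (the two helper names below are
hypothetical and do not appear in the patch), the two bytes checked
for a strlen call are:

```c
#include <string.h>

/* Byte checked before the call: the first byte of the argument.  */
static const char *
strlen_first_checked (const char *s)
{
  return s;
}

/* Byte checked after the call: the one at the offset strlen returned,
   i.e. the terminating NUL of a well-formed string.  */
static const char *
strlen_last_checked (const char *s)
{
  return s + strlen (s);
}
```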

For the __sync_* and __atomic* calls the patch instruments the access
to the bytes pointed to by the argument.

While doing this, I have added a new parameter to build_check_stmt to
decide whether to insert the instrumentation code before or after the
statement iterator.  This allows us to do away with the
gsi_{next,prev} dance we were doing in the callers of this function.

Tested by running cc1 -fasan on variations of simple programs like:

    int
    foo ()
    {
      char foo[10] = {0};

      foo[0] = 't';
      foo[1] = 'e';
      foo[2] = 's';
      foo[3] = 't';
      int l = __builtin_strlen (foo);
      int n = sizeof (foo);
      __builtin_memset (&foo[4], 0, n - 4);
      __sync_fetch_and_add (&foo[11], 1);

      return l;
    }

and by staring at the gimple output which for this function is:

    ;; Function foo (foo, funcdef_no=0, decl_uid=1714, cgraph_uid=0)

    foo ()
    {
      int n;
      int l;
      char foo[10];
      int D.1725;
      char * D.1724;
      int D.1723;
      long unsigned int D.1722;
      int D.1721;
      long unsigned int D.1720;
      long unsigned int _1;
      int _4;
      long unsigned int _5;
      int _6;
      char * _7;
      int _8;
      char * _9;
      unsigned long _10;
      unsigned long _11;
      unsigned long _12;
      signed char * _13;
      signed char _14;
      _Bool _15;
      unsigned long _16;
      signed char _17;
      _Bool _18;
      _Bool _19;
      char * _20;
      unsigned long _21;
      unsigned long _22;
      unsigned long _23;
      signed char * _24;
      signed char _25;
      _Bool _26;
      unsigned long _27;
      signed char _28;
      _Bool _29;
      _Bool _30;
      char * _31;
      unsigned long _32;
      unsigned long _33;
      unsigned long _34;
      signed char * _35;
      signed char _36;
      _Bool _37;
      unsigned long _38;
      signed char _39;
      _Bool _40;
      _Bool _41;
      char * _42;
      unsigned long _43;
      unsigned long _44;
      unsigned long _45;
      signed char * _46;
      signed char _47;
      _Bool _48;
      unsigned long _49;
      signed char _50;
      _Bool _51;
      _Bool _52;
      char * _53;
      unsigned long _54;
      unsigned long _55;
      unsigned long _56;
      signed char * _57;
      signed char _58;
      _Bool _59;
      unsigned long _60;
      signed char _61;
      _Bool _62;
      _Bool _63;
      char[10] * _64;
      unsigned long _65;
      unsigned long _66;
      unsigned long _67;
      signed char * _68;
      signed char _69;
      _Bool _70;
      unsigned long _71;
      signed char _72;
      _Bool _73;
      _Bool _74;
      unsigned long _75;
      unsigned long _76;
      unsigned long _77;
      signed char * _78;
      signed char _79;
      _Bool _80;
      unsigned long _81;
      signed char _82;
      _Bool _83;
      _Bool _84;
      long unsigned int _85;
      long unsigned int _86;
      char * _87;
      char * _88;
      unsigned long _89;
      unsigned long _90;
      unsigned long _91;
      signed char * _92;
      signed char _93;
      _Bool _94;
      unsigned long _95;
      signed char _96;
      _Bool _97;
      _Bool _98;
      char * _99;
      unsigned long _100;
      unsigned long _101;
      unsigned long _102;
      signed char * _103;
      signed char _104;
      _Bool _105;
      unsigned long _106;
      signed char _107;
      _Bool _108;
      _Bool _109;

      <bb 2>:
      foo = {};
      _9 = &foo[0];
      _10 = (unsigned long) _9;
      _11 = _10 >> 3;
      _12 = _11 + 17592186044416;
      _13 = (signed char *) _12;
      _14 = *_13;
      _15 = _14 != 0;
      _16 = _10 & 7;
      _17 = (signed char) _16;
      _18 = _17 >= _14;
      _19 = _15 & _18;
      if (_19 != 0)
	goto <bb 5>;
      else
	goto <bb 4>;

      <bb 5>:
      __asan_report_store1 (_10);

      <bb 4>:
      foo[0] = 116;
      _20 = &foo[1];
      _21 = (unsigned long) _20;
      _22 = _21 >> 3;
      _23 = _22 + 17592186044416;
      _24 = (signed char *) _23;
      _25 = *_24;
      _26 = _25 != 0;
      _27 = _21 & 7;
      _28 = (signed char) _27;
      _29 = _28 >= _25;
      _30 = _26 & _29;
      if (_30 != 0)
	goto <bb 7>;
      else
	goto <bb 6>;

      <bb 7>:
      __asan_report_store1 (_21);

      <bb 6>:
      foo[1] = 101;
      _31 = &foo[2];
      _32 = (unsigned long) _31;
      _33 = _32 >> 3;
      _34 = _33 + 17592186044416;
      _35 = (signed char *) _34;
      _36 = *_35;
      _37 = _36 != 0;
      _38 = _32 & 7;
      _39 = (signed char) _38;
      _40 = _39 >= _36;
      _41 = _37 & _40;
      if (_41 != 0)
	goto <bb 9>;
      else
	goto <bb 8>;

      <bb 9>:
      __asan_report_store1 (_32);

      <bb 8>:
      foo[2] = 115;
      _42 = &foo[3];
      _43 = (unsigned long) _42;
      _44 = _43 >> 3;
      _45 = _44 + 17592186044416;
      _46 = (signed char *) _45;
      _47 = *_46;
      _48 = _47 != 0;
      _49 = _43 & 7;
      _50 = (signed char) _49;
      _51 = _50 >= _47;
      _52 = _48 & _51;
      if (_52 != 0)
	goto <bb 11>;
      else
	goto <bb 10>;

      <bb 11>:
      __asan_report_store1 (_43);

      <bb 10>:
      foo[3] = 116;
      _53 = (char *) &foo;
      _54 = (unsigned long) _53;
      _55 = _54 >> 3;
      _56 = _55 + 17592186044416;
      _57 = (signed char *) _56;
      _58 = *_57;
      _59 = _58 != 0;
      _60 = _54 & 7;
      _61 = (signed char) _60;
      _62 = _61 >= _58;
      _63 = _59 & _62;
      if (_63 != 0)
	goto <bb 13>;
      else
	goto <bb 12>;

      <bb 13>:
      __asan_report_load1 (_54);

      <bb 12>:
      _1 = __builtin_strlen (&foo);
      _64 = _53 + _1;
      _65 = (unsigned long) _64;
      _66 = _65 >> 3;
      _67 = _66 + 17592186044416;
      _68 = (signed char *) _67;
      _69 = *_68;
      _70 = _69 != 0;
      _71 = _65 & 7;
      _72 = (signed char) _71;
      _73 = _72 >= _69;
      _74 = _70 & _73;
      if (_74 != 0)
	goto <bb 15>;
      else
	goto <bb 14>;

      <bb 15>:
      __asan_report_load1 (_65);

      <bb 14>:
      l_2 = (int) _1;
      n_3 = 10;
      _4 = n_3 + -4;
      _5 = (long unsigned int) _4;
      _6 = l_2 + 1;
      _7 = &foo[_6];
      if (_5 != 0)
	goto <bb 17>;
      else
	goto <bb 16>;

      <bb 17>:
      _75 = (unsigned long) _7;
      _76 = _75 >> 3;
      _77 = _76 + 17592186044416;
      _78 = (signed char *) _77;
      _79 = *_78;
      _80 = _79 != 0;
      _81 = _75 & 7;
      _82 = (signed char) _81;
      _83 = _82 >= _79;
      _84 = _80 & _83;
      _85 = _5;
      _86 = _85 - 1;
      _87 = _7;
      _88 = _87 + _86;
      _89 = (unsigned long) _88;
      _90 = _89 >> 3;
      _91 = _90 + 17592186044416;
      _92 = (signed char *) _91;
      _93 = *_92;
      _94 = _93 != 0;
      _95 = _89 & 7;
      _96 = (signed char) _95;
      _97 = _96 >= _93;
      _98 = _94 & _97;
      if (_98 != 0)
	goto <bb 21>;
      else
	goto <bb 20>;

      <bb 21>:
      __asan_report_store1 (_89);

      <bb 20>:
      if (_84 != 0)
	goto <bb 19>;
      else
	goto <bb 18>;

      <bb 19>:
      __asan_report_store1 (_75);

      <bb 18>:

      <bb 16>:
      __builtin_memset (_7, 0, _5);
      _99 = &foo[11];
      _100 = (unsigned long) _99;
      _101 = _100 >> 3;
      _102 = _101 + 17592186044416;
      _103 = (signed char *) _102;
      _104 = *_103;
      _105 = _104 != 0;
      _106 = _100 & 7;
      _107 = (signed char) _106;
      _108 = _107 >= _104;
      _109 = _105 & _108;
      if (_109 != 0)
	goto <bb 23>;
      else
	goto <bb 22>;

      <bb 23>:
      __asan_report_store1 (_100);

      <bb 22>:
      __sync_fetch_and_add_1 (&foo[11], 1);
      _8 = l_2;
      foo ={v} {CLOBBER};

    <L1>:
      return _8;

    }

    ;; Function _GLOBAL__sub_I_00099_0_foo (_GLOBAL__sub_I_00099_0_foo, funcdef_no=1, decl_uid=1752, cgraph_uid=4)

    _GLOBAL__sub_I_00099_0_foo ()
    {
      <bb 2>:
      __asan_init ();
      return;

    }
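
Every 1-byte check in the dump above has the same shape: load the
shadow byte, then report if it is non-zero and the low three bits of
the address are greater than or equal to it.  A minimal C rendering of
that predicate, for reading the dump only (access_is_bad_1 is a
hypothetical name, and the shadow byte is passed in rather than loaded
from shadow memory):

```c
#include <stdint.h>

/* Return nonzero if a 1-byte access at ADDR would be reported, given
   SHADOW, the value of the shadow byte covering ADDR.  This mirrors
   the sequence in the dump:
     _15 = _14 != 0;  _16 = _10 & 7;  _18 = _17 >= _14;  _19 = _15 & _18;  */
static int
access_is_bad_1 (uintptr_t addr, int8_t shadow)
{
  int8_t last_accessed = (int8_t) (addr & 7);
  return (shadow != 0) && (last_accessed >= shadow);
}
```

A shadow byte of 0 never reports, and a negative shadow byte always
does, since (addr & 7) is never negative.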

gcc/
	* asan.c (insert_if_then_before_iter, instrument_mem_region_access,
	instrument_strlen_call, maybe_instrument_builtin_call,
	maybe_instrument_call): New static functions.
	(create_cond_insert_point): Renamed
	create_cond_insert_point_before_iter into this.  Add a new
	parameter to decide whether to insert the condition before or
	after the statement iterator.
	(build_check_stmt): Adjust for the new create_cond_insert_point.
	Add a new parameter to decide whether to add the instrumentation
	code before or after the statement iterator.
	(instrument_assignment): Factorize from ...
	(transform_statements): ... here.  Use maybe_instrument_call to
	instrument builtin function calls as well.
	(instrument_derefs): Adjust for the new parameter of
	build_check_stmt.  Fix detection of bit-field access.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192845 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.asan |  16 ++
 gcc/asan.c         | 612 ++++++++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 599 insertions(+), 29 deletions(-)

diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan
index 0e0b9b8..c5cf908 100644
--- a/gcc/ChangeLog.asan
+++ b/gcc/ChangeLog.asan
@@ -1,5 +1,21 @@
 2012-10-26  Dodji Seketeli  <dodji@redhat.com>
 
+	* asan.c (insert_if_then_before_iter, instrument_mem_region_access,
+	(instrument_strlen_call, maybe_instrument_builtin_call,
+	(maybe_instrument_call): New static functions.
+	(create_cond_insert_point): Renamed
+	create_cond_insert_point_before_iter into this.  Add a new
+	parameter to decide whether to insert the condition before or
+	after the statement iterator.
+	(build_check_stmt): Adjust for the new create_cond_insert_point.
+	Add a new parameter to decide whether to add the instrumentation
+	code before or after the statement iterator.
+	(instrument_assignment): Factorize from ...
+	(transform_statements): ... here.  Use maybe_instrument_call to
+	instrument builtin function calls as well.
+	(instrument_derefs): Adjust for the new parameter of
+	build_check_stmt.  Fix detection of bit-field access.
+
 	* asan.c (create_cond_insert_point_before_iter): Factorize out of ...
 	(build_check_stmt): ... here.
 
diff --git a/gcc/asan.c b/gcc/asan.c
index 736286e..5d92e43 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -398,9 +398,9 @@ asan_init_func (void)
 #define PROB_ALWAYS		(REG_BR_PROB_BASE)
 
 /* Split the current basic block and create a condition statement
-   insertion point right before the statement pointed to by ITER.
-   Return an iterator to the point at which the caller might safely
-   insert the condition statement.
+   insertion point right before or after the statement pointed to by
+   ITER.  Return an iterator to the point at which the caller might
+   safely insert the condition statement.
 
    THEN_BLOCK must be set to the address of an uninitialized instance
    of basic_block.  The function will then set *THEN_BLOCK to the
@@ -414,18 +414,21 @@ asan_init_func (void)
    statements starting from *ITER, and *THEN_BLOCK is a new empty
    block.
 
-   *ITER is adjusted to still point to the same statement it was
-   *pointing to initially.  */
+   *ITER is adjusted to always point to the first statement
+    of the basic block *FALLTHROUGH_BLOCK.  That statement is the
+    same as what ITER was pointing to prior to calling this function,
+    if BEFORE_P is true; otherwise, it is its following statement.  */
 
 static gimple_stmt_iterator
-create_cond_insert_point_before_iter (gimple_stmt_iterator *iter,
-				      bool then_more_likely_p,
-				      basic_block *then_block,
-				      basic_block *fallthrough_block)
+create_cond_insert_point (gimple_stmt_iterator *iter,
+			  bool before_p,
+			  bool then_more_likely_p,
+			  basic_block *then_block,
+			  basic_block *fallthrough_block)
 {
   gimple_stmt_iterator gsi = *iter;
 
-  if (!gsi_end_p (gsi))
+  if (!gsi_end_p (gsi) && before_p)
     gsi_prev (&gsi);
 
   basic_block cur_bb = gsi_bb (*iter);
@@ -466,18 +469,58 @@ create_cond_insert_point_before_iter (gimple_stmt_iterator *iter,
   return gsi_last_bb (cond_bb);
 }
 
+/* Insert an if condition followed by a 'then block' right before the
+   statement pointed to by ITER.  The fallthrough block -- which is the
+   else block of the condition as well as the destination of the
+   outcoming edge of the 'then block' -- starts with the statement
+   pointed to by ITER.
+
+   COND is the condition of the if.  
+
+   If THEN_MORE_LIKELY_P is true, the probability of the edge to the
+   'then block' is higher than the probability of the edge to the
+   fallthrough block.
+
+   Upon completion of the function, *THEN_BB is set to the newly
+   inserted 'then block' and similarly, *FALLTHROUGH_BB is set to the
+   fallthrough block.
+
+   *ITER is adjusted to still point to the same statement it was
+   pointing to initially.  */
+
+static void
+insert_if_then_before_iter (gimple cond,
+			    gimple_stmt_iterator *iter,
+			    bool then_more_likely_p,
+			    basic_block *then_bb,
+			    basic_block *fallthrough_bb)
+{
+  gimple_stmt_iterator cond_insert_point =
+    create_cond_insert_point (iter,
+			      /*before_p=*/true,
+			      then_more_likely_p,
+			      then_bb,
+			      fallthrough_bb);
+  gsi_insert_after (&cond_insert_point, cond, GSI_NEW_STMT);
+}
+
 /* Instrument the memory access instruction BASE.  Insert new
-   statements before ITER.
+   statements before or after ITER.
 
    Note that the memory access represented by BASE can be either an
    SSA_NAME, or a non-SSA expression.  LOCATION is the source code
    location.  IS_STORE is TRUE for a store, FALSE for a load.
-   SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
+   BEFORE_P is TRUE for inserting the instrumentation code before
+   ITER, FALSE for inserting it after ITER.  SIZE_IN_BYTES is one of
+   1, 2, 4, 8, 16.
+
+   If BEFORE_P is TRUE, *ITER is arranged to still point to the
+   statement it was pointing to prior to calling this function,
+   otherwise, it points to the statement logically following it.  */
 
 static void
-build_check_stmt (tree base, gimple_stmt_iterator *iter,
-                  location_t location, bool is_store,
-		  int size_in_bytes)
+build_check_stmt (location_t location, tree base, gimple_stmt_iterator *iter,
+		  bool before_p, bool is_store, int size_in_bytes)
 {
   gimple_stmt_iterator gsi;
   basic_block then_bb, else_bb;
@@ -491,10 +534,10 @@ build_check_stmt (tree base, gimple_stmt_iterator *iter,
 
   /* Get an iterator on the point where we can add the condition
      statement for the instrumentation.  */
-  gsi = create_cond_insert_point_before_iter (iter,
-					      /*then_more_likely_p=*/false,
-					      &then_bb,
-					      &else_bb);
+  gsi = create_cond_insert_point (iter, before_p,
+				  /*then_more_likely_p=*/false,
+				  &then_bb,
+				  &else_bb);
 
   base = unshare_expr (base);
 
@@ -626,7 +669,7 @@ build_check_stmt (tree base, gimple_stmt_iterator *iter,
 
 /* If T represents a memory access, add instrumentation code before ITER.
    LOCATION is source code location.
-   IS_STORE is either 1 (for a store) or 0 (for a load).  */
+   IS_STORE is either TRUE (for a store) or FALSE (for a load).  */
 
 static void
 instrument_derefs (gimple_stmt_iterator *iter, tree t,
@@ -661,11 +704,523 @@ instrument_derefs (gimple_stmt_iterator *iter, tree t,
   int volatilep = 0, unsignedp = 0;
   get_inner_reference (t, &bitsize, &bitpos, &offset,
 		       &mode, &unsignedp, &volatilep, false);
-  if (bitpos != 0 || bitsize != size_in_bytes * BITS_PER_UNIT)
+  if (bitpos % (size_in_bytes * BITS_PER_UNIT)
+      || bitsize != size_in_bytes * BITS_PER_UNIT)
     return;
 
   base = build_fold_addr_expr (t);
-  build_check_stmt (base, iter, location, is_store, size_in_bytes);
+  build_check_stmt (location, base, iter, /*before_p=*/true,
+		    is_store, size_in_bytes);
+}
+
+/* Instrument an access to a contiguous memory region that starts at
+   the address pointed to by BASE, over a length of LEN (expressed in
+   the sizeof (*BASE) bytes).  ITER points to the instruction before
+   which the instrumentation instructions must be inserted.  LOCATION
+   is the source location that the instrumentation instructions must
+   have.  If IS_STORE is true, then the memory access is a store;
+   otherwise, it's a load.  */
+
+static void
+instrument_mem_region_access (tree base, tree len,
+			      gimple_stmt_iterator *iter,
+			      location_t location, bool is_store)
+{
+  if (integer_zerop (len))
+    return;
+
+  gimple_stmt_iterator gsi = *iter;
+
+  basic_block fallthrough_bb = NULL, then_bb = NULL;
+  if (!is_gimple_constant (len))
+    {
+      /* So, the length of the memory area to asan-protect is
+	 non-constant.  Let's guard the generated instrumentation code
+	 like:
+
+	 if (len != 0)
+	   {
+	     //asan instrumentation code goes here.
+           }
+	   // fallthrough instructions, starting with *ITER.  */
+
+      gimple g = gimple_build_cond (NE_EXPR,
+				    len,
+				    build_int_cst (TREE_TYPE (len), 0),
+				    NULL_TREE, NULL_TREE);
+      gimple_set_location (g, location);
+      insert_if_then_before_iter (g, iter, /*then_more_likely_p=*/true,
+				  &then_bb, &fallthrough_bb);
+      /* Note that fallthrough_bb starts with the statement that was
+	 pointed to by ITER.  */
+
+      /* The 'then block' of the 'if (len != 0) condition is where
+	 we'll generate the asan instrumentation code now.  */
+      gsi = gsi_start_bb (then_bb);
+    }
+
+  /* Instrument the beginning of the memory region to be accessed,
+     and arrange for the rest of the instrumentation code to be
+     inserted in the then block *after* the current gsi.  */
+  build_check_stmt (location, base, &gsi, /*before_p=*/true, is_store, 1);
+
+  if (then_bb)
+    /* We are in the case where the length of the region is not
+       constant; so instrumentation code is being generated in the
+       'then block' of the 'if (len != 0) condition.  Let's arrange
+       for the subsequent instrumentation statements to go in the
+       'then block'.  */
+    gsi = gsi_last_bb (then_bb);
+  else
+    *iter = gsi;
+
+  /* We want to instrument the access at the end of the memory region,
+     which is at (base + len - 1).  */
+
+  /* offset = len - 1;  */
+  len = unshare_expr (len);
+  gimple offset =
+    gimple_build_assign_with_ops (TREE_CODE (len),
+				  make_ssa_name (TREE_TYPE (len), NULL),
+				  len, NULL);
+  gimple_set_location (offset, location);
+  gsi_insert_before (&gsi, offset, GSI_NEW_STMT);
+
+  offset =
+    gimple_build_assign_with_ops (MINUS_EXPR,
+				  make_ssa_name (size_type_node, NULL),
+				  gimple_assign_lhs (offset),
+				  build_int_cst (size_type_node, 1));
+  gimple_set_location (offset, location);
+  gsi_insert_after (&gsi, offset, GSI_NEW_STMT);
+
+  /* _1 = base;  */
+  base = unshare_expr (base);
+  gimple region_end =
+    gimple_build_assign_with_ops (TREE_CODE (base),
+				  make_ssa_name (TREE_TYPE (base), NULL),
+				  base, NULL);
+  gimple_set_location (region_end, location);
+  gsi_insert_after (&gsi, region_end, GSI_NEW_STMT);
+
+  /* _2 = _1 + offset;  */
+  region_end =
+    gimple_build_assign_with_ops (POINTER_PLUS_EXPR,
+				  make_ssa_name (TREE_TYPE (base), NULL),
+				  gimple_assign_lhs (region_end), 
+				  gimple_assign_lhs (offset));
+  gimple_set_location (region_end, location);
+  gsi_insert_after (&gsi, region_end, GSI_NEW_STMT);
+
+  /* instrument access at _2;  */
+  build_check_stmt (location, gimple_assign_lhs (region_end),
+		    &gsi, /*before_p=*/false, is_store, 1);
+}
+
+/* Instrument the strlen builtin call pointed to by ITER.
+
+   This function instruments the access to the first byte of the
+   argument, right before the call.  After the call it instruments the
+   access to the last byte of the argument; it uses the result of the
+   call to deduce the offset of that last byte.  */
+
+static void
+instrument_strlen_call (gimple_stmt_iterator *iter)
+{
+  gimple call = gsi_stmt (*iter);
+  gcc_assert (is_gimple_call (call));
+
+  tree callee = gimple_call_fndecl (call);
+  gcc_assert (is_builtin_fn (callee)
+	      && DECL_BUILT_IN_CLASS (callee) == BUILT_IN_NORMAL
+	      && DECL_FUNCTION_CODE (callee) == BUILT_IN_STRLEN);
+
+  tree len = gimple_call_lhs (call);
+  if (len == NULL)
+    /* Some passes might clear the return value of the strlen call;
+       bail out in that case.  */
+    return;
+  gcc_assert (INTEGRAL_TYPE_P (TREE_TYPE (len)));
+
+  location_t loc = gimple_location (call);
+  tree str_arg = gimple_call_arg (call, 0);
+
+  /* Instrument the access to the first byte of str_arg, i.e.:
+
+     _1 = str_arg; instrument (_1); */
+  gimple str_arg_ssa =
+    gimple_build_assign_with_ops (NOP_EXPR,
+				  make_ssa_name (build_pointer_type
+						 (char_type_node), NULL),
+				  str_arg, NULL);
+  gimple_set_location (str_arg_ssa, loc);
+  gimple_stmt_iterator gsi = *iter;
+  gsi_insert_before (&gsi, str_arg_ssa, GSI_NEW_STMT);
+  build_check_stmt (loc, gimple_assign_lhs (str_arg_ssa), &gsi,
+		    /*before_p=*/false, /*is_store=*/false, 1);
+
+  /* If we initially had an instruction like:
+
+	 int n = strlen (str)
+
+     we now want to instrument the access to str[n], after the
+     instruction above.  */
+
+  /* So let's build the access to str[n] that is, access through the
+     pointer_plus expr: (_1 + len).  */
+  gimple stmt =
+    gimple_build_assign_with_ops (POINTER_PLUS_EXPR,
+				  make_ssa_name (TREE_TYPE (str_arg),
+						 NULL),
+				  gimple_assign_lhs (str_arg_ssa),
+				  len);
+  gimple_set_location (stmt, loc);
+  gsi_insert_after (&gsi, stmt, GSI_NEW_STMT);
+
+  build_check_stmt (loc, gimple_assign_lhs (stmt), &gsi,
+		    /*before_p=*/false, /*is_store=*/false, 1);
+
+  /* Ensure that iter points to the statement logically following the
+     one it was initially pointing to.  */
+  *iter = gsi;
+}
+
+/* If the statement pointed to by the iterator ITER is a call to a
+   builtin memory access function, instrument it and return true.
+   Otherwise, return false.  */
+
+static bool
+maybe_instrument_builtin_call (gimple_stmt_iterator *iter)
+{
+  gimple call = gsi_stmt (*iter);
+  location_t loc = gimple_location (call);
+
+  if (!is_gimple_call (call))
+    return false;
+
+  tree callee = gimple_call_fndecl (call);
+
+  if (!is_builtin_fn (callee)
+      || DECL_BUILT_IN_CLASS (callee) != BUILT_IN_NORMAL)
+    return false;
+
+  tree source0 = NULL_TREE, source1 = NULL_TREE,
+    dest = NULL_TREE, len = NULL_TREE;
+  bool is_store = true;
+
+  switch (DECL_FUNCTION_CODE (callee))
+    {
+      /* (s, s, n) style memops.  */
+    case BUILT_IN_BCMP:
+    case BUILT_IN_MEMCMP:
+      len = gimple_call_arg (call, 2);
+      source0 = gimple_call_arg (call, 0);
+      source1 = gimple_call_arg (call, 1);
+      break;
+
+      /* (src, dest, n) style memops.  */
+    case BUILT_IN_BCOPY:
+      len = gimple_call_arg (call, 2);
+      source0 = gimple_call_arg (call, 0);
+      dest = gimple_call_arg (call, 1);
+      break;
+
+      /* (dest, src, n) style memops.  */
+    case BUILT_IN_MEMCPY:
+    case BUILT_IN_MEMCPY_CHK:
+    case BUILT_IN_MEMMOVE:
+    case BUILT_IN_MEMMOVE_CHK:
+    case BUILT_IN_MEMPCPY:
+    case BUILT_IN_MEMPCPY_CHK:
+      dest = gimple_call_arg (call, 0);
+      source0 = gimple_call_arg (call, 1);
+      len = gimple_call_arg (call, 2);
+      break;
+
+      /* (dest, n) style memops.  */
+    case BUILT_IN_BZERO:
+      dest = gimple_call_arg (call, 0);
+      len = gimple_call_arg (call, 1);
+      break;
+
+      /* (dest, x, n) style memops.  */
+    case BUILT_IN_MEMSET:
+    case BUILT_IN_MEMSET_CHK:
+      dest = gimple_call_arg (call, 0);
+      len = gimple_call_arg (call, 2);
+      break;
+
+    case BUILT_IN_STRLEN:
+      instrument_strlen_call (iter);
+      return true;
+
+    /* And now the __atomic* and __sync builtins.
+       These are handled differently from the classical memory
+       access builtins above.  */
+
+    case BUILT_IN_ATOMIC_LOAD:
+    case BUILT_IN_ATOMIC_LOAD_1:
+    case BUILT_IN_ATOMIC_LOAD_2:
+    case BUILT_IN_ATOMIC_LOAD_4:
+    case BUILT_IN_ATOMIC_LOAD_8:
+    case BUILT_IN_ATOMIC_LOAD_16:
+      is_store = false;
+      /* fall through.  */
+
+    case BUILT_IN_SYNC_FETCH_AND_ADD_1:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_2:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_4:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_8:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_SUB_1:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_2:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_4:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_8:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_OR_1:
+    case BUILT_IN_SYNC_FETCH_AND_OR_2:
+    case BUILT_IN_SYNC_FETCH_AND_OR_4:
+    case BUILT_IN_SYNC_FETCH_AND_OR_8:
+    case BUILT_IN_SYNC_FETCH_AND_OR_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_AND_1:
+    case BUILT_IN_SYNC_FETCH_AND_AND_2:
+    case BUILT_IN_SYNC_FETCH_AND_AND_4:
+    case BUILT_IN_SYNC_FETCH_AND_AND_8:
+    case BUILT_IN_SYNC_FETCH_AND_AND_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_XOR_1:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_2:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_4:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_8:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_NAND_1:
+    case BUILT_IN_SYNC_FETCH_AND_NAND_2:
+    case BUILT_IN_SYNC_FETCH_AND_NAND_4:
+    case BUILT_IN_SYNC_FETCH_AND_NAND_8:
+
+    case BUILT_IN_SYNC_ADD_AND_FETCH_1:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_2:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_4:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_8:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_SUB_AND_FETCH_1:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_2:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_4:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_8:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_OR_AND_FETCH_1:
+    case BUILT_IN_SYNC_OR_AND_FETCH_2:
+    case BUILT_IN_SYNC_OR_AND_FETCH_4:
+    case BUILT_IN_SYNC_OR_AND_FETCH_8:
+    case BUILT_IN_SYNC_OR_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_AND_AND_FETCH_1:
+    case BUILT_IN_SYNC_AND_AND_FETCH_2:
+    case BUILT_IN_SYNC_AND_AND_FETCH_4:
+    case BUILT_IN_SYNC_AND_AND_FETCH_8:
+    case BUILT_IN_SYNC_AND_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_XOR_AND_FETCH_1:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_2:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_4:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_8:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_NAND_AND_FETCH_1:
+    case BUILT_IN_SYNC_NAND_AND_FETCH_2:
+    case BUILT_IN_SYNC_NAND_AND_FETCH_4:
+    case BUILT_IN_SYNC_NAND_AND_FETCH_8:
+
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_1:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_2:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_4:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_8:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_16:
+
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_1:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_2:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_4:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_8:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_16:
+
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_1:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_2:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_4:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_8:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_16:
+
+    case BUILT_IN_SYNC_LOCK_RELEASE_1:
+    case BUILT_IN_SYNC_LOCK_RELEASE_2:
+    case BUILT_IN_SYNC_LOCK_RELEASE_4:
+    case BUILT_IN_SYNC_LOCK_RELEASE_8:
+    case BUILT_IN_SYNC_LOCK_RELEASE_16:
+
+    case BUILT_IN_ATOMIC_TEST_AND_SET:
+    case BUILT_IN_ATOMIC_CLEAR:
+    case BUILT_IN_ATOMIC_EXCHANGE:
+    case BUILT_IN_ATOMIC_EXCHANGE_1:
+    case BUILT_IN_ATOMIC_EXCHANGE_2:
+    case BUILT_IN_ATOMIC_EXCHANGE_4:
+    case BUILT_IN_ATOMIC_EXCHANGE_8:
+    case BUILT_IN_ATOMIC_EXCHANGE_16:
+
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_1:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_2:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_4:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_8:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_16:
+
+    case BUILT_IN_ATOMIC_STORE:
+    case BUILT_IN_ATOMIC_STORE_1:
+    case BUILT_IN_ATOMIC_STORE_2:
+    case BUILT_IN_ATOMIC_STORE_4:
+    case BUILT_IN_ATOMIC_STORE_8:
+    case BUILT_IN_ATOMIC_STORE_16:
+
+    case BUILT_IN_ATOMIC_ADD_FETCH_1:
+    case BUILT_IN_ATOMIC_ADD_FETCH_2:
+    case BUILT_IN_ATOMIC_ADD_FETCH_4:
+    case BUILT_IN_ATOMIC_ADD_FETCH_8:
+    case BUILT_IN_ATOMIC_ADD_FETCH_16:
+
+    case BUILT_IN_ATOMIC_SUB_FETCH_1:
+    case BUILT_IN_ATOMIC_SUB_FETCH_2:
+    case BUILT_IN_ATOMIC_SUB_FETCH_4:
+    case BUILT_IN_ATOMIC_SUB_FETCH_8:
+    case BUILT_IN_ATOMIC_SUB_FETCH_16:
+
+    case BUILT_IN_ATOMIC_AND_FETCH_1:
+    case BUILT_IN_ATOMIC_AND_FETCH_2:
+    case BUILT_IN_ATOMIC_AND_FETCH_4:
+    case BUILT_IN_ATOMIC_AND_FETCH_8:
+    case BUILT_IN_ATOMIC_AND_FETCH_16:
+
+    case BUILT_IN_ATOMIC_NAND_FETCH_1:
+    case BUILT_IN_ATOMIC_NAND_FETCH_2:
+    case BUILT_IN_ATOMIC_NAND_FETCH_4:
+    case BUILT_IN_ATOMIC_NAND_FETCH_8:
+    case BUILT_IN_ATOMIC_NAND_FETCH_16:
+
+    case BUILT_IN_ATOMIC_XOR_FETCH_1:
+    case BUILT_IN_ATOMIC_XOR_FETCH_2:
+    case BUILT_IN_ATOMIC_XOR_FETCH_4:
+    case BUILT_IN_ATOMIC_XOR_FETCH_8:
+    case BUILT_IN_ATOMIC_XOR_FETCH_16:
+
+    case BUILT_IN_ATOMIC_OR_FETCH_1:
+    case BUILT_IN_ATOMIC_OR_FETCH_2:
+    case BUILT_IN_ATOMIC_OR_FETCH_4:
+    case BUILT_IN_ATOMIC_OR_FETCH_8:
+    case BUILT_IN_ATOMIC_OR_FETCH_16:
+
+    case BUILT_IN_ATOMIC_FETCH_ADD_1:
+    case BUILT_IN_ATOMIC_FETCH_ADD_2:
+    case BUILT_IN_ATOMIC_FETCH_ADD_4:
+    case BUILT_IN_ATOMIC_FETCH_ADD_8:
+    case BUILT_IN_ATOMIC_FETCH_ADD_16:
+
+    case BUILT_IN_ATOMIC_FETCH_SUB_1:
+    case BUILT_IN_ATOMIC_FETCH_SUB_2:
+    case BUILT_IN_ATOMIC_FETCH_SUB_4:
+    case BUILT_IN_ATOMIC_FETCH_SUB_8:
+    case BUILT_IN_ATOMIC_FETCH_SUB_16:
+
+    case BUILT_IN_ATOMIC_FETCH_AND_1:
+    case BUILT_IN_ATOMIC_FETCH_AND_2:
+    case BUILT_IN_ATOMIC_FETCH_AND_4:
+    case BUILT_IN_ATOMIC_FETCH_AND_8:
+    case BUILT_IN_ATOMIC_FETCH_AND_16:
+
+    case BUILT_IN_ATOMIC_FETCH_NAND_1:
+    case BUILT_IN_ATOMIC_FETCH_NAND_2:
+    case BUILT_IN_ATOMIC_FETCH_NAND_4:
+    case BUILT_IN_ATOMIC_FETCH_NAND_8:
+    case BUILT_IN_ATOMIC_FETCH_NAND_16:
+
+    case BUILT_IN_ATOMIC_FETCH_XOR_1:
+    case BUILT_IN_ATOMIC_FETCH_XOR_2:
+    case BUILT_IN_ATOMIC_FETCH_XOR_4:
+    case BUILT_IN_ATOMIC_FETCH_XOR_8:
+    case BUILT_IN_ATOMIC_FETCH_XOR_16:
+
+    case BUILT_IN_ATOMIC_FETCH_OR_1:
+    case BUILT_IN_ATOMIC_FETCH_OR_2:
+    case BUILT_IN_ATOMIC_FETCH_OR_4:
+    case BUILT_IN_ATOMIC_FETCH_OR_8:
+    case BUILT_IN_ATOMIC_FETCH_OR_16:
+      {
+	dest = gimple_call_arg (call, 0);
+	/* So DEST represents the address of a memory location.
+	   instrument_derefs wants the memory location, so let's
+	   dereference the address DEST before handing it to
+	   instrument_derefs.  */
+	if (TREE_CODE (dest) == ADDR_EXPR)
+	  dest = TREE_OPERAND (dest, 0);
+	else if (TREE_CODE (dest) == SSA_NAME)
+	  dest = build2 (MEM_REF, TREE_TYPE (TREE_TYPE (dest)),
+			 dest, build_int_cst (TREE_TYPE (dest), 0));
+	else
+	  gcc_unreachable ();
+
+	instrument_derefs (iter, dest, loc, is_store);
+	return true;
+      }
+
+    default:
+      /* The other memory access builtins are not instrumented in this
+	 function because they either don't have any length parameter,
+	 or their length parameter is just a limit.  */
+      break;
+    }
+
+  if (len != NULL_TREE)
+    {
+      if (source0 != NULL_TREE)
+	instrument_mem_region_access (source0, len, iter,
+				      loc, /*is_store=*/false);
+      if (source1 != NULL_TREE)
+	instrument_mem_region_access (source1, len, iter,
+				      loc, /*is_store=*/false);
+      else if (dest != NULL_TREE)
+	instrument_mem_region_access (dest, len, iter,
+				      loc, /*is_store=*/true);
+      return true;
+    }
+  return false;
+}
+
+/*  Instrument the assignment statement ITER if it is subject to
+    instrumentation.  */
+
+static void
+instrument_assignment (gimple_stmt_iterator *iter)
+{
+  gimple s = gsi_stmt (*iter);
+
+  gcc_assert (gimple_assign_single_p (s));
+
+  instrument_derefs (iter, gimple_assign_lhs (s),
+		     gimple_location (s), true);
+  instrument_derefs (iter, gimple_assign_rhs1 (s),
+		     gimple_location (s), false);
+}
+
+/* Instrument the function call pointed to by the iterator ITER, if it
+   is subject to instrumentation.  At the moment, the only function
+   calls that are instrumented are some built-in functions that access
+   memory.  Look at maybe_instrument_builtin_call to learn more.  */
+
+static void
+maybe_instrument_call (gimple_stmt_iterator *iter)
+{
+  maybe_instrument_builtin_call (iter);
 }
 
 /* asan: this looks too complex. Can this be done simpler? */
@@ -686,13 +1241,12 @@ transform_statements (void)
       if (bb->index >= saved_last_basic_block) continue;
       for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
         {
-          gimple s = gsi_stmt (i);
-          if (!gimple_assign_single_p (s))
-	    continue;
-          instrument_derefs (&i, gimple_assign_lhs (s),
-                             gimple_location (s), true);
-          instrument_derefs (&i, gimple_assign_rhs1 (s),
-                             gimple_location (s), false);
+	  gimple s = gsi_stmt (i);
+
+	  if (gimple_assign_single_p (s))
+	    instrument_assignment (&i);
+	  else if (is_gimple_call (s))
+	    maybe_instrument_call (&i);
         }
     }
 }
-- 
1.7.11.7


* [PATCH 05/13] Allow asan at -O0
  2012-11-01 19:53 [PATCH 00/13] Request to merge Address Sanitizer in dodji
                   ` (8 preceding siblings ...)
  2012-11-01 19:53 ` [PATCH 07/13] Implement protection of global variables dodji
@ 2012-11-01 19:53 ` dodji
  2012-11-01 19:53 ` [PATCH 12/13] Instrument built-in memory access function calls dodji
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 80+ messages in thread
From: dodji @ 2012-11-01 19:53 UTC (permalink / raw)
  To: gcc-patches; +Cc: dnovillo, jakub, wmi, davidxl, konstantin.s.serebryany

From: jakub <jakub@138bc75d-0d04-0410-961f-82ee72b054a4>

This patch defines a new asan pass gate that is activated at -O0, in addition
to the -O3 level at which it was initially activated.  The patch also
does some comment cleanups here and there.

	* asan.c (build_check_stmt): Rename join_bb variable to else_bb.
	(gate_asan_O0): New function.
	(pass_asan_O0): New variable.
	* passes.c (init_optimization_passes): Add pass_asan_O0.
	* tree-pass.h (pass_asan_O0): New declaration.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192415 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.asan |  8 ++++++++
 gcc/asan.c         | 44 +++++++++++++++++++++++++++++++++++---------
 gcc/passes.c       |  1 +
 gcc/tree-pass.h    |  1 +
 4 files changed, 45 insertions(+), 9 deletions(-)

diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan
index 9bfccd7..505bce9 100644
--- a/gcc/ChangeLog.asan
+++ b/gcc/ChangeLog.asan
@@ -1,3 +1,11 @@
+2012-10-12  Jakub Jelinek  <jakub@redhat.com>
+
+	* asan.c (build_check_stmt): Rename join_bb variable to else_bb.
+	(gate_asan_O0): New function.
+	(pass_asan_O0): New variable.
+	* passes.c (init_optimization_passes): Add pass_asan_O0.
+	* tree-pass.h (pass_asan_O0): New declaration.
+
 2012-10-11  Jakub Jelinek  <jakub@redhat.com>
 
 	* Makefile.in (GTFILES): Add $(srcdir)/asan.c.
diff --git a/gcc/asan.c b/gcc/asan.c
index 2e7d4d6..66dc571 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -137,7 +137,7 @@ build_check_stmt (tree base,
                   location_t location, bool is_store, int size_in_bytes)
 {
   gimple_stmt_iterator gsi;
-  basic_block cond_bb, then_bb, join_bb;
+  basic_block cond_bb, then_bb, else_bb;
   edge e;
   tree t, base_addr, shadow;
   gimple g;
@@ -158,23 +158,23 @@ build_check_stmt (tree base,
   else
     e = split_block_after_labels (cond_bb);
   cond_bb = e->src;
-  join_bb = e->dest;
+  else_bb = e->dest;
 
-  /* A recap at this point: join_bb is the basic block at whose head
+  /* A recap at this point: else_bb is the basic block at whose head
      is the gimple statement for which this check expression is being
      built.  cond_bb is the (possibly new, synthetic) basic block the
      end of which will contain the cache-lookup code, and a
      conditional that jumps to the cache-miss code or, much more
-     likely, over to join_bb.  */
+     likely, over to else_bb.  */
 
   /* Create the bb that contains the crash block.  */
   then_bb = create_empty_bb (cond_bb);
   e = make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE);
   e->probability = PROB_VERY_UNLIKELY;
-  make_single_succ_edge (then_bb, join_bb, EDGE_FALLTHRU);
+  make_single_succ_edge (then_bb, else_bb, EDGE_FALLTHRU);
 
-  /* Mark the pseudo-fallthrough edge from cond_bb to join_bb.  */
-  e = find_edge (cond_bb, join_bb);
+  /* Mark the pseudo-fallthrough edge from cond_bb to else_bb.  */
+  e = find_edge (cond_bb, else_bb);
   e->flags = EDGE_FALSE_VALUE;
   e->count = cond_bb->count;
   e->probability = PROB_ALWAYS - PROB_VERY_UNLIKELY;
@@ -184,7 +184,7 @@ build_check_stmt (tree base,
   if (dom_info_available_p (CDI_DOMINATORS))
     {
       set_immediate_dominator (CDI_DOMINATORS, then_bb, cond_bb);
-      set_immediate_dominator (CDI_DOMINATORS, join_bb, cond_bb);
+      set_immediate_dominator (CDI_DOMINATORS, else_bb, cond_bb);
     }
 
   gsi = gsi_last_bb (cond_bb);
@@ -305,7 +305,7 @@ build_check_stmt (tree base,
   gimple_set_location (g, location);
   gsi_insert_after (&gsi, g, GSI_NEW_STMT);
 
-  *iter = gsi_start_bb (join_bb);
+  *iter = gsi_start_bb (else_bb);
 }
 
 /* If T represents a memory access, add instrumentation code before ITER.
@@ -447,4 +447,30 @@ struct gimple_opt_pass pass_asan =
  }
 };
 
+static bool
+gate_asan_O0 (void)
+{
+  return flag_asan != 0 && !optimize;
+}
+
+struct gimple_opt_pass pass_asan_O0 =
+{
+ {
+  GIMPLE_PASS,
+  "asan0",				/* name  */
+  gate_asan_O0,				/* gate  */
+  asan_instrument,			/* execute  */
+  NULL,					/* sub  */
+  NULL,					/* next  */
+  0,					/* static_pass_number  */
+  TV_NONE,				/* tv_id  */
+  PROP_ssa | PROP_cfg | PROP_gimple_leh,/* properties_required  */
+  0,					/* properties_provided  */
+  0,					/* properties_destroyed  */
+  0,					/* todo_flags_start  */
+  TODO_verify_flow | TODO_verify_stmts
+  | TODO_update_ssa			/* todo_flags_finish  */
+ }
+};
+
 #include "gt-asan.h"
diff --git a/gcc/passes.c b/gcc/passes.c
index 66a2f74..d4115b3 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -1562,6 +1562,7 @@ init_optimization_passes (void)
       NEXT_PASS (pass_tm_edges);
     }
   NEXT_PASS (pass_lower_complex_O0);
+  NEXT_PASS (pass_asan_O0);
   NEXT_PASS (pass_cleanup_eh);
   NEXT_PASS (pass_lower_resx);
   NEXT_PASS (pass_nrv);
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 73c5886..69baa0d 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -260,6 +260,7 @@ struct register_pass_info
 extern struct gimple_opt_pass pass_mudflap_1;
 extern struct gimple_opt_pass pass_mudflap_2;
 extern struct gimple_opt_pass pass_asan;
+extern struct gimple_opt_pass pass_asan_O0;
 extern struct gimple_opt_pass pass_lower_cf;
 extern struct gimple_opt_pass pass_refactor_eh;
 extern struct gimple_opt_pass pass_lower_eh;
-- 
1.7.11.7


* [PATCH 06/13] Implement protection of stack variables
  2012-11-01 19:53 [PATCH 00/13] Request to merge Address Sanitizer in dodji
                   ` (2 preceding siblings ...)
  2012-11-01 19:53 ` [PATCH 09/13] Don't forget to protect 32 bytes aligned global variables dodji
@ 2012-11-01 19:53 ` dodji
       [not found]   ` <CAGQ9bdweH8Pn=8vLTNa8FSzAh92OYrWScxK78n9znCodADJUvw@mail.gmail.com>
  2012-11-01 19:53 ` [PATCH 02/13] Rename tree-asan.[ch] to asan.[ch] dodji
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 80+ messages in thread
From: dodji @ 2012-11-01 19:53 UTC (permalink / raw)
  To: gcc-patches; +Cc: dnovillo, jakub, wmi, davidxl, konstantin.s.serebryany

From: jakub <jakub@138bc75d-0d04-0410-961f-82ee72b054a4>

This patch implements the protection of stack variables.

To understand how this works, let's look at this example on x86_64
where the stack grows downward:

 int
 foo ()
 {
   char a[23] = {0};
   int b[2] = {0};

   a[5] = 1;
   b[1] = 2;

   return a[5] + b[1];
 }

For this function, the stack protected by asan will be organized as
follows, from the top of the stack to the bottom:

Slot 1/ [red zone of 32 bytes called 'RIGHT RedZone']

Slot 2/ [24 bytes for variable 'a']

Slot 3/ [8 bytes of red zone, that adds up to the space of 'a' to make
         the next slot be 32 bytes aligned; this one is called Partial
         Redzone; this 32 bytes alignment is an asan constraint]

Slot 4/ [red zone of 32 bytes called 'Middle RedZone']

Slot 5/ [8 bytes for variable 'b']

Slot 6/ [24 bytes of Partial Red Zone (similar to slot 3)]

Slot 7/ [32 bytes of Red Zone at the bottom of the stack, called 'LEFT
         RedZone']
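
The slot sizes above follow from one rule: each variable is padded by
a partial red zone up to the next multiple of 32 bytes, and a full
32-byte red zone sits below, between and above the padded variables.
Here is a minimal sketch of that arithmetic.  The names RED_ZONE,
partial_red_zone and frame_size are made up for illustration and are
not part of the patch, and the sketch assumes that a variable whose
size is already a multiple of 32 gets no partial red zone.

```c
#include <stddef.h>

#define RED_ZONE 32	/* Hypothetical name for the 32-byte red zone.  */

/* Bytes of partial red zone needed to pad SIZE up to the next
   multiple of RED_ZONE (0 if SIZE is already a multiple).  */
static size_t
partial_red_zone (size_t size)
{
  return (RED_ZONE - size % RED_ZONE) % RED_ZONE;
}

/* Total size of the protected frame for N variables: one LEFT red
   zone, then for each variable its bytes, its partial red zone and
   the full red zone that follows it (MIDDLE or RIGHT).  */
static size_t
frame_size (const size_t *sizes, size_t n)
{
  size_t total = RED_ZONE;
  for (size_t i = 0; i < n; i++)
    total += sizes[i] + partial_red_zone (sizes[i]) + RED_ZONE;
  return total;
}
```

With the slot sizes quoted above (8 bytes for 'b', 24 bytes for 'a'),
partial_red_zone gives 24 and 8 respectively, and frame_size gives
160, which is the sum of the seven slots.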

[A cultural question I've kept asking myself is why the address
 sanitizer authors called these red zones (LEFT, MIDDLE, RIGHT)
 instead of e.g. (BOTTOM, MIDDLE, TOP).  Maybe they can step up and
 educate me so that I get less confused in the future.  :-)]

The 32 bytes of LEFT red zone at the bottom of the stack can be
decomposed as such:

    1/ The first 8 bytes contain a magical asan number that is always
    0x41B58AB3.

    2/ The following 8 bytes contain a pointer to a string (to be
    parsed at runtime by the asan runtime library), whose format is
    the following:

     "<function-name> <space> <num-of-variables-on-the-stack>
     (<32-bytes-aligned-offset-in-bytes-of-variable> <space>
     <length-of-var-in-bytes> ){n} "

     where '(...){n}' means the content inside the parentheses occurs
     'n' times, with 'n' being the number of variables on the stack.

    3/ The following 16 bytes of the red zone have no particular
    format.
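
To make the string format concrete, here is a small sketch that builds
such a description with snprintf.  The function name
format_frame_description and its parameters are hypothetical; the
sketch follows the format as quoted above, whereas
asan_emit_stack_protection in this patch also appends the length of
each variable's name and the name itself after each pair, which is
left out here.

```c
#include <stdio.h>
#include <stddef.h>

/* Write the frame description of FUNC into BUF (capacity CAP): the
   function name, the variable count, then one "<offset> <length> "
   pair per variable.  Return the number of characters written, or -1
   if BUF is too small.  */
static int
format_frame_description (char *buf, size_t cap, const char *func,
			  const long *offsets, const long *lengths, int n)
{
  int used = snprintf (buf, cap, "%s %d ", func, n);
  if (used < 0 || (size_t) used >= cap)
    return -1;
  for (int i = 0; i < n; i++)
    {
      int w = snprintf (buf + used, cap - used, "%ld %ld ",
			offsets[i], lengths[i]);
      if (w < 0 || (size_t) w >= cap - used)
	return -1;
      used += w;
    }
  return used;
}
```

For a function foo with two variables at illustrative offsets 32 and
96 and lengths 8 and 24, this yields the string "foo 2 32 8 96 24 ".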

The shadow memory for that stack layout is going to look like this:

    - content of shadow memory 8 bytes for slot 7: 0xFFFFFFFFF1F1F1F1.
      The F1 byte pattern is a magic number called
      ASAN_STACK_MAGIC_LEFT and is a way for the runtime to know that
      the memory for that shadow byte is part of the LEFT red zone
      intended to sit at the bottom of the variables on the stack.

    - content of shadow memory 8 bytes for slots 6 and 5:
      0xFFFFFFFFF4F4F400.  The F4 byte pattern is a magic number
      called ASAN_STACK_MAGIC_PARTIAL.  It flags the fact that the
      memory region for this shadow byte is a PARTIAL red zone
      intended to pad a variable A, so that the slot following
      {A,padding} is 32 bytes aligned.

      Note that the fact that the least significant byte of this
      shadow memory content is 00 means that the 8 bytes of its
      corresponding memory (which corresponds to the memory of
      variable 'b') are addressable.

    - content of shadow memory 8 bytes for slot 4: 0xFFFFFFFFF2F2F2F2.
      The F2 byte pattern is a magic number called
      ASAN_STACK_MAGIC_MIDDLE.  It flags the fact that the memory
      region for this shadow byte is a MIDDLE red zone intended to
      sit between two 32-byte aligned slots of {variable,padding}.

    - content of shadow memory 8 bytes for slots 3 and 2:
      0xFFFFFFFFF4000000.  This represents the concatenation of
      variable 'a' and the partial red zone following it, like what we
      had for variable 'b'.  The least significant 3 bytes being 00
      means that the 24 bytes of variable 'a' are addressable.

    - content of shadow memory 8 bytes for slot 1: 0xFFFFFFFFF3F3F3F3.
      The F3 byte pattern is a magic number called
      ASAN_STACK_MAGIC_RIGHT.  It flags the fact that the memory
      region for this shadow byte is a RIGHT red zone intended to sit
      at the top of the variables of the stack.
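
The shadow values listed above can be reproduced with a few lines of
C.  This is only a sketch: shadow_word is a hypothetical helper, the
macros are defined locally with the byte values quoted in this
message, one shadow byte is assumed to cover 8 bytes of memory, the
target is assumed to be little-endian, and variable sizes are assumed
to be multiples of 8.  It computes the low 32 bits of each value shown
above.

```c
#include <stdint.h>
#include <stddef.h>

#define SHADOW_SHIFT 3			/* Assumed: 8 bytes per shadow byte.  */
#define ASAN_STACK_MAGIC_LEFT    0xF1
#define ASAN_STACK_MAGIC_MIDDLE  0xF2
#define ASAN_STACK_MAGIC_RIGHT   0xF3
#define ASAN_STACK_MAGIC_PARTIAL 0xF4

/* Pack the 4 shadow bytes of one 32-byte slot into a 32-bit word.
   The first VAR_BYTES bytes of the slot hold a variable and are
   addressable (shadow byte 00); the rest is red zone marked with
   PAD_MAGIC.  The shadow byte for the lowest address ends up in the
   least significant position, as on a little-endian target.  */
static uint32_t
shadow_word (size_t var_bytes, uint8_t pad_magic)
{
  uint32_t word = 0;
  for (size_t i = 0; i < 4; i++)
    {
      /* Shadow byte I covers slot bytes [8 * I, 8 * I + 8).  */
      uint8_t b = (i << SHADOW_SHIFT) < var_bytes ? 0x00 : pad_magic;
      word |= (uint32_t) b << (8 * i);
    }
  return word;
}
```

shadow_word (8, ASAN_STACK_MAGIC_PARTIAL) gives 0xF4F4F400 for 'b' and
its padding, shadow_word (24, ASAN_STACK_MAGIC_PARTIAL) gives
0xF4000000 for 'a' and its padding, and passing 0 as the variable size
gives the all-magic words of the full red zones.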

So, the patch lays out stack variables as well as the different red
zones, emits some prologue code to populate the shadow memory so as to
poison (mark as non-accessible) the regions of the red zones and mark
the regions of stack variables as accessible, and emits some epilogue
code to un-poison (mark as accessible) the regions of red zones right
before the function exits.
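
The effect of the prologue and epilogue can be modelled with a toy
program in which a plain array stands in for the shadow memory of the
160-byte frame of foo.  All names below (shadow, prologue, epilogue,
is_poisoned) are hypothetical; the real code emits RTL stores to
shadow addresses derived from the frame address rather than calling
functions like these.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAME_BYTES 160
#define SHADOW_BYTES (FRAME_BYTES / 8)	/* One shadow byte per 8 bytes.  */

/* Stand-in for the shadow memory of the frame, lowest address first.  */
static uint8_t shadow[SHADOW_BYTES];

/* Prologue: poison the red zones, leave 'b' and 'a' addressable.  */
static void
prologue (void)
{
  static const uint8_t image[SHADOW_BYTES] = {
    0xF1, 0xF1, 0xF1, 0xF1,	/* Slot 7: LEFT red zone.  */
    0x00, 0xF4, 0xF4, 0xF4,	/* Slots 5 and 6: 'b', partial red zone.  */
    0xF2, 0xF2, 0xF2, 0xF2,	/* Slot 4: MIDDLE red zone.  */
    0x00, 0x00, 0x00, 0xF4,	/* Slots 2 and 3: 'a', partial red zone.  */
    0xF3, 0xF3, 0xF3, 0xF3	/* Slot 1: RIGHT red zone.  */
  };
  memcpy (shadow, image, sizeof shadow);
}

/* Epilogue: un-poison the whole frame before the function exits.  */
static void
epilogue (void)
{
  memset (shadow, 0, sizeof shadow);
}

/* Return true if the byte at FRAME_OFFSET (counted from the bottom of
   the frame) lies in a poisoned region.  */
static bool
is_poisoned (size_t frame_offset)
{
  return shadow[frame_offset >> 3] != 0;
}
```

After prologue (), offsets 32 to 39 ('b') and 96 to 119 ('a') are
addressable and every other offset of the frame is poisoned; after
epilogue (), nothing is poisoned any more.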

	* Makefile.in (asan.o): Depend on $(EXPR_H) $(OPTABS_H).
	(cfgexpand.o): Depend on asan.h.
	* asan.c: Include expr.h and optabs.h.
	(asan_shadow_set): New variable.
	(asan_shadow_cst, asan_emit_stack_protection): New functions.
	(asan_init_shadow_ptr_types): Initialize also asan_shadow_set.
	* cfgexpand.c: Include asan.h.  Define HOST_WIDE_INT heap vector.
	(partition_stack_vars): If i is large alignment and j small
	alignment or vice versa, break out of the loop instead of continue,
	and put the test earlier.  If flag_asan, break out of the loop
	if for small alignment size is different.
	(struct stack_vars_data): New type.
	(expand_stack_vars): Add DATA argument.  Change PRED type to
	function taking size_t argument instead of tree.  Adjust pred calls.
	Fill DATA in and add needed padding in between variables if -fasan.
	(defer_stack_allocation): Defer everything for flag_asan.
	(stack_protect_decl_phase_1, stack_protect_decl_phase_2): Take
	size_t index into stack_vars array instead of the decl directly.
	(asan_decl_phase_3): New function.
	(expand_used_vars): Return var destruction sequence.  Adjust
	expand_stack_vars calls, add another one for flag_asan.  Call
	asan_emit_stack_protection if expand_stack_vars added anything
	to the vectors.
	(expand_gimple_basic_block): Add disable_tail_calls argument.
	(gimple_expand_cfg): Pass true to it if expand_used_vars returned
	non-NULL.  Emit the sequence returned by expand_used_vars after
	return_label.
	* asan.h (asan_emit_stack_protection): New prototype.
	(asan_shadow_set): New decl.
	(ASAN_RED_ZONE_SIZE, ASAN_STACK_MAGIC_LEFT, ASAN_STACK_MAGIC_MIDDLE,
	ASAN_STACK_MAGIC_RIGHT, ASAN_STACK_FRAME_MAGIC): Define.
	(asan_protect_stack_decl): New inline.
	* toplev.c (process_options): Also disable -fasan on
	!FRAME_GROWS_DOWNWARDS targets.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192540 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.asan |  37 ++++++++++
 gcc/Makefile.in    |   4 +-
 gcc/asan.c         | 193 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 gcc/asan.h         |  31 ++++++++-
 gcc/cfgexpand.c    | 159 +++++++++++++++++++++++++++++++++++++------
 gcc/toplev.c       |   4 +-
 6 files changed, 400 insertions(+), 28 deletions(-)

diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan
index 505bce9..23454f3 100644
--- a/gcc/ChangeLog.asan
+++ b/gcc/ChangeLog.asan
@@ -1,3 +1,40 @@
+2012-10-17  Jakub Jelinek  <jakub@redhat.com>
+
+	* Makefile.in (asan.o): Depend on $(EXPR_H) $(OPTABS_H).
+	(cfgexpand.o): Depend on asan.h.
+	* asan.c: Include expr.h and optabs.h.
+	(asan_shadow_set): New variable.
+	(asan_shadow_cst, asan_emit_stack_protection): New functions.
+	(asan_init_shadow_ptr_types): Initialize also asan_shadow_set.
+	* cfgexpand.c: Include asan.h.  Define HOST_WIDE_INT heap vector.
+	(partition_stack_vars): If i is large alignment and j small
+	alignment or vice versa, break out of the loop instead of continue,
+	and put the test earlier.  If flag_asan, break out of the loop
+	if for small alignment size is different.
+	(struct stack_vars_data): New type.
+	(expand_stack_vars): Add DATA argument.  Change PRED type to
+	function taking size_t argument instead of tree.  Adjust pred calls.
+	Fill DATA in and add needed padding in between variables if -fasan.
+	(defer_stack_allocation): Defer everything for flag_asan.
+	(stack_protect_decl_phase_1, stack_protect_decl_phase_2): Take
+	size_t index into stack_vars array instead of the decl directly.
+	(asan_decl_phase_3): New function.
+	(expand_used_vars): Return var destruction sequence.  Adjust
+	expand_stack_vars calls, add another one for flag_asan.  Call
+	asan_emit_stack_protection if expand_stack_vars added anything
+	to the vectors.
+	(expand_gimple_basic_block): Add disable_tail_calls argument.
+	(gimple_expand_cfg): Pass true to it if expand_used_vars returned
+	non-NULL.  Emit the sequence returned by expand_used_vars after
+	return_label.
+	* asan.h (asan_emit_stack_protection): New prototype.
+	(asan_shadow_set): New decl.
+	(ASAN_RED_ZONE_SIZE, ASAN_STACK_MAGIC_LEFT, ASAN_STACK_MAGIC_MIDDLE,
+	ASAN_STACK_MAGIC_RIGHT, ASAN_STACK_FRAME_MAGIC): Define.
+	(asan_protect_stack_decl): New inline.
+	* toplev.c (process_options): Also disable -fasan on
+	!FRAME_GROWS_DOWNWARDS targets.
+
 2012-10-12  Jakub Jelinek  <jakub@redhat.com>
 
 	* asan.c (build_check_stmt): Rename join_bb variable to else_bb.
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 2ab1ca9..2743e24 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2213,7 +2213,7 @@ stor-layout.o : stor-layout.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
 asan.o : asan.c asan.h $(CONFIG_H) pointer-set.h \
    $(SYSTEM_H) $(TREE_H) $(GIMPLE_H) \
    output.h $(DIAGNOSTIC_H) coretypes.h $(TREE_DUMP_H) $(FLAGS_H) \
-   tree-pretty-print.h $(TARGET_H)
+   tree-pretty-print.h $(TARGET_H) $(EXPR_H) $(OPTABS_H)
 tree-ssa-tail-merge.o: tree-ssa-tail-merge.c \
    $(SYSTEM_H) $(CONFIG_H) coretypes.h $(TM_H) $(BITMAP_H) \
    $(FLAGS_H) $(TM_P_H) $(BASIC_BLOCK_H) \
@@ -3083,7 +3083,7 @@ cfgexpand.o : cfgexpand.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \
    $(DIAGNOSTIC_H) toplev.h $(DIAGNOSTIC_CORE_H) $(BASIC_BLOCK_H) $(FLAGS_H) debug.h $(PARAMS_H) \
    value-prof.h $(TREE_INLINE_H) $(TARGET_H) $(SSAEXPAND_H) $(REGS_H) \
    $(GIMPLE_PRETTY_PRINT_H) $(BITMAP_H) sbitmap.h \
-   $(INSN_ATTR_H) $(CFGLOOP_H)
+   $(INSN_ATTR_H) $(CFGLOOP_H) asan.h
 cfgrtl.o : cfgrtl.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_ERROR_H) \
    $(FLAGS_H) insn-config.h $(BASIC_BLOCK_H) $(REGS_H) hard-reg-set.h \
    $(FUNCTION_H) $(EXCEPT_H) $(TM_P_H) $(INSN_ATTR_H) \
diff --git a/gcc/asan.c b/gcc/asan.c
index 66dc571..fe0e9a8 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -43,6 +43,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "asan.h"
 #include "gimple-pretty-print.h"
 #include "target.h"
+#include "expr.h"
+#include "optabs.h"
 
 /*
  AddressSanitizer finds out-of-bounds and use-after-free bugs 
@@ -79,10 +81,195 @@ along with GCC; see the file COPYING3.  If not see
  to create redzones for stack and global object and poison them.
 */
 
+alias_set_type asan_shadow_set = -1;
+
 /* Pointer types to 1 resp. 2 byte integers in shadow memory.  A separate
    alias set is used for all shadow memory accesses.  */
 static GTY(()) tree shadow_ptr_types[2];
 
+/* Return a CONST_INT representing 4 subsequent shadow memory bytes.  */
+
+static rtx
+asan_shadow_cst (unsigned char shadow_bytes[4])
+{
+  int i;
+  unsigned HOST_WIDE_INT val = 0;
+  gcc_assert (WORDS_BIG_ENDIAN == BYTES_BIG_ENDIAN);
+  for (i = 0; i < 4; i++)
+    val |= (unsigned HOST_WIDE_INT) shadow_bytes[BYTES_BIG_ENDIAN ? 3 - i : i]
+	   << (BITS_PER_UNIT * i);
+  return GEN_INT (trunc_int_for_mode (val, SImode));
+}
+
+/* Insert code to protect stack vars.  The prologue sequence should be emitted
+   directly, epilogue sequence returned.  BASE is the register holding the
+   stack base, which the OFFSETS array offsets are relative to; the OFFSETS
+   array contains pairs of offsets in reverse order, always the end offset
+   of some gap that needs protection followed by starting offset,
+   and DECLS is an array of representative decls for each var partition.
+   LENGTH is the length of the OFFSETS array, DECLS array is LENGTH / 2 - 1
+   elements long (OFFSETS include gap before the first variable as well
+   as gaps after each stack variable).  */
+
+rtx
+asan_emit_stack_protection (rtx base, HOST_WIDE_INT *offsets, tree *decls,
+			    int length)
+{
+  rtx shadow_base, shadow_mem, ret, mem;
+  unsigned char shadow_bytes[4];
+  HOST_WIDE_INT base_offset = offsets[length - 1], offset, prev_offset;
+  HOST_WIDE_INT last_offset, last_size;
+  int l;
+  unsigned char cur_shadow_byte = ASAN_STACK_MAGIC_LEFT;
+  static pretty_printer pp;
+  static bool pp_initialized;
+  const char *buf;
+  size_t len;
+  tree str_cst;
+
+  /* First of all, prepare the description string.  */
+  if (!pp_initialized)
+    {
+      pp_construct (&pp, /* prefix */NULL, /* line-width */0);
+      pp_initialized = true;
+    }
+  pp_clear_output_area (&pp);
+  if (DECL_NAME (current_function_decl))
+    pp_base_tree_identifier (&pp, DECL_NAME (current_function_decl));
+  else
+    pp_string (&pp, "<unknown>");
+  pp_space (&pp);
+  pp_decimal_int (&pp, length / 2 - 1);
+  pp_space (&pp);
+  for (l = length - 2; l; l -= 2)
+    {
+      tree decl = decls[l / 2 - 1];
+      pp_wide_integer (&pp, offsets[l] - base_offset);
+      pp_space (&pp);
+      pp_wide_integer (&pp, offsets[l - 1] - offsets[l]);
+      pp_space (&pp);
+      if (DECL_P (decl) && DECL_NAME (decl))
+	{
+	  pp_decimal_int (&pp, IDENTIFIER_LENGTH (DECL_NAME (decl)));
+	  pp_space (&pp);
+	  pp_base_tree_identifier (&pp, DECL_NAME (decl));
+	}
+      else
+	pp_string (&pp, "9 <unknown>");
+      pp_space (&pp);
+    }
+  buf = pp_base_formatted_text (&pp);
+  len = strlen (buf);
+  str_cst = build_string (len + 1, buf);
+  TREE_TYPE (str_cst)
+    = build_array_type (char_type_node, build_index_type (size_int (len)));
+  TREE_READONLY (str_cst) = 1;
+  TREE_STATIC (str_cst) = 1;
+  str_cst = build1 (ADDR_EXPR, build_pointer_type (char_type_node), str_cst);
+
+  /* Emit the prologue sequence.  */
+  base = expand_binop (Pmode, add_optab, base, GEN_INT (base_offset),
+		       NULL_RTX, 1, OPTAB_DIRECT);
+  mem = gen_rtx_MEM (ptr_mode, base);
+  emit_move_insn (mem, GEN_INT (ASAN_STACK_FRAME_MAGIC));
+  mem = adjust_address (mem, VOIDmode, GET_MODE_SIZE (ptr_mode));
+  emit_move_insn (mem, expand_normal (str_cst));
+  shadow_base = expand_binop (Pmode, lshr_optab, base,
+			      GEN_INT (ASAN_SHADOW_SHIFT),
+			      NULL_RTX, 1, OPTAB_DIRECT);
+  shadow_base = expand_binop (Pmode, add_optab, shadow_base,
+			      GEN_INT (targetm.asan_shadow_offset ()),
+			      NULL_RTX, 1, OPTAB_DIRECT);
+  gcc_assert (asan_shadow_set != -1
+	      && (ASAN_RED_ZONE_SIZE >> ASAN_SHADOW_SHIFT) == 4);
+  shadow_mem = gen_rtx_MEM (SImode, shadow_base);
+  set_mem_alias_set (shadow_mem, asan_shadow_set);
+  prev_offset = base_offset;
+  for (l = length; l; l -= 2)
+    {
+      if (l == 2)
+	cur_shadow_byte = ASAN_STACK_MAGIC_RIGHT;
+      offset = offsets[l - 1];
+      if ((offset - base_offset) & (ASAN_RED_ZONE_SIZE - 1))
+	{
+	  int i;
+	  HOST_WIDE_INT aoff
+	    = base_offset + ((offset - base_offset)
+			     & ~(ASAN_RED_ZONE_SIZE - HOST_WIDE_INT_1));
+	  shadow_mem = adjust_address (shadow_mem, VOIDmode,
+				       (aoff - prev_offset)
+				       >> ASAN_SHADOW_SHIFT);
+	  prev_offset = aoff;
+	  for (i = 0; i < 4; i++, aoff += (1 << ASAN_SHADOW_SHIFT))
+	    if (aoff < offset)
+	      {
+		if (aoff < offset - (1 << ASAN_SHADOW_SHIFT) + 1)
+		  shadow_bytes[i] = 0;
+		else
+		  shadow_bytes[i] = offset - aoff;
+	      }
+	    else
+	      shadow_bytes[i] = ASAN_STACK_MAGIC_PARTIAL;
+	  emit_move_insn (shadow_mem, asan_shadow_cst (shadow_bytes));
+	  offset = aoff;
+	}
+      while (offset <= offsets[l - 2] - ASAN_RED_ZONE_SIZE)
+	{
+	  shadow_mem = adjust_address (shadow_mem, VOIDmode,
+				       (offset - prev_offset)
+				       >> ASAN_SHADOW_SHIFT);
+	  prev_offset = offset;
+	  memset (shadow_bytes, cur_shadow_byte, 4);
+	  emit_move_insn (shadow_mem, asan_shadow_cst (shadow_bytes));
+	  offset += ASAN_RED_ZONE_SIZE;
+	}
+      cur_shadow_byte = ASAN_STACK_MAGIC_MIDDLE;
+    }
+  do_pending_stack_adjust ();
+
+  /* Construct epilogue sequence.  */
+  start_sequence ();
+
+  shadow_mem = gen_rtx_MEM (BLKmode, shadow_base);
+  set_mem_alias_set (shadow_mem, asan_shadow_set);
+  prev_offset = base_offset;
+  last_offset = base_offset;
+  last_size = 0;
+  for (l = length; l; l -= 2)
+    {
+      offset = base_offset + ((offsets[l - 1] - base_offset)
+			     & ~(ASAN_RED_ZONE_SIZE - HOST_WIDE_INT_1));
+      if (last_offset + last_size != offset)
+	{
+	  shadow_mem = adjust_address (shadow_mem, VOIDmode,
+				       (last_offset - prev_offset)
+				       >> ASAN_SHADOW_SHIFT);
+	  prev_offset = last_offset;
+	  clear_storage (shadow_mem, GEN_INT (last_size >> ASAN_SHADOW_SHIFT),
+			 BLOCK_OP_NORMAL);
+	  last_offset = offset;
+	  last_size = 0;
+	}
+      last_size += base_offset + ((offsets[l - 2] - base_offset)
+				  & ~(ASAN_RED_ZONE_SIZE - HOST_WIDE_INT_1))
+		   - offset;
+    }
+  if (last_size)
+    {
+      shadow_mem = adjust_address (shadow_mem, VOIDmode,
+				   (last_offset - prev_offset)
+				   >> ASAN_SHADOW_SHIFT);
+      clear_storage (shadow_mem, GEN_INT (last_size >> ASAN_SHADOW_SHIFT),
+		     BLOCK_OP_NORMAL);
+    }
+
+  do_pending_stack_adjust ();
+
+  ret = get_insns ();
+  end_sequence ();
+  return ret;
+}
+
 /* Construct a function tree for __asan_report_{load,store}{1,2,4,8,16}.
    IS_STORE is either 1 (for a store) or 0 (for a load).
    SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
@@ -401,12 +588,12 @@ asan_finish_file (void)
 static void
 asan_init_shadow_ptr_types (void)
 {
-  alias_set_type set = new_alias_set ();
+  asan_shadow_set = new_alias_set ();
   shadow_ptr_types[0] = build_distinct_type_copy (unsigned_char_type_node);
-  TYPE_ALIAS_SET (shadow_ptr_types[0]) = set;
+  TYPE_ALIAS_SET (shadow_ptr_types[0]) = asan_shadow_set;
   shadow_ptr_types[0] = build_pointer_type (shadow_ptr_types[0]);
   shadow_ptr_types[1] = build_distinct_type_copy (short_unsigned_type_node);
-  TYPE_ALIAS_SET (shadow_ptr_types[1]) = set;
+  TYPE_ALIAS_SET (shadow_ptr_types[1]) = asan_shadow_set;
   shadow_ptr_types[1] = build_pointer_type (shadow_ptr_types[1]);
 }
 
diff --git a/gcc/asan.h b/gcc/asan.h
index 0d9ab8b..6f0edbf 100644
--- a/gcc/asan.h
+++ b/gcc/asan.h
@@ -21,10 +21,39 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef TREE_ASAN
 #define TREE_ASAN
 
-extern void asan_finish_file(void);
+extern void asan_finish_file (void);
+extern rtx asan_emit_stack_protection (rtx, HOST_WIDE_INT *, tree *, int);
+
+/* Alias set for accessing the shadow memory.  */
+extern alias_set_type asan_shadow_set;
 
 /* Shadow memory is found at
    (address >> ASAN_SHADOW_SHIFT) + targetm.asan_shadow_offset ().  */
 #define ASAN_SHADOW_SHIFT	3
 
+/* Red zone size, stack and global variables are padded by ASAN_RED_ZONE_SIZE
+   up to 2 * ASAN_RED_ZONE_SIZE - 1 bytes.  */
+#define ASAN_RED_ZONE_SIZE	32
+
+/* Shadow memory values for stack protection.  Left is below protected vars,
+   the first pointer in stack corresponding to that offset contains
+   ASAN_STACK_FRAME_MAGIC word, the second pointer to a string describing
+   the frame.  Middle is for padding in between variables, right is
+   above the last protected variable and partial immediately after variables
+   up to ASAN_RED_ZONE_SIZE alignment.  */
+#define ASAN_STACK_MAGIC_LEFT		0xf1
+#define ASAN_STACK_MAGIC_MIDDLE		0xf2
+#define ASAN_STACK_MAGIC_RIGHT		0xf3
+#define ASAN_STACK_MAGIC_PARTIAL	0xf4
+
+#define ASAN_STACK_FRAME_MAGIC	0x41b58ab3
+
+/* Return true if DECL should be guarded on the stack.  */
+
+static inline bool
+asan_protect_stack_decl (tree decl)
+{
+  return DECL_P (decl) && !DECL_ARTIFICIAL (decl);
+}
+
 #endif /* TREE_ASAN */
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index e501b4b..67cf902 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -47,6 +47,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfgloop.h"
 #include "regs.h" /* For reg_renumber.  */
 #include "insn-attr.h" /* For INSN_SCHEDULING.  */
+#include "asan.h"
 
 /* This variable holds information helping the rewriting of SSA trees
    into RTL.  */
@@ -736,6 +737,7 @@ partition_stack_vars (void)
     {
       size_t i = stack_vars_sorted[si];
       unsigned int ialign = stack_vars[i].alignb;
+      HOST_WIDE_INT isize = stack_vars[i].size;
 
       /* Ignore objects that aren't partition representatives. If we
          see a var that is not a partition representative, it must
@@ -747,19 +749,28 @@ partition_stack_vars (void)
 	{
 	  size_t j = stack_vars_sorted[sj];
 	  unsigned int jalign = stack_vars[j].alignb;
+	  HOST_WIDE_INT jsize = stack_vars[j].size;
 
 	  /* Ignore objects that aren't partition representatives.  */
 	  if (stack_vars[j].representative != j)
 	    continue;
 
-	  /* Ignore conflicting objects.  */
-	  if (stack_var_conflict_p (i, j))
-	    continue;
-
 	  /* Do not mix objects of "small" (supported) alignment
 	     and "large" (unsupported) alignment.  */
 	  if ((ialign * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT)
 	      != (jalign * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT))
+	    break;
+
+	  /* For Address Sanitizer do not mix objects with different
+	     sizes, as the shorter vars wouldn't be adequately protected.
+	     Don't do that for "large" (unsupported) alignment objects,
+	     those aren't protected anyway.  */
+	  if (flag_asan && isize != jsize
+	      && ialign * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT)
+	    break;
+
+	  /* Ignore conflicting objects.  */
+	  if (stack_var_conflict_p (i, j))
 	    continue;
 
 	  /* UNION the objects, placing J at OFFSET.  */
@@ -837,12 +848,26 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
   set_rtl (decl, x);
 }
 
+DEF_VEC_I(HOST_WIDE_INT);
+DEF_VEC_ALLOC_I(HOST_WIDE_INT,heap);
+
+struct stack_vars_data
+{
+  /* Vector of offset pairs, always end of some padding followed
+     by start of the padding that needs Address Sanitizer protection.
+     The vector is in reverse order; the highest offset pairs come first.  */
+  VEC(HOST_WIDE_INT, heap) *asan_vec;
+
+  /* Vector of partition representative decls in between the paddings.  */
+  VEC(tree, heap) *asan_decl_vec;
+};
+
 /* A subroutine of expand_used_vars.  Give each partition representative
    a unique location within the stack frame.  Update each partition member
    with that location.  */
 
 static void
-expand_stack_vars (bool (*pred) (tree))
+expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
 {
   size_t si, i, j, n = stack_vars_num;
   HOST_WIDE_INT large_size = 0, large_alloc = 0;
@@ -913,13 +938,45 @@ expand_stack_vars (bool (*pred) (tree))
 
       /* Check the predicate to see whether this variable should be
 	 allocated in this pass.  */
-      if (pred && !pred (decl))
+      if (pred && !pred (i))
 	continue;
 
       alignb = stack_vars[i].alignb;
       if (alignb * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT)
 	{
-	  offset = alloc_stack_frame_space (stack_vars[i].size, alignb);
+	  if (flag_asan && pred)
+	    {
+	      HOST_WIDE_INT prev_offset = frame_offset;
+	      tree repr_decl = NULL_TREE;
+
+	      offset
+		= alloc_stack_frame_space (stack_vars[i].size
+					   + ASAN_RED_ZONE_SIZE,
+					   MAX (alignb, ASAN_RED_ZONE_SIZE));
+	      VEC_safe_push (HOST_WIDE_INT, heap, data->asan_vec,
+			     prev_offset);
+	      VEC_safe_push (HOST_WIDE_INT, heap, data->asan_vec,
+			     offset + stack_vars[i].size);
+	      /* Find best representative of the partition.
+		 Prefer those with DECL_NAME, even better
+		 satisfying asan_protect_stack_decl predicate.  */
+	      for (j = i; j != EOC; j = stack_vars[j].next)
+		if (asan_protect_stack_decl (stack_vars[j].decl)
+		    && DECL_NAME (stack_vars[j].decl))
+		  {
+		    repr_decl = stack_vars[j].decl;
+		    break;
+		  }
+		else if (repr_decl == NULL_TREE
+			 && DECL_P (stack_vars[j].decl)
+			 && DECL_NAME (stack_vars[j].decl))
+		  repr_decl = stack_vars[j].decl;
+	      if (repr_decl == NULL_TREE)
+		repr_decl = stack_vars[i].decl;
+	      VEC_safe_push (tree, heap, data->asan_decl_vec, repr_decl);
+	    }
+	  else
+	    offset = alloc_stack_frame_space (stack_vars[i].size, alignb);
 	  base = virtual_stack_vars_rtx;
 	  base_align = crtl->max_used_stack_slot_alignment;
 	}
@@ -1057,8 +1114,9 @@ static bool
 defer_stack_allocation (tree var, bool toplevel)
 {
   /* If stack protection is enabled, *all* stack variables must be deferred,
-     so that we can re-order the strings to the top of the frame.  */
-  if (flag_stack_protect)
+     so that we can re-order the strings to the top of the frame.
+     Similarly for Address Sanitizer.  */
+  if (flag_stack_protect || flag_asan)
     return true;
 
   /* We handle "large" alignment via dynamic allocation.  We want to handle
@@ -1329,15 +1387,31 @@ stack_protect_decl_phase (tree decl)
    as callbacks for expand_stack_vars.  */
 
 static bool
-stack_protect_decl_phase_1 (tree decl)
+stack_protect_decl_phase_1 (size_t i)
 {
-  return stack_protect_decl_phase (decl) == 1;
+  return stack_protect_decl_phase (stack_vars[i].decl) == 1;
 }
 
 static bool
-stack_protect_decl_phase_2 (tree decl)
+stack_protect_decl_phase_2 (size_t i)
 {
-  return stack_protect_decl_phase (decl) == 2;
+  return stack_protect_decl_phase (stack_vars[i].decl) == 2;
+}
+
+/* And a helper function that checks for the asan phase (with stack protector
+   it is phase 3).  This is used as callback for expand_stack_vars.
+   Returns true if any of the vars in the partition need to be protected.  */
+
+static bool
+asan_decl_phase_3 (size_t i)
+{
+  while (i != EOC)
+    {
+      if (asan_protect_stack_decl (stack_vars[i].decl))
+	return true;
+      i = stack_vars[i].next;
+    }
+  return false;
 }
 
 /* Ensure that variables in different stack protection phases conflict
@@ -1448,11 +1522,12 @@ estimated_stack_frame_size (struct cgraph_node *node)
 
 /* Expand all variables used in the function.  */
 
-static void
+static rtx
 expand_used_vars (void)
 {
   tree var, outer_block = DECL_INITIAL (current_function_decl);
   VEC(tree,heap) *maybe_local_decls = NULL;
+  rtx var_end_seq = NULL_RTX;
   struct pointer_map_t *ssa_name_decls;
   unsigned i;
   unsigned len;
@@ -1603,6 +1678,11 @@ expand_used_vars (void)
   /* Assign rtl to each variable based on these partitions.  */
   if (stack_vars_num > 0)
     {
+      struct stack_vars_data data;
+
+      data.asan_vec = NULL;
+      data.asan_decl_vec = NULL;
+
       /* Reorder decls to be protected by iterating over the variables
 	 array multiple times, and allocating out of each phase in turn.  */
       /* ??? We could probably integrate this into the qsort we did
@@ -1611,14 +1691,41 @@ expand_used_vars (void)
       if (has_protected_decls)
 	{
 	  /* Phase 1 contains only character arrays.  */
-	  expand_stack_vars (stack_protect_decl_phase_1);
+	  expand_stack_vars (stack_protect_decl_phase_1, &data);
 
 	  /* Phase 2 contains other kinds of arrays.  */
 	  if (flag_stack_protect == 2)
-	    expand_stack_vars (stack_protect_decl_phase_2);
+	    expand_stack_vars (stack_protect_decl_phase_2, &data);
+	}
+
+      if (flag_asan)
+	/* Phase 3, any partitions that need asan protection
+	   in addition to phase 1 and 2.  */
+	expand_stack_vars (asan_decl_phase_3, &data);
+
+      if (!VEC_empty (HOST_WIDE_INT, data.asan_vec))
+	{
+	  HOST_WIDE_INT prev_offset = frame_offset;
+	  HOST_WIDE_INT offset
+	    = alloc_stack_frame_space (ASAN_RED_ZONE_SIZE,
+				       ASAN_RED_ZONE_SIZE);
+	  VEC_safe_push (HOST_WIDE_INT, heap, data.asan_vec, prev_offset);
+	  VEC_safe_push (HOST_WIDE_INT, heap, data.asan_vec, offset);
+
+	  var_end_seq
+	    = asan_emit_stack_protection (virtual_stack_vars_rtx,
+					  VEC_address (HOST_WIDE_INT,
+						       data.asan_vec),
+					  VEC_address (tree,
+						       data.asan_decl_vec),
+					  VEC_length (HOST_WIDE_INT,
+						      data.asan_vec));
 	}
 
-      expand_stack_vars (NULL);
+      expand_stack_vars (NULL, &data);
+
+      VEC_free (HOST_WIDE_INT, heap, data.asan_vec);
+      VEC_free (tree, heap, data.asan_decl_vec);
     }
 
   fini_vars_expansion ();
@@ -1645,6 +1752,8 @@ expand_used_vars (void)
 	frame_offset += align - 1;
       frame_offset &= -align;
     }
+
+  return var_end_seq;
 }
 
 
@@ -3662,7 +3771,7 @@ expand_debug_locations (void)
 /* Expand basic block BB from GIMPLE trees to RTL.  */
 
 static basic_block
-expand_gimple_basic_block (basic_block bb)
+expand_gimple_basic_block (basic_block bb, bool disable_tail_calls)
 {
   gimple_stmt_iterator gsi;
   gimple_seq stmts;
@@ -3950,6 +4059,11 @@ expand_gimple_basic_block (basic_block bb)
 	}
       else
 	{
+	  if (is_gimple_call (stmt)
+	      && gimple_call_tail_p (stmt)
+	      && disable_tail_calls)
+	    gimple_call_set_tail (stmt, false);
+
 	  if (is_gimple_call (stmt) && gimple_call_tail_p (stmt))
 	    {
 	      bool can_fallthru;
@@ -4309,7 +4423,7 @@ gimple_expand_cfg (void)
   sbitmap blocks;
   edge_iterator ei;
   edge e;
-  rtx var_seq;
+  rtx var_seq, var_ret_seq;
   unsigned i;
 
   timevar_push (TV_OUT_OF_SSA);
@@ -4369,7 +4483,7 @@ gimple_expand_cfg (void)
   timevar_push (TV_VAR_EXPAND);
   start_sequence ();
 
-  expand_used_vars ();
+  var_ret_seq = expand_used_vars ();
 
   var_seq = get_insns ();
   end_sequence ();
@@ -4495,7 +4609,7 @@ gimple_expand_cfg (void)
 
   lab_rtx_for_bb = pointer_map_create ();
   FOR_BB_BETWEEN (bb, init_block->next_bb, EXIT_BLOCK_PTR, next_bb)
-    bb = expand_gimple_basic_block (bb);
+    bb = expand_gimple_basic_block (bb, var_ret_seq != NULL_RTX);
 
   if (MAY_HAVE_DEBUG_INSNS)
     expand_debug_locations ();
@@ -4523,6 +4637,9 @@ gimple_expand_cfg (void)
   construct_exit_block ();
   insn_locations_finalize ();
 
+  if (var_ret_seq)
+    emit_insn_after (var_ret_seq, return_label);
+
   /* Zap the tree EH table.  */
   set_eh_throw_stmt_table (cfun, NULL);
 
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 68849f5..0fa8ce3 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1542,7 +1542,9 @@ process_options (void)
     }
 
   /* Address Sanitizer needs porting to each target architecture.  */
-  if (flag_asan && targetm.asan_shadow_offset == NULL)
+  if (flag_asan
+      && (targetm.asan_shadow_offset == NULL
+	  || !FRAME_GROWS_DOWNWARD))
     {
       warning (0, "-fasan not supported for this target");
       flag_asan = 0;
-- 
1.7.11.7
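
Reader's note on the shadow bytes written for a variable whose end is not
aligned to ASAN_RED_ZONE_SIZE: the inner loop of asan_emit_stack_protection
can be sketched in isolation as below. This is a simplified illustration, not
the code from the patch. It assumes a frame base offset of 0 and non-negative
offsets, and the names partial_shadow_bytes, RED_ZONE_SIZE, SHADOW_SHIFT and
MAGIC_PARTIAL are hypothetical stand-ins for the patch's macros.

```c
#include <assert.h>

#define RED_ZONE_SIZE 32    /* stand-in for ASAN_RED_ZONE_SIZE */
#define SHADOW_SHIFT 3      /* stand-in for ASAN_SHADOW_SHIFT */
#define MAGIC_PARTIAL 0xf4  /* stand-in for ASAN_STACK_MAGIC_PARTIAL */

/* Hypothetical helper.  END is the offset of the first byte past a
   variable, relative to a 32-byte-aligned base, and must not itself be
   32-byte aligned.  Fill SHADOW with the four shadow bytes covering the
   32-byte block that contains END.  */
static void
partial_shadow_bytes (long end, unsigned char shadow[4])
{
  long granule = 1L << SHADOW_SHIFT;
  long aoff = end & ~(long) (RED_ZONE_SIZE - 1);
  int i;

  assert (end >= 0 && (end & (RED_ZONE_SIZE - 1)) != 0);
  for (i = 0; i < 4; i++, aoff += granule)
    {
      if (aoff >= end)
        shadow[i] = MAGIC_PARTIAL;   /* granule lies entirely past the var */
      else if (aoff + granule <= end)
        shadow[i] = 0;               /* granule is fully addressable */
      else
        shadow[i] = (unsigned char) (end - aoff);  /* first N bytes valid */
    }
}
```

For a variable ending at offset 20, the block starting at 0 gets the shadow
bytes 0, 0, 4, 0xf4: two fully valid granules, one granule with four valid
bytes, and one poisoned granule.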

* [PATCH 02/13] Rename tree-asan.[ch] to asan.[ch]
  2012-11-01 19:53 [PATCH 00/13] Request to merge Address Sanitizer in dodji
                   ` (3 preceding siblings ...)
  2012-11-01 19:53 ` [PATCH 06/13] Implement protection of stack variables dodji
@ 2012-11-01 19:53 ` dodji
  2012-11-01 21:54   ` Joseph S. Myers
  2012-11-01 19:53 ` [PATCH 01/13] Initial import of asan from the Google branch dodji
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 80+ messages in thread
From: dodji @ 2012-11-01 19:53 UTC (permalink / raw)
  To: gcc-patches; +Cc: dnovillo, jakub, wmi, davidxl, konstantin.s.serebryany

From: dnovillo <dnovillo@138bc75d-0d04-0410-961f-82ee72b054a4>

Following a discussion we had on this list, this patch renames the
files tree-asan.* to asan.*.

    	* asan.c: Rename from tree-asan.c.
    	Update all users.
    	* asan.h: Rename from tree-asan.h
    	Update all users.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192360 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.asan |   7 +
 gcc/Makefile.in    |   4 +-
 gcc/asan.c         | 403 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 gcc/asan.h         |  26 ++++
 gcc/toplev.c       |   2 +-
 gcc/tree-asan.c    | 403 -----------------------------------------------------
 gcc/tree-asan.h    |  26 ----
 7 files changed, 439 insertions(+), 432 deletions(-)
 create mode 100644 gcc/asan.c
 create mode 100644 gcc/asan.h
 delete mode 100644 gcc/tree-asan.c
 delete mode 100644 gcc/tree-asan.h
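
Reader's note: the run-time check described in the header comment of asan.c
can be sketched in plain C as below. This is only an illustration under
stated assumptions, not the GIMPLE that build_check_stmt emits. The helper
names shadow_address and access_is_bad are hypothetical, the shadow byte is
assumed to be read as a signed char, and 16-byte accesses (which read a
2-byte shadow value) are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SHADOW_SHIFT 3  /* same value as asan_scale in asan.c */

/* Hypothetical helper: map an application address to the address of its
   shadow byte, i.e. (X >> 3) + Offset.  */
static uintptr_t
shadow_address (uintptr_t addr, uintptr_t shadow_offset)
{
  return (addr >> SHADOW_SHIFT) + shadow_offset;
}

/* Hypothetical helper: return true if an access of SIZE bytes at ADDR
   should be reported, given the shadow byte SHADOW_VALUE for ADDR.  */
static bool
access_is_bad (uintptr_t addr, int size, signed char shadow_value)
{
  assert (size == 1 || size == 2 || size == 4 || size == 8);
  if (shadow_value == 0)
    return false;  /* the whole 8-byte granule is addressable */
  if (size == 8)
    return true;   /* any non-zero shadow byte poisons a full granule */
  /* Slow path: only the first SHADOW_VALUE bytes of the granule are
     valid, so the last byte touched must lie below that limit.  */
  return (int) (addr & 7) + size - 1 >= shadow_value;
}
```

With a shadow byte of 4, a 4-byte load at an 8-byte-aligned address passes,
while the same load two bytes further on is reported. A negative shadow byte
(a poison marker read as signed char) fails every access.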

diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan
index 40299e2..c196bfe 100644
--- a/gcc/ChangeLog.asan
+++ b/gcc/ChangeLog.asan
@@ -1,3 +1,10 @@
+2012-10-10  Diego Novillo  <dnovillo@google.com>
+
+	* asan.c: Rename from tree-asan.c.
+	Update all users.
+	* asan.h: Rename from tree-asan.h
+	Update all users.
+
 2012-10-10  Wei Mi <wmi@google.com>
 
 	* Makefile.in: Add tree-asan.c.
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index e8c4a19..a9da161 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1350,7 +1350,7 @@ OBJS = \
 	tracer.o \
 	trans-mem.o \
 	tree-affine.o \
-	tree-asan.o \
+	asan.o \
 	tree-call-cdce.o \
 	tree-cfg.o \
 	tree-cfgcleanup.o \
@@ -2210,7 +2210,7 @@ stor-layout.o : stor-layout.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
    $(TREE_H) $(PARAMS_H) $(FLAGS_H) $(FUNCTION_H) $(EXPR_H) $(RTL_H) \
    $(GGC_H) $(TM_P_H) $(TARGET_H) langhooks.h $(REGS_H) gt-stor-layout.h \
    $(DIAGNOSTIC_CORE_H) $(CGRAPH_H) $(TREE_INLINE_H) $(TREE_DUMP_H) $(GIMPLE_H)
-tree-asan.o : tree-asan.c tree-asan.h $(CONFIG_H) pointer-set.h \
+asan.o : asan.c asan.h $(CONFIG_H) pointer-set.h \
    $(SYSTEM_H) $(TREE_H) $(GIMPLE_H) \
    output.h $(DIAGNOSTIC_H) coretypes.h $(TREE_DUMP_H) $(FLAGS_H) \
    tree-pretty-print.h
diff --git a/gcc/asan.c b/gcc/asan.c
new file mode 100644
index 0000000..a6ceb57
--- /dev/null
+++ b/gcc/asan.c
@@ -0,0 +1,403 @@
+/* AddressSanitizer, a fast memory error detector.
+   Copyright (C) 2011 Free Software Foundation, Inc.
+   Contributed by Kostya Serebryany <kcc@google.com>
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "tree.h"
+#include "tm_p.h"
+#include "basic-block.h"
+#include "flags.h"
+#include "function.h"
+#include "tree-inline.h"
+#include "gimple.h"
+#include "tree-iterator.h"
+#include "tree-flow.h"
+#include "tree-dump.h"
+#include "tree-pass.h"
+#include "diagnostic.h"
+#include "demangle.h"
+#include "langhooks.h"
+#include "ggc.h"
+#include "cgraph.h"
+#include "gimple.h"
+#include "asan.h"
+#include "gimple-pretty-print.h"
+
+/*
+ AddressSanitizer finds out-of-bounds and use-after-free bugs 
+ with <2x slowdown on average.
+
+ The tool consists of two parts:
+ instrumentation module (this file) and a run-time library.
+ The instrumentation module adds a run-time check before every memory insn.
+   For an 8- or 16-byte load accessing address X:
+     ShadowAddr = (X >> 3) + Offset
+     ShadowValue = *(char*)ShadowAddr;  // *(short*) for 16-byte access.
+     if (ShadowValue)
+       __asan_report_load8(X);
+   For a load of N bytes (N=1, 2 or 4) from address X:
+     ShadowAddr = (X >> 3) + Offset
+     ShadowValue = *(char*)ShadowAddr;
+     if (ShadowValue)
+       if ((X & 7) + N - 1 >= ShadowValue)
+         __asan_report_loadN(X);
+ Stores are instrumented similarly, but using __asan_report_storeN functions.
+ A call to __asan_init() is inserted into the list of module CTORs.
+
+ The run-time library redefines malloc (so that redzones are inserted around
+ the allocated memory) and free (so that reuse of freed memory is delayed),
+ provides __asan_report* and __asan_init functions.
+
+ Read more:
+ http://code.google.com/p/address-sanitizer/wiki/AddressSanitizerAlgorithm
+
+ Future work:
+ The current implementation supports only detection of out-of-bounds and
+ use-after-free bugs in heap.
+ In order to support out-of-bounds for stack and globals we will need
+ to create redzones for stack and global object and poison them.
+*/
+
+/* The shadow address is computed as (X>>asan_scale) + (1<<asan_offset_log).
+ We may want to add command line flags to change these values.  */
+
+static const int asan_scale = 3;
+static const int asan_offset_log_32 = 29;
+static const int asan_offset_log_64 = 44;
+static int asan_offset_log;
+
+
+/* Construct a function tree for __asan_report_{load,store}{1,2,4,8,16}.
+   IS_STORE is either 1 (for a store) or 0 (for a load).
+   SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
+
+static tree
+report_error_func (int is_store, int size_in_bytes)
+{
+  tree fn_type;
+  tree def;
+  char name[100];
+
+  sprintf (name, "__asan_report_%s%d",
+           is_store ? "store" : "load", size_in_bytes);
+  fn_type = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE);
+  def = build_fn_decl (name, fn_type);
+  TREE_NOTHROW (def) = 1;
+  TREE_THIS_VOLATILE (def) = 1;  /* Attribute noreturn. Surprise!  */
+  DECL_ATTRIBUTES (def) = tree_cons (get_identifier ("leaf"), 
+                                     NULL, DECL_ATTRIBUTES (def));
+  DECL_ASSEMBLER_NAME (def);
+  return def;
+}
+
+/* Construct a function tree for __asan_init().  */
+
+static tree
+asan_init_func (void)
+{
+  tree fn_type;
+  tree def;
+
+  fn_type = build_function_type_list (void_type_node, NULL_TREE);
+  def = build_fn_decl ("__asan_init", fn_type);
+  TREE_NOTHROW (def) = 1;
+  DECL_ASSEMBLER_NAME (def);
+  return def;
+}
+
+
+/* Instrument the memory access instruction BASE.
+   Insert new statements before ITER.
+   LOCATION is source code location.
+   IS_STORE is either 1 (for a store) or 0 (for a load).
+   SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
+
+static void
+build_check_stmt (tree base,
+                  gimple_stmt_iterator *iter,
+                  location_t location, int is_store, int size_in_bytes)
+{
+  gimple_stmt_iterator gsi;
+  basic_block cond_bb, then_bb, join_bb;
+  edge e;
+  tree cond, t, u;
+  tree base_addr;
+  tree shadow_value;
+  gimple g;
+  gimple_seq seq, stmts;
+  tree shadow_type = size_in_bytes == 16 ?
+      short_integer_type_node : char_type_node;
+  tree shadow_ptr_type = build_pointer_type (shadow_type);
+  tree uintptr_type = lang_hooks.types.type_for_mode (ptr_mode,
+                                                      /*unsignedp=*/true);
+
+  /* We first need to split the current basic block, and start altering
+     the CFG.  This allows us to insert the statements we're about to
+     construct into the right basic blocks.  */
+
+  cond_bb = gimple_bb (gsi_stmt (*iter));
+  gsi = *iter;
+  gsi_prev (&gsi);
+  if (!gsi_end_p (gsi))
+    e = split_block (cond_bb, gsi_stmt (gsi));
+  else
+    e = split_block_after_labels (cond_bb);
+  cond_bb = e->src;
+  join_bb = e->dest;
+
+  /* A recap at this point: join_bb is the basic block at whose head
+     is the gimple statement for which this check expression is being
+     built.  cond_bb is the (possibly new, synthetic) basic block the
+     end of which will contain the shadow-memory check, and a
+     conditional that jumps to the error-reporting code or, much more
+     likely, over to join_bb.  */
+
+  /* Create the bb that contains the crash block.  */
+  then_bb = create_empty_bb (cond_bb);
+  make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE);
+  make_single_succ_edge (then_bb, join_bb, EDGE_FALLTHRU);
+
+  /* Mark the pseudo-fallthrough edge from cond_bb to join_bb.  */
+  e = find_edge (cond_bb, join_bb);
+  e->flags = EDGE_FALSE_VALUE;
+  e->count = cond_bb->count;
+  e->probability = REG_BR_PROB_BASE;
+
+  /* Update dominance info.  Note that join_bb's data was
+     updated by split_block.  */
+  if (dom_info_available_p (CDI_DOMINATORS))
+    {
+      set_immediate_dominator (CDI_DOMINATORS, then_bb, cond_bb);
+      set_immediate_dominator (CDI_DOMINATORS, join_bb, cond_bb);
+    }
+
+  base_addr = create_tmp_reg (uintptr_type, "__asan_base_addr");
+
+  seq = NULL; 
+  t = fold_convert_loc (location, uintptr_type,
+                        unshare_expr (base));
+  t = force_gimple_operand (t, &stmts, false, NULL_TREE);
+  gimple_seq_add_seq (&seq, stmts);
+  g = gimple_build_assign (base_addr, t);
+  gimple_set_location (g, location);
+  gimple_seq_add_stmt (&seq, g);
+
+  /* Build (base_addr >> asan_scale) + (1 << asan_offset_log).  */
+
+  t = build2 (RSHIFT_EXPR, uintptr_type, base_addr,
+              build_int_cst (uintptr_type, asan_scale));
+  t = build2 (PLUS_EXPR, uintptr_type, t,
+              build2 (LSHIFT_EXPR, uintptr_type,
+                      build_int_cst (uintptr_type, 1),
+                      build_int_cst (uintptr_type, asan_offset_log)
+                     ));
+  t = build1 (INDIRECT_REF, shadow_type,
+              build1 (VIEW_CONVERT_EXPR, shadow_ptr_type, t));
+  t = force_gimple_operand (t, &stmts, false, NULL_TREE);
+  gimple_seq_add_seq (&seq, stmts);
+  shadow_value = create_tmp_reg (shadow_type, "__asan_shadow");
+  g = gimple_build_assign (shadow_value, t);
+  gimple_set_location (g, location);
+  gimple_seq_add_stmt (&seq, g);
+  t = build2 (NE_EXPR, boolean_type_node, shadow_value,
+              build_int_cst (shadow_type, 0));
+  if (size_in_bytes < 8)
+    {
+
+      /* Slow path for 1-, 2- and 4- byte accesses.
+         Build ((base_addr & 7) + (size_in_bytes - 1)) >= shadow_value.  */
+
+      u = build2 (BIT_AND_EXPR, uintptr_type,
+                  base_addr,
+                  build_int_cst (uintptr_type, 7));
+      u = build1 (CONVERT_EXPR, shadow_type, u);
+      u = build2 (PLUS_EXPR, shadow_type, u,
+                  build_int_cst (shadow_type, size_in_bytes - 1));
+      u = build2 (GE_EXPR, boolean_type_node, u, shadow_value);
+    }
+  else
+      u = build_int_cst (boolean_type_node, 1);
+  t = build2 (TRUTH_AND_EXPR, boolean_type_node, t, u);
+  t = force_gimple_operand (t, &stmts, false, NULL_TREE);
+  gimple_seq_add_seq (&seq, stmts);
+  cond = create_tmp_reg (boolean_type_node, "__asan_crash_cond");
+  g = gimple_build_assign  (cond, t);
+  gimple_set_location (g, location);
+  gimple_seq_add_stmt (&seq, g);
+  g = gimple_build_cond (NE_EXPR, cond, boolean_false_node, NULL_TREE,
+                         NULL_TREE);
+  gimple_set_location (g, location);
+  gimple_seq_add_stmt (&seq, g);
+
+  /* Generate call to the run-time library (e.g. __asan_report_load8).  */
+
+  gsi = gsi_last_bb (cond_bb);
+  gsi_insert_seq_after (&gsi, seq, GSI_CONTINUE_LINKING);
+  seq = NULL; 
+  g = gimple_build_call (report_error_func (is_store, size_in_bytes),
+                         1, base_addr);
+  gimple_seq_add_stmt (&seq, g);
+
+  /* Insert the check code in the THEN block.  */
+
+  gsi = gsi_start_bb (then_bb);
+  gsi_insert_seq_after (&gsi, seq, GSI_CONTINUE_LINKING);
+
+  *iter = gsi_start_bb (join_bb);
+}
+
+/* If T represents a memory access, add instrumentation code before ITER.
+   LOCATION is source code location.
+   IS_STORE is either 1 (for a store) or 0 (for a load).  */
+
+static void
+instrument_derefs (gimple_stmt_iterator *iter, tree t,
+                  location_t location, int is_store)
+{
+  tree type, base;
+  int size_in_bytes;
+
+  type = TREE_TYPE (t);
+  if (type == error_mark_node)
+    return;
+  switch (TREE_CODE (t))
+    {
+    case ARRAY_REF:
+    case COMPONENT_REF:
+    case INDIRECT_REF:
+    case MEM_REF:
+      break;
+    default:
+      return;
+    }
+  size_in_bytes = tree_low_cst (TYPE_SIZE (type), 0) / BITS_PER_UNIT;
+  if (size_in_bytes != 1 && size_in_bytes != 2 &&
+      size_in_bytes != 4 && size_in_bytes != 8 && size_in_bytes != 16)
+      return;
+  {
+    /* For now just avoid instrumenting bit field accesses.
+     Fixing it is doable, but expected to be messy.  */
+
+    HOST_WIDE_INT bitsize, bitpos;
+    tree offset;
+    enum machine_mode mode;
+    int volatilep = 0, unsignedp = 0;
+    get_inner_reference (t, &bitsize, &bitpos, &offset,
+                         &mode, &unsignedp, &volatilep, false);
+    if (bitpos != 0 || bitsize != size_in_bytes * BITS_PER_UNIT)
+        return;
+  }
+
+  base = build_addr (t, current_function_decl);
+  build_check_stmt (base, iter, location, is_store, size_in_bytes);
+}
+
+/* asan: this looks too complex. Can this be done simpler? */
+/* Transform
+   1) Memory references.
+   2) BUILTIN_ALLOCA calls.
+*/
+
+static void
+transform_statements (void)
+{
+  basic_block bb;
+  gimple_stmt_iterator i;
+  int saved_last_basic_block = last_basic_block;
+  enum gimple_rhs_class grhs_class;
+
+  FOR_EACH_BB (bb)
+    {
+      if (bb->index >= saved_last_basic_block) continue;
+      for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
+        {
+          gimple s = gsi_stmt (i);
+          if (gimple_code (s) != GIMPLE_ASSIGN)
+              continue;
+          instrument_derefs (&i, gimple_assign_lhs (s),
+                             gimple_location (s), 1);
+          instrument_derefs (&i, gimple_assign_rhs1 (s),
+                             gimple_location (s), 0);
+          grhs_class = get_gimple_rhs_class (gimple_assign_rhs_code (s));
+          if (grhs_class == GIMPLE_BINARY_RHS)
+            instrument_derefs (&i, gimple_assign_rhs2 (s),
+                               gimple_location (s), 0);
+        }
+    }
+}
+
+/* Module-level instrumentation.
+   - Insert __asan_init() into the list of CTORs.
+   - TODO: insert redzones around globals.
+ */
+
+void
+asan_finish_file (void)
+{
+  tree ctor_statements = NULL_TREE;
+  append_to_statement_list (build_call_expr (asan_init_func (), 0),
+                            &ctor_statements);
+  cgraph_build_static_cdtor ('I', ctor_statements,
+                             MAX_RESERVED_INIT_PRIORITY - 1);
+}
+
+/* Instrument the current function.  */
+
+static unsigned int
+asan_instrument (void)
+{
+  struct gimplify_ctx gctx;
+  tree uintptr_type = lang_hooks.types.type_for_mode (ptr_mode, true);
+  int is_64 = tree_low_cst (TYPE_SIZE (uintptr_type), 0) == 64;
+  asan_offset_log = is_64 ? asan_offset_log_64 : asan_offset_log_32;
+  push_gimplify_context (&gctx);
+  transform_statements ();
+  pop_gimplify_context (NULL);
+  return 0;
+}
+
+static bool
+gate_asan (void)
+{
+  return flag_asan != 0;
+}
+
+struct gimple_opt_pass pass_asan =
+{
+ {
+  GIMPLE_PASS,
+  "asan",                               /* name  */
+  gate_asan,                            /* gate  */
+  asan_instrument,                      /* execute  */
+  NULL,                                 /* sub  */
+  NULL,                                 /* next  */
+  0,                                    /* static_pass_number  */
+  TV_NONE,                              /* tv_id  */
+  PROP_ssa | PROP_cfg | PROP_gimple_leh,/* properties_required  */
+  0,                                    /* properties_provided  */
+  0,                                    /* properties_destroyed  */
+  0,                                    /* todo_flags_start  */
+  TODO_verify_flow | TODO_verify_stmts
+  | TODO_update_ssa    /* todo_flags_finish  */
+ }
+};
diff --git a/gcc/asan.h b/gcc/asan.h
new file mode 100644
index 0000000..590cf35
--- /dev/null
+++ b/gcc/asan.h
@@ -0,0 +1,26 @@
+/* AddressSanitizer, a fast memory error detector.
+   Copyright (C) 2011 Free Software Foundation, Inc.
+   Contributed by Kostya Serebryany <kcc@google.com>
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef TREE_ASAN
+#define TREE_ASAN
+
+extern void asan_finish_file(void);
+
+#endif /* TREE_ASAN */
diff --git a/gcc/toplev.c b/gcc/toplev.c
index b1aff0c..3ca0736 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -72,7 +72,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "value-prof.h"
 #include "alloc-pool.h"
 #include "tree-mudflap.h"
-#include "tree-asan.h"
+#include "asan.h"
 #include "gimple.h"
 #include "tree-ssa-alias.h"
 #include "plugin.h"
diff --git a/gcc/tree-asan.c b/gcc/tree-asan.c
deleted file mode 100644
index a8841d6..0000000
--- a/gcc/tree-asan.c
+++ /dev/null
@@ -1,403 +0,0 @@
-/* AddressSanitizer, a fast memory error detector.
-   Copyright (C) 2011 Free Software Foundation, Inc.
-   Contributed by Kostya Serebryany <kcc@google.com>
-
-This file is part of GCC.
-
-GCC is free software; you can redistribute it and/or modify it under
-the terms of the GNU General Public License as published by the Free
-Software Foundation; either version 3, or (at your option) any later
-version.
-
-GCC is distributed in the hope that it will be useful, but WITHOUT ANY
-WARRANTY; without even the implied warranty of MERCHANTABILITY or
-FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
-for more details.
-
-You should have received a copy of the GNU General Public License
-along with GCC; see the file COPYING3.  If not see
-<http://www.gnu.org/licenses/>.  */
-
-
-#include "config.h"
-#include "system.h"
-#include "coretypes.h"
-#include "tm.h"
-#include "tree.h"
-#include "tm_p.h"
-#include "basic-block.h"
-#include "flags.h"
-#include "function.h"
-#include "tree-inline.h"
-#include "gimple.h"
-#include "tree-iterator.h"
-#include "tree-flow.h"
-#include "tree-dump.h"
-#include "tree-pass.h"
-#include "diagnostic.h"
-#include "demangle.h"
-#include "langhooks.h"
-#include "ggc.h"
-#include "cgraph.h"
-#include "gimple.h"
-#include "tree-asan.h"
-#include "gimple-pretty-print.h"
-
-/*
- AddressSanitizer finds out-of-bounds and use-after-free bugs 
- with <2x slowdown on average.
-
- The tool consists of two parts:
- instrumentation module (this file) and a run-time library.
- The instrumentation module adds a run-time check before every memory insn.
-   For a 8- or 16- byte load accessing address X:
-     ShadowAddr = (X >> 3) + Offset
-     ShadowValue = *(char*)ShadowAddr;  // *(short*) for 16-byte access.
-     if (ShadowValue)
-       __asan_report_load8(X);
-   For a load of N bytes (N=1, 2 or 4) from address X:
-     ShadowAddr = (X >> 3) + Offset
-     ShadowValue = *(char*)ShadowAddr;
-     if (ShadowValue)
-       if ((X & 7) + N - 1 > ShadowValue)
-         __asan_report_loadN(X);
- Stores are instrumented similarly, but using __asan_report_storeN functions.
- A call too __asan_init() is inserted to the list of module CTORs.
-
- The run-time library redefines malloc (so that redzone are inserted around
- the allocated memory) and free (so that reuse of free-ed memory is delayed),
- provides __asan_report* and __asan_init functions.
-
- Read more:
- http://code.google.com/p/address-sanitizer/wiki/AddressSanitizerAlgorithm
-
- Future work:
- The current implementation supports only detection of out-of-bounds and
- use-after-free bugs in heap.
- In order to support out-of-bounds for stack and globals we will need
- to create redzones for stack and global object and poison them.
-*/
-
-/* The shadow address is computed as (X>>asan_scale) + (1<<asan_offset_log).
- We may want to add command line flags to change these values.  */
-
-static const int asan_scale = 3;
-static const int asan_offset_log_32 = 29;
-static const int asan_offset_log_64 = 44;
-static int asan_offset_log;
-
-
-/* Construct a function tree for __asan_report_{load,store}{1,2,4,8,16}.
-   IS_STORE is either 1 (for a store) or 0 (for a load).
-   SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
-
-static tree
-report_error_func (int is_store, int size_in_bytes)
-{
-  tree fn_type;
-  tree def;
-  char name[100];
-
-  sprintf (name, "__asan_report_%s%d\n",
-           is_store ? "store" : "load", size_in_bytes);
-  fn_type = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE);
-  def = build_fn_decl (name, fn_type);
-  TREE_NOTHROW (def) = 1;
-  TREE_THIS_VOLATILE (def) = 1;  /* Attribute noreturn. Surprise!  */
-  DECL_ATTRIBUTES (def) = tree_cons (get_identifier ("leaf"), 
-                                     NULL, DECL_ATTRIBUTES (def));
-  DECL_ASSEMBLER_NAME (def);
-  return def;
-}
-
-/* Construct a function tree for __asan_init().  */
-
-static tree
-asan_init_func (void)
-{
-  tree fn_type;
-  tree def;
-
-  fn_type = build_function_type_list (void_type_node, NULL_TREE);
-  def = build_fn_decl ("__asan_init", fn_type);
-  TREE_NOTHROW (def) = 1;
-  DECL_ASSEMBLER_NAME (def);
-  return def;
-}
-
-
-/* Instrument the memory access instruction BASE.
-   Insert new statements before ITER.
-   LOCATION is source code location.
-   IS_STORE is either 1 (for a store) or 0 (for a load).
-   SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
-
-static void
-build_check_stmt (tree base,
-                  gimple_stmt_iterator *iter,
-                  location_t location, int is_store, int size_in_bytes)
-{
-  gimple_stmt_iterator gsi;
-  basic_block cond_bb, then_bb, join_bb;
-  edge e;
-  tree cond, t, u;
-  tree base_addr;
-  tree shadow_value;
-  gimple g;
-  gimple_seq seq, stmts;
-  tree shadow_type = size_in_bytes == 16 ?
-      short_integer_type_node : char_type_node;
-  tree shadow_ptr_type = build_pointer_type (shadow_type);
-  tree uintptr_type = lang_hooks.types.type_for_mode (ptr_mode,
-                                                      /*unsignedp=*/true);
-
-  /* We first need to split the current basic block, and start altering
-     the CFG.  This allows us to insert the statements we're about to
-     construct into the right basic blocks.  */
-
-  cond_bb = gimple_bb (gsi_stmt (*iter));
-  gsi = *iter;
-  gsi_prev (&gsi);
-  if (!gsi_end_p (gsi))
-    e = split_block (cond_bb, gsi_stmt (gsi));
-  else
-    e = split_block_after_labels (cond_bb);
-  cond_bb = e->src;
-  join_bb = e->dest;
-
-  /* A recap at this point: join_bb is the basic block at whose head
-     is the gimple statement for which this check expression is being
-     built.  cond_bb is the (possibly new, synthetic) basic block the
-     end of which will contain the cache-lookup code, and a
-     conditional that jumps to the cache-miss code or, much more
-     likely, over to join_bb.  */
-
-  /* Create the bb that contains the crash block.  */
-  then_bb = create_empty_bb (cond_bb);
-  make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE);
-  make_single_succ_edge (then_bb, join_bb, EDGE_FALLTHRU);
-
-  /* Mark the pseudo-fallthrough edge from cond_bb to join_bb.  */
-  e = find_edge (cond_bb, join_bb);
-  e->flags = EDGE_FALSE_VALUE;
-  e->count = cond_bb->count;
-  e->probability = REG_BR_PROB_BASE;
-
-  /* Update dominance info.  Note that bb_join's data was
-     updated by split_block.  */
-  if (dom_info_available_p (CDI_DOMINATORS))
-    {
-      set_immediate_dominator (CDI_DOMINATORS, then_bb, cond_bb);
-      set_immediate_dominator (CDI_DOMINATORS, join_bb, cond_bb);
-    }
-
-  base_addr = create_tmp_reg (uintptr_type, "__asan_base_addr");
-
-  seq = NULL; 
-  t = fold_convert_loc (location, uintptr_type,
-                        unshare_expr (base));
-  t = force_gimple_operand (t, &stmts, false, NULL_TREE);
-  gimple_seq_add_seq (&seq, stmts);
-  g = gimple_build_assign (base_addr, t);
-  gimple_set_location (g, location);
-  gimple_seq_add_stmt (&seq, g);
-
-  /* Build (base_addr >> asan_scale) + (1 << asan_offset_log).  */
-
-  t = build2 (RSHIFT_EXPR, uintptr_type, base_addr,
-              build_int_cst (uintptr_type, asan_scale));
-  t = build2 (PLUS_EXPR, uintptr_type, t,
-              build2 (LSHIFT_EXPR, uintptr_type,
-                      build_int_cst (uintptr_type, 1),
-                      build_int_cst (uintptr_type, asan_offset_log)
-                     ));
-  t = build1 (INDIRECT_REF, shadow_type,
-              build1 (VIEW_CONVERT_EXPR, shadow_ptr_type, t));
-  t = force_gimple_operand (t, &stmts, false, NULL_TREE);
-  gimple_seq_add_seq (&seq, stmts);
-  shadow_value = create_tmp_reg (shadow_type, "__asan_shadow");
-  g = gimple_build_assign (shadow_value, t);
-  gimple_set_location (g, location);
-  gimple_seq_add_stmt (&seq, g);
-  t = build2 (NE_EXPR, boolean_type_node, shadow_value,
-              build_int_cst (shadow_type, 0));
-  if (size_in_bytes < 8)
-    {
-
-      /* Slow path for 1-, 2- and 4- byte accesses.
-         Build ((base_addr & 7) + (size_in_bytes - 1)) >= shadow_value.  */
-
-      u = build2 (BIT_AND_EXPR, uintptr_type,
-                  base_addr,
-                  build_int_cst (uintptr_type, 7));
-      u = build1 (CONVERT_EXPR, shadow_type, u);
-      u = build2 (PLUS_EXPR, shadow_type, u,
-                  build_int_cst (shadow_type, size_in_bytes - 1));
-      u = build2 (GE_EXPR, uintptr_type, u, shadow_value);
-    }
-  else
-      u = build_int_cst (boolean_type_node, 1);
-  t = build2 (TRUTH_AND_EXPR, boolean_type_node, t, u);
-  t = force_gimple_operand (t, &stmts, false, NULL_TREE);
-  gimple_seq_add_seq (&seq, stmts);
-  cond = create_tmp_reg (boolean_type_node, "__asan_crash_cond");
-  g = gimple_build_assign  (cond, t);
-  gimple_set_location (g, location);
-  gimple_seq_add_stmt (&seq, g);
-  g = gimple_build_cond (NE_EXPR, cond, boolean_false_node, NULL_TREE,
-                         NULL_TREE);
-  gimple_set_location (g, location);
-  gimple_seq_add_stmt (&seq, g);
-
-  /* Generate call to the run-time library (e.g. __asan_report_load8).  */
-
-  gsi = gsi_last_bb (cond_bb);
-  gsi_insert_seq_after (&gsi, seq, GSI_CONTINUE_LINKING);
-  seq = NULL; 
-  g = gimple_build_call (report_error_func (is_store, size_in_bytes),
-                         1, base_addr);
-  gimple_seq_add_stmt (&seq, g);
-
-  /* Insert the check code in the THEN block.  */
-
-  gsi = gsi_start_bb (then_bb);
-  gsi_insert_seq_after (&gsi, seq, GSI_CONTINUE_LINKING);
-
-  *iter = gsi_start_bb (join_bb);
-}
-
-/* If T represents a memory access, add instrumentation code before ITER.
-   LOCATION is source code location.
-   IS_STORE is either 1 (for a store) or 0 (for a load).  */
-
-static void
-instrument_derefs (gimple_stmt_iterator *iter, tree t,
-                  location_t location, int is_store)
-{
-  tree type, base;
-  int size_in_bytes;
-
-  type = TREE_TYPE (t);
-  if (type == error_mark_node)
-    return;
-  switch (TREE_CODE (t))
-    {
-    case ARRAY_REF:
-    case COMPONENT_REF:
-    case INDIRECT_REF:
-    case MEM_REF:
-      break;
-    default:
-      return;
-    }
-  size_in_bytes = tree_low_cst (TYPE_SIZE (type), 0) / BITS_PER_UNIT;
-  if (size_in_bytes != 1 && size_in_bytes != 2 &&
-      size_in_bytes != 4 && size_in_bytes != 8 && size_in_bytes != 16)
-      return;
-  {
-    /* For now just avoid instrumenting bit field acceses.
-     Fixing it is doable, but expected to be messy.  */
-
-    HOST_WIDE_INT bitsize, bitpos;
-    tree offset;
-    enum machine_mode mode;
-    int volatilep = 0, unsignedp = 0;
-    get_inner_reference (t, &bitsize, &bitpos, &offset,
-                         &mode, &unsignedp, &volatilep, false);
-    if (bitpos != 0 || bitsize != size_in_bytes * BITS_PER_UNIT)
-        return;
-  }
-
-  base = build_addr (t, current_function_decl);
-  build_check_stmt (base, iter, location, is_store, size_in_bytes);
-}
-
-/* asan: this looks too complex. Can this be done simpler? */
-/* Transform
-   1) Memory references.
-   2) BUILTIN_ALLOCA calls.
-*/
-
-static void
-transform_statements (void)
-{
-  basic_block bb;
-  gimple_stmt_iterator i;
-  int saved_last_basic_block = last_basic_block;
-  enum gimple_rhs_class grhs_class;
-
-  FOR_EACH_BB (bb)
-    {
-      if (bb->index >= saved_last_basic_block) continue;
-      for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
-        {
-          gimple s = gsi_stmt (i);
-          if (gimple_code (s) != GIMPLE_ASSIGN)
-              continue;
-          instrument_derefs (&i, gimple_assign_lhs (s),
-                             gimple_location (s), 1);
-          instrument_derefs (&i, gimple_assign_rhs1 (s),
-                             gimple_location (s), 0);
-          grhs_class = get_gimple_rhs_class (gimple_assign_rhs_code (s));
-          if (grhs_class == GIMPLE_BINARY_RHS)
-            instrument_derefs (&i, gimple_assign_rhs2 (s),
-                               gimple_location (s), 0);
-        }
-    }
-}
-
-/* Module-level instrumentation.
-   - Insert __asan_init() into the list of CTORs.
-   - TODO: insert redzones around globals.
- */
-
-void
-asan_finish_file (void)
-{
-  tree ctor_statements = NULL_TREE;
-  append_to_statement_list (build_call_expr (asan_init_func (), 0),
-                            &ctor_statements);
-  cgraph_build_static_cdtor ('I', ctor_statements,
-                             MAX_RESERVED_INIT_PRIORITY - 1);
-}
-
-/* Instrument the current function.  */
-
-static unsigned int
-asan_instrument (void)
-{
-  struct gimplify_ctx gctx;
-  tree uintptr_type = lang_hooks.types.type_for_mode (ptr_mode, true);
-  int is_64 = tree_low_cst (TYPE_SIZE (uintptr_type), 0) == 64;
-  asan_offset_log = is_64 ? asan_offset_log_64 : asan_offset_log_32;
-  push_gimplify_context (&gctx);
-  transform_statements ();
-  pop_gimplify_context (NULL);
-  return 0;
-}
-
-static bool
-gate_asan (void)
-{
-  return flag_asan != 0;
-}
-
-struct gimple_opt_pass pass_asan =
-{
- {
-  GIMPLE_PASS,
-  "asan",                               /* name  */
-  gate_asan,                            /* gate  */
-  asan_instrument,                      /* execute  */
-  NULL,                                 /* sub  */
-  NULL,                                 /* next  */
-  0,                                    /* static_pass_number  */
-  TV_NONE,                              /* tv_id  */
-  PROP_ssa | PROP_cfg | PROP_gimple_leh,/* properties_required  */
-  0,                                    /* properties_provided  */
-  0,                                    /* properties_destroyed  */
-  0,                                    /* todo_flags_start  */
-  TODO_verify_flow | TODO_verify_stmts
-  | TODO_update_ssa    /* todo_flags_finish  */
- }
-};
diff --git a/gcc/tree-asan.h b/gcc/tree-asan.h
deleted file mode 100644
index 590cf35..0000000
--- a/gcc/tree-asan.h
+++ /dev/null
@@ -1,26 +0,0 @@
-/* AddressSanitizer, a fast memory error detector.
-   Copyright (C) 2011 Free Software Foundation, Inc.
-   Contributed by Kostya Serebryany <kcc@google.com>
-
-This file is part of GCC.
-
-GCC is free software; you can redistribute it and/or modify it under
-the terms of the GNU General Public License as published by the Free
-Software Foundation; either version 3, or (at your option) any later
-version.
-
-GCC is distributed in the hope that it will be useful, but WITHOUT ANY
-WARRANTY; without even the implied warranty of MERCHANTABILITY or
-FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
-for more details.
-
-You should have received a copy of the GNU General Public License
-along with GCC; see the file COPYING3.  If not see
-<http://www.gnu.org/licenses/>.  */
-
-#ifndef TREE_ASAN
-#define TREE_ASAN
-
-extern void asan_finish_file(void);
-
-#endif /* TREE_ASAN */
-- 
1.7.11.7

* [PATCH 01/13] Initial import of asan from the Google branch
  2012-11-01 19:53 [PATCH 00/13] Request to merge Address Sanitizer in dodji
                   ` (4 preceding siblings ...)
  2012-11-01 19:53 ` [PATCH 02/13] Rename tree-asan.[ch] to asan.[ch] dodji
@ 2012-11-01 19:53 ` dodji
  2012-11-01 19:53 ` [PATCH 03/13] Initial asan cleanups dodji
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 80+ messages in thread
From: dodji @ 2012-11-01 19:53 UTC (permalink / raw)
  To: gcc-patches; +Cc: dnovillo, jakub, wmi, davidxl, konstantin.s.serebryany

From: dnovillo <dnovillo@138bc75d-0d04-0410-961f-82ee72b054a4>

This patch merely imports the initial state of asan as it was in the
Google branch.

It provides basic infrastructure for asan to instrument memory
accesses on the heap, at -O3.  Note that it supports neither stack nor
global variable protection.

The remaining patches in this series are intended to improve on this
base.

	* Makefile.in: Add tree-asan.c.
	* common.opt: Add -fasan option.
	* invoke.texi: Document the new flag.
	* passes.c: Add the asan pass.
	* toplev.c (compile_file): Call asan_finish_file.
	* tree-asan.c: New file.
	* tree-asan.h: New file.
	* tree-pass.h: Declare pass_asan.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192328 138bc75d-0d04-0410-961f-82ee72b054a4
---
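For readers skimming the diff, here is a minimal stand-alone sketch of
the check that build_check_stmt emits, based on the header comment of
tree-asan.c below.  It is an illustration only, not code from the
patch: the names asan_shadow_addr and asan_access_is_bad are
hypothetical, and the shadow memory is modelled as a small array
indexed from address 0 (so the Offset term is folded into the array
base) rather than a real mapping.  The slow path uses the >=
comparison given in the comment inside build_check_stmt; the header
comment writes it as >.

```c
#include <assert.h>
#include <stdint.h>

#define ASAN_SCALE 3   /* Same value as asan_scale in the patch.  */

/* Shadow address for ADDR, as (ADDR >> scale) + (1 << OFFSET_LOG).
   The patch uses OFFSET_LOG 29 for 32-bit and 44 for 64-bit targets.  */
static uint64_t
asan_shadow_addr (uint64_t addr, int offset_log)
{
  return (addr >> ASAN_SCALE) + ((uint64_t) 1 << offset_log);
}

/* Return 1 if an access of SIZE bytes at ADDR would be reported.
   SHADOW is a toy shadow array covering addresses from 0 upwards
   (an assumption of this sketch).  A shadow byte of 0 means the whole
   8-byte granule is addressable; a value K in 1..7 means only the
   first K bytes are; a negative value means none are.  */
static int
asan_access_is_bad (const signed char *shadow, uintptr_t addr, int size)
{
  uintptr_t idx = addr >> ASAN_SCALE;
  signed char k = shadow[idx];

  assert (size == 1 || size == 2 || size == 4 || size == 8 || size == 16);

  if (size == 16)
    /* The patch loads a short here, i.e. two shadow bytes at once.  */
    return (k | shadow[idx + 1]) != 0;
  if (k == 0)
    return 0;                   /* Fast path: granule fully valid.  */
  if (size == 8)
    return 1;
  /* Slow path for 1-, 2- and 4-byte accesses.  */
  return (int) (addr & 7) + size - 1 >= k;
}
```

With a shadow array of { 0, 5, -1, 0 }, a 4-byte access at address 8
is accepted (it touches bytes 0..3 of a granule whose first 5 bytes
are valid), while a 4-byte access at address 10 would be reported.
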
 gcc/ChangeLog.asan  |  10 ++
 gcc/Makefile.in     |   5 +
 gcc/common.opt      |   4 +
 gcc/doc/invoke.texi |   8 +-
 gcc/passes.c        |   1 +
 gcc/toplev.c        |   5 +
 gcc/tree-asan.c     | 403 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 gcc/tree-asan.h     |  26 ++++
 gcc/tree-pass.h     |   1 +
 9 files changed, 462 insertions(+), 1 deletion(-)
 create mode 100644 gcc/ChangeLog.asan
 create mode 100644 gcc/tree-asan.c
 create mode 100644 gcc/tree-asan.h

diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan
new file mode 100644
index 0000000..40299e2
--- /dev/null
+++ b/gcc/ChangeLog.asan
@@ -0,0 +1,10 @@
+2012-10-10  Wei Mi <wmi@google.com>
+
+	* Makefile.in: Add tree-asan.c.
+	* common.opt: Add -fasan option.
+	* invoke.texi: Document the new flag.
+	* passes.c: Add the asan pass.
+	* toplev.c (compile_file): Call asan_finish_file.
+	* tree-asan.c: New file.
+	* tree-asan.h: New file.
+	* tree-pass.h: Declare pass_asan.
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 3a8ffbe..e8c4a19 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1350,6 +1350,7 @@ OBJS = \
 	tracer.o \
 	trans-mem.o \
 	tree-affine.o \
+	tree-asan.o \
 	tree-call-cdce.o \
 	tree-cfg.o \
 	tree-cfgcleanup.o \
@@ -2209,6 +2210,10 @@ stor-layout.o : stor-layout.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
    $(TREE_H) $(PARAMS_H) $(FLAGS_H) $(FUNCTION_H) $(EXPR_H) $(RTL_H) \
    $(GGC_H) $(TM_P_H) $(TARGET_H) langhooks.h $(REGS_H) gt-stor-layout.h \
    $(DIAGNOSTIC_CORE_H) $(CGRAPH_H) $(TREE_INLINE_H) $(TREE_DUMP_H) $(GIMPLE_H)
+tree-asan.o : tree-asan.c tree-asan.h $(CONFIG_H) pointer-set.h \
+   $(SYSTEM_H) $(TREE_H) $(GIMPLE_H) \
+   output.h $(DIAGNOSTIC_H) coretypes.h $(TREE_DUMP_H) $(FLAGS_H) \
+   tree-pretty-print.h
 tree-ssa-tail-merge.o: tree-ssa-tail-merge.c \
    $(SYSTEM_H) $(CONFIG_H) coretypes.h $(TM_H) $(BITMAP_H) \
    $(FLAGS_H) $(TM_P_H) $(BASIC_BLOCK_H) \
diff --git a/gcc/common.opt b/gcc/common.opt
index 5b69aff..eca0740 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -849,6 +849,10 @@ fargument-noalias-anything
 Common Ignore
 Does nothing. Preserved for backward compatibility.
 
+fasan
+Common RejectNegative Report Var(flag_asan)
+Enable AddressSanitizer, a memory error detector
+
 fasynchronous-unwind-tables
 Common Report Var(flag_asynchronous_unwind_tables) Optimization
 Generate unwind tables that are exact at each instruction boundary
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index ff0c87d..8292ef1 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -353,7 +353,7 @@ Objective-C and Objective-C++ Dialects}.
 @item Optimization Options
 @xref{Optimize Options,,Options that Control Optimization}.
 @gccoptlist{-falign-functions[=@var{n}] -falign-jumps[=@var{n}] @gol
--falign-labels[=@var{n}] -falign-loops[=@var{n}] -fassociative-math @gol
+-falign-labels[=@var{n}] -falign-loops[=@var{n}] -fasan -fassociative-math @gol
 -fauto-inc-dec -fbranch-probabilities -fbranch-target-load-optimize @gol
 -fbranch-target-load-optimize2 -fbtr-bb-exclusive -fcaller-saves @gol
 -fcheck-data-deps -fcombine-stack-adjustments -fconserve-stack @gol
@@ -6822,6 +6822,12 @@ assumptions based on that.
 
 The default is @option{-fzero-initialized-in-bss}.
 
+@item -fasan
+Enable AddressSanitizer, a fast memory error detector.
+Memory access instructions will be instrumented to detect
+out-of-bounds and use-after-free bugs. So far only heap bugs will be detected.
+See @uref{http://code.google.com/p/address-sanitizer/} for more details.
+
 @item -fmudflap -fmudflapth -fmudflapir
 @opindex fmudflap
 @opindex fmudflapth
diff --git a/gcc/passes.c b/gcc/passes.c
index 67aae52..66a2f74 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -1456,6 +1456,7 @@ init_optimization_passes (void)
       NEXT_PASS (pass_split_crit_edges);
       NEXT_PASS (pass_pre);
       NEXT_PASS (pass_sink_code);
+      NEXT_PASS (pass_asan);
       NEXT_PASS (pass_tree_loop);
 	{
 	  struct opt_pass **p = &pass_tree_loop.pass.sub;
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 5cbb364..b1aff0c 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -72,6 +72,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "value-prof.h"
 #include "alloc-pool.h"
 #include "tree-mudflap.h"
+#include "tree-asan.h"
 #include "gimple.h"
 #include "tree-ssa-alias.h"
 #include "plugin.h"
@@ -570,6 +571,10 @@ compile_file (void)
       if (flag_mudflap)
 	mudflap_finish_file ();
 
+      /* File-scope initialization for AddressSanitizer.  */
+      if (flag_asan)
+        asan_finish_file ();
+
       output_shared_constant_pool ();
       output_object_blocks ();
       finish_tm_clone_pairs ();
diff --git a/gcc/tree-asan.c b/gcc/tree-asan.c
new file mode 100644
index 0000000..a8841d6
--- /dev/null
+++ b/gcc/tree-asan.c
@@ -0,0 +1,403 @@
+/* AddressSanitizer, a fast memory error detector.
+   Copyright (C) 2011 Free Software Foundation, Inc.
+   Contributed by Kostya Serebryany <kcc@google.com>
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "tree.h"
+#include "tm_p.h"
+#include "basic-block.h"
+#include "flags.h"
+#include "function.h"
+#include "tree-inline.h"
+#include "gimple.h"
+#include "tree-iterator.h"
+#include "tree-flow.h"
+#include "tree-dump.h"
+#include "tree-pass.h"
+#include "diagnostic.h"
+#include "demangle.h"
+#include "langhooks.h"
+#include "ggc.h"
+#include "cgraph.h"
+#include "gimple.h"
+#include "tree-asan.h"
+#include "gimple-pretty-print.h"
+
+/*
+ AddressSanitizer finds out-of-bounds and use-after-free bugs 
+ with <2x slowdown on average.
+
+ The tool consists of two parts:
+ instrumentation module (this file) and a run-time library.
+ The instrumentation module adds a run-time check before every memory insn.
+   For a 8- or 16- byte load accessing address X:
+     ShadowAddr = (X >> 3) + Offset
+     ShadowValue = *(char*)ShadowAddr;  // *(short*) for 16-byte access.
+     if (ShadowValue)
+       __asan_report_load8(X);
+   For a load of N bytes (N=1, 2 or 4) from address X:
+     ShadowAddr = (X >> 3) + Offset
+     ShadowValue = *(char*)ShadowAddr;
+     if (ShadowValue)
+       if ((X & 7) + N - 1 > ShadowValue)
+         __asan_report_loadN(X);
+ Stores are instrumented similarly, but using __asan_report_storeN functions.
+ A call too __asan_init() is inserted to the list of module CTORs.
+
+ The run-time library redefines malloc (so that redzone are inserted around
+ the allocated memory) and free (so that reuse of free-ed memory is delayed),
+ provides __asan_report* and __asan_init functions.
+
+ Read more:
+ http://code.google.com/p/address-sanitizer/wiki/AddressSanitizerAlgorithm
+
+ Future work:
+ The current implementation supports only detection of out-of-bounds and
+ use-after-free bugs in heap.
+ In order to support out-of-bounds for stack and globals we will need
+ to create redzones for stack and global object and poison them.
+*/
+
+/* The shadow address is computed as (X>>asan_scale) + (1<<asan_offset_log).
+ We may want to add command line flags to change these values.  */
+
+static const int asan_scale = 3;
+static const int asan_offset_log_32 = 29;
+static const int asan_offset_log_64 = 44;
+static int asan_offset_log;
+
+
+/* Construct a function tree for __asan_report_{load,store}{1,2,4,8,16}.
+   IS_STORE is either 1 (for a store) or 0 (for a load).
+   SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
+
+static tree
+report_error_func (int is_store, int size_in_bytes)
+{
+  tree fn_type;
+  tree def;
+  char name[100];
+
+  sprintf (name, "__asan_report_%s%d\n",
+           is_store ? "store" : "load", size_in_bytes);
+  fn_type = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE);
+  def = build_fn_decl (name, fn_type);
+  TREE_NOTHROW (def) = 1;
+  TREE_THIS_VOLATILE (def) = 1;  /* Attribute noreturn. Surprise!  */
+  DECL_ATTRIBUTES (def) = tree_cons (get_identifier ("leaf"), 
+                                     NULL, DECL_ATTRIBUTES (def));
+  DECL_ASSEMBLER_NAME (def);
+  return def;
+}
+
+/* Construct a function tree for __asan_init().  */
+
+static tree
+asan_init_func (void)
+{
+  tree fn_type;
+  tree def;
+
+  fn_type = build_function_type_list (void_type_node, NULL_TREE);
+  def = build_fn_decl ("__asan_init", fn_type);
+  TREE_NOTHROW (def) = 1;
+  DECL_ASSEMBLER_NAME (def);
+  return def;
+}
+
+
+/* Instrument the memory access instruction BASE.
+   Insert new statements before ITER.
+   LOCATION is source code location.
+   IS_STORE is either 1 (for a store) or 0 (for a load).
+   SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
+
+static void
+build_check_stmt (tree base,
+                  gimple_stmt_iterator *iter,
+                  location_t location, int is_store, int size_in_bytes)
+{
+  gimple_stmt_iterator gsi;
+  basic_block cond_bb, then_bb, join_bb;
+  edge e;
+  tree cond, t, u;
+  tree base_addr;
+  tree shadow_value;
+  gimple g;
+  gimple_seq seq, stmts;
+  tree shadow_type = size_in_bytes == 16 ?
+      short_integer_type_node : char_type_node;
+  tree shadow_ptr_type = build_pointer_type (shadow_type);
+  tree uintptr_type = lang_hooks.types.type_for_mode (ptr_mode,
+                                                      /*unsignedp=*/true);
+
+  /* We first need to split the current basic block, and start altering
+     the CFG.  This allows us to insert the statements we're about to
+     construct into the right basic blocks.  */
+
+  cond_bb = gimple_bb (gsi_stmt (*iter));
+  gsi = *iter;
+  gsi_prev (&gsi);
+  if (!gsi_end_p (gsi))
+    e = split_block (cond_bb, gsi_stmt (gsi));
+  else
+    e = split_block_after_labels (cond_bb);
+  cond_bb = e->src;
+  join_bb = e->dest;
+
+  /* A recap at this point: join_bb is the basic block at whose head
+     is the gimple statement for which this check expression is being
+     built.  cond_bb is the (possibly new, synthetic) basic block the
+     end of which will contain the cache-lookup code, and a
+     conditional that jumps to the cache-miss code or, much more
+     likely, over to join_bb.  */
+
+  /* Create the bb that contains the crash block.  */
+  then_bb = create_empty_bb (cond_bb);
+  make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE);
+  make_single_succ_edge (then_bb, join_bb, EDGE_FALLTHRU);
+
+  /* Mark the pseudo-fallthrough edge from cond_bb to join_bb.  */
+  e = find_edge (cond_bb, join_bb);
+  e->flags = EDGE_FALSE_VALUE;
+  e->count = cond_bb->count;
+  e->probability = REG_BR_PROB_BASE;
+
+  /* Update dominance info.  Note that bb_join's data was
+     updated by split_block.  */
+  if (dom_info_available_p (CDI_DOMINATORS))
+    {
+      set_immediate_dominator (CDI_DOMINATORS, then_bb, cond_bb);
+      set_immediate_dominator (CDI_DOMINATORS, join_bb, cond_bb);
+    }
+
+  base_addr = create_tmp_reg (uintptr_type, "__asan_base_addr");
+
+  seq = NULL; 
+  t = fold_convert_loc (location, uintptr_type,
+                        unshare_expr (base));
+  t = force_gimple_operand (t, &stmts, false, NULL_TREE);
+  gimple_seq_add_seq (&seq, stmts);
+  g = gimple_build_assign (base_addr, t);
+  gimple_set_location (g, location);
+  gimple_seq_add_stmt (&seq, g);
+
+  /* Build (base_addr >> asan_scale) + (1 << asan_offset_log).  */
+
+  t = build2 (RSHIFT_EXPR, uintptr_type, base_addr,
+              build_int_cst (uintptr_type, asan_scale));
+  t = build2 (PLUS_EXPR, uintptr_type, t,
+              build2 (LSHIFT_EXPR, uintptr_type,
+                      build_int_cst (uintptr_type, 1),
+                      build_int_cst (uintptr_type, asan_offset_log)
+                     ));
+  t = build1 (INDIRECT_REF, shadow_type,
+              build1 (VIEW_CONVERT_EXPR, shadow_ptr_type, t));
+  t = force_gimple_operand (t, &stmts, false, NULL_TREE);
+  gimple_seq_add_seq (&seq, stmts);
+  shadow_value = create_tmp_reg (shadow_type, "__asan_shadow");
+  g = gimple_build_assign (shadow_value, t);
+  gimple_set_location (g, location);
+  gimple_seq_add_stmt (&seq, g);
+  t = build2 (NE_EXPR, boolean_type_node, shadow_value,
+              build_int_cst (shadow_type, 0));
+  if (size_in_bytes < 8)
+    {
+
+      /* Slow path for 1-, 2- and 4- byte accesses.
+         Build ((base_addr & 7) + (size_in_bytes - 1)) >= shadow_value.  */
+
+      u = build2 (BIT_AND_EXPR, uintptr_type,
+                  base_addr,
+                  build_int_cst (uintptr_type, 7));
+      u = build1 (CONVERT_EXPR, shadow_type, u);
+      u = build2 (PLUS_EXPR, shadow_type, u,
+                  build_int_cst (shadow_type, size_in_bytes - 1));
+      u = build2 (GE_EXPR, uintptr_type, u, shadow_value);
+    }
+  else
+      u = build_int_cst (boolean_type_node, 1);
+  t = build2 (TRUTH_AND_EXPR, boolean_type_node, t, u);
+  t = force_gimple_operand (t, &stmts, false, NULL_TREE);
+  gimple_seq_add_seq (&seq, stmts);
+  cond = create_tmp_reg (boolean_type_node, "__asan_crash_cond");
+  g = gimple_build_assign  (cond, t);
+  gimple_set_location (g, location);
+  gimple_seq_add_stmt (&seq, g);
+  g = gimple_build_cond (NE_EXPR, cond, boolean_false_node, NULL_TREE,
+                         NULL_TREE);
+  gimple_set_location (g, location);
+  gimple_seq_add_stmt (&seq, g);
+
+  /* Generate call to the run-time library (e.g. __asan_report_load8).  */
+
+  gsi = gsi_last_bb (cond_bb);
+  gsi_insert_seq_after (&gsi, seq, GSI_CONTINUE_LINKING);
+  seq = NULL; 
+  g = gimple_build_call (report_error_func (is_store, size_in_bytes),
+                         1, base_addr);
+  gimple_seq_add_stmt (&seq, g);
+
+  /* Insert the check code in the THEN block.  */
+
+  gsi = gsi_start_bb (then_bb);
+  gsi_insert_seq_after (&gsi, seq, GSI_CONTINUE_LINKING);
+
+  *iter = gsi_start_bb (join_bb);
+}
+
+/* If T represents a memory access, add instrumentation code before ITER.
+   LOCATION is source code location.
+   IS_STORE is either 1 (for a store) or 0 (for a load).  */
+
+static void
+instrument_derefs (gimple_stmt_iterator *iter, tree t,
+                  location_t location, int is_store)
+{
+  tree type, base;
+  int size_in_bytes;
+
+  type = TREE_TYPE (t);
+  if (type == error_mark_node)
+    return;
+  switch (TREE_CODE (t))
+    {
+    case ARRAY_REF:
+    case COMPONENT_REF:
+    case INDIRECT_REF:
+    case MEM_REF:
+      break;
+    default:
+      return;
+    }
+  size_in_bytes = tree_low_cst (TYPE_SIZE (type), 0) / BITS_PER_UNIT;
+  if (size_in_bytes != 1 && size_in_bytes != 2 &&
+      size_in_bytes != 4 && size_in_bytes != 8 && size_in_bytes != 16)
+      return;
+  {
+    /* For now just avoid instrumenting bit field accesses.
+     Fixing it is doable, but expected to be messy.  */
+
+    HOST_WIDE_INT bitsize, bitpos;
+    tree offset;
+    enum machine_mode mode;
+    int volatilep = 0, unsignedp = 0;
+    get_inner_reference (t, &bitsize, &bitpos, &offset,
+                         &mode, &unsignedp, &volatilep, false);
+    if (bitpos != 0 || bitsize != size_in_bytes * BITS_PER_UNIT)
+        return;
+  }
+
+  base = build_addr (t, current_function_decl);
+  build_check_stmt (base, iter, location, is_store, size_in_bytes);
+}
+
+/* asan: this looks too complex. Can this be done simpler? */
+/* Transform
+   1) Memory references.
+   2) BUILTIN_ALLOCA calls.
+*/
+
+static void
+transform_statements (void)
+{
+  basic_block bb;
+  gimple_stmt_iterator i;
+  int saved_last_basic_block = last_basic_block;
+  enum gimple_rhs_class grhs_class;
+
+  FOR_EACH_BB (bb)
+    {
+      if (bb->index >= saved_last_basic_block) continue;
+      for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
+        {
+          gimple s = gsi_stmt (i);
+          if (gimple_code (s) != GIMPLE_ASSIGN)
+              continue;
+          instrument_derefs (&i, gimple_assign_lhs (s),
+                             gimple_location (s), 1);
+          instrument_derefs (&i, gimple_assign_rhs1 (s),
+                             gimple_location (s), 0);
+          grhs_class = get_gimple_rhs_class (gimple_assign_rhs_code (s));
+          if (grhs_class == GIMPLE_BINARY_RHS)
+            instrument_derefs (&i, gimple_assign_rhs2 (s),
+                               gimple_location (s), 0);
+        }
+    }
+}
+
+/* Module-level instrumentation.
+   - Insert __asan_init() into the list of CTORs.
+   - TODO: insert redzones around globals.
+ */
+
+void
+asan_finish_file (void)
+{
+  tree ctor_statements = NULL_TREE;
+  append_to_statement_list (build_call_expr (asan_init_func (), 0),
+                            &ctor_statements);
+  cgraph_build_static_cdtor ('I', ctor_statements,
+                             MAX_RESERVED_INIT_PRIORITY - 1);
+}
+
+/* Instrument the current function.  */
+
+static unsigned int
+asan_instrument (void)
+{
+  struct gimplify_ctx gctx;
+  tree uintptr_type = lang_hooks.types.type_for_mode (ptr_mode, true);
+  int is_64 = tree_low_cst (TYPE_SIZE (uintptr_type), 0) == 64;
+  asan_offset_log = is_64 ? asan_offset_log_64 : asan_offset_log_32;
+  push_gimplify_context (&gctx);
+  transform_statements ();
+  pop_gimplify_context (NULL);
+  return 0;
+}
+
+static bool
+gate_asan (void)
+{
+  return flag_asan != 0;
+}
+
+struct gimple_opt_pass pass_asan =
+{
+ {
+  GIMPLE_PASS,
+  "asan",                               /* name  */
+  gate_asan,                            /* gate  */
+  asan_instrument,                      /* execute  */
+  NULL,                                 /* sub  */
+  NULL,                                 /* next  */
+  0,                                    /* static_pass_number  */
+  TV_NONE,                              /* tv_id  */
+  PROP_ssa | PROP_cfg | PROP_gimple_leh,/* properties_required  */
+  0,                                    /* properties_provided  */
+  0,                                    /* properties_destroyed  */
+  0,                                    /* todo_flags_start  */
+  TODO_verify_flow | TODO_verify_stmts
+  | TODO_update_ssa    /* todo_flags_finish  */
+ }
+};
diff --git a/gcc/tree-asan.h b/gcc/tree-asan.h
new file mode 100644
index 0000000..590cf35
--- /dev/null
+++ b/gcc/tree-asan.h
@@ -0,0 +1,26 @@
+/* AddressSanitizer, a fast memory error detector.
+   Copyright (C) 2011 Free Software Foundation, Inc.
+   Contributed by Kostya Serebryany <kcc@google.com>
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef TREE_ASAN
+#define TREE_ASAN
+
+extern void asan_finish_file(void);
+
+#endif /* TREE_ASAN */
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 8ed2d98..73c5886 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -259,6 +259,7 @@ struct register_pass_info
 
 extern struct gimple_opt_pass pass_mudflap_1;
 extern struct gimple_opt_pass pass_mudflap_2;
+extern struct gimple_opt_pass pass_asan;
 extern struct gimple_opt_pass pass_lower_cf;
 extern struct gimple_opt_pass pass_refactor_eh;
 extern struct gimple_opt_pass pass_lower_eh;
-- 
1.7.11.7
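
For readers following the condition that build_check_stmt emits in the
patch above, the same test can be sketched as stand-alone C.  This is
only an illustration under stated assumptions: shadow memory is
simulated by a small array rather than a real mapping at a target
offset, the scale is 3 (one shadow byte per 8 application bytes), only
1-, 2-, 4- and 8-byte accesses are modelled (the patch loads a wider
shadow value for 16-byte accesses), and the names shadow,
shadow_byte_for and access_is_bad are hypothetical, not part of the
patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: one shadow byte describes 8 bytes of application memory.  */
#define SKETCH_SHADOW_SHIFT 3

/* Simulated shadow memory covering addresses 0..511.  A real
   implementation adds a target-specific offset instead of indexing
   an array.  */
static signed char shadow[64];

/* Return the shadow byte that describes ADDR.  */
static signed char
shadow_byte_for (uintptr_t addr)
{
  return shadow[addr >> SKETCH_SHADOW_SHIFT];
}

/* Return true if an access of SIZE bytes (1, 2, 4 or 8) at ADDR would
   be reported.  A zero shadow byte means all 8 bytes are addressable.
   Otherwise 8-byte accesses always fail, and smaller accesses fail
   when the last byte touched lies at or beyond the shadow value.  */
static bool
access_is_bad (uintptr_t addr, int size)
{
  signed char s = shadow_byte_for (addr);

  if (s == 0)
    return false;
  if (size >= 8)
    return true;
  return (int) (addr & 7) + (size - 1) >= s;
}
```

With shadow[2] set to 4, the first four bytes of the 8-byte granule at
address 16 are addressable, so a 4-byte access at 16 passes while a
1-byte access at 20 is reported.  A negative shadow byte marks the
whole granule as poisoned.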

* [PATCH 09/13] Don't forget to protect 32 bytes aligned global variables.
  2012-11-01 19:53 [PATCH 00/13] Request to merge Address Sanitizer in dodji
  2012-11-01 19:53 ` [PATCH 08/13] Fix a couple of ICEs dodji
  2012-11-01 19:53 ` [PATCH 10/13] Make build_check_stmt accept an SSA_NAME for its base dodji
@ 2012-11-01 19:53 ` dodji
  2012-11-01 19:53 ` [PATCH 06/13] Implement protection of stack variables dodji
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 80+ messages in thread
From: dodji @ 2012-11-01 19:53 UTC (permalink / raw)
  To: gcc-patches; +Cc: dnovillo, jakub, wmi, davidxl, konstantin.s.serebryany

From: wmi <wmi@138bc75d-0d04-0410-961f-82ee72b054a4>

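Global variables that were already aligned to 32 bytes or more were
not being protected, because the alignment test also guarded the
setting of asan_protected.  This patch sets asan_protected for every
protectable global and only raises DECL_ALIGN when it is below
ASAN_RED_ZONE_SIZE bytes.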
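
The difference between the old and the new condition can be sketched
in plain C.  This is a toy model, not GCC code: struct sketch_decl and
both functions are hypothetical stand-ins, and the red zone size is
assumed to be 32 bytes.

```c
#include <stdbool.h>

#define SKETCH_RED_ZONE_SIZE 32   /* assumed value of ASAN_RED_ZONE_SIZE */
#define SKETCH_BITS_PER_UNIT 8

/* Toy stand-in for a global variable declaration.  */
struct sketch_decl
{
  unsigned align_bits;   /* plays the role of DECL_ALIGN (decl) */
  bool protectable;      /* plays the role of asan_protect_global (decl) */
};

/* Old behaviour: protection was tied to the alignment test, so a decl
   that was already sufficiently aligned was silently skipped.  */
static bool
old_rule (struct sketch_decl *d)
{
  unsigned min_align = SKETCH_RED_ZONE_SIZE * SKETCH_BITS_PER_UNIT;

  if (d->protectable && d->align_bits < min_align)
    {
      d->align_bits = min_align;
      return true;
    }
  return false;
}

/* New behaviour: every protectable decl is protected, and its
   alignment is only ever raised, never lowered.  */
static bool
new_rule (struct sketch_decl *d)
{
  unsigned min_align = SKETCH_RED_ZONE_SIZE * SKETCH_BITS_PER_UNIT;

  if (!d->protectable)
    return false;
  if (d->align_bits < min_align)
    d->align_bits = min_align;
  return true;
}
```

A decl with align_bits of 256 (32 bytes) is rejected by old_rule but
accepted by new_rule, which is the case the patch fixes.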

	* varasm.c (assemble_variable): Set asan_protected even
	for decls that are already ASAN_RED_ZONE_SIZE or more
	bytes aligned.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192830 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.asan | 15 +++++++++++++++
 gcc/varasm.c       |  6 +++---
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan
index 3da0a0b..57670f7 100644
--- a/gcc/ChangeLog.asan
+++ b/gcc/ChangeLog.asan
@@ -1,3 +1,18 @@
+2012-10-25  Wei Mi  <wmi@google.com>
+
+	* varasm.c (assemble_variable): Set asan_protected even 
+	for decls that are already ASAN_RED_ZONE_SIZE or more 
+	bytes aligned.
+
+2012-10-19  Diego Novillo  <dnovillo@google.com>
+
+	Merge from trunk rev 192612.
+
+2012-10-18  Xinliang David Li  <davidxl@google.com>
+
+	* asan.c (asan_init_shadow_ptr_types): change shadow type
+	to signed type.
+
 2012-10-18  Jakub Jelinek  <jakub@redhat.com>
 
 	* asan.c (build_check_stmt): Unshare base.
diff --git a/gcc/varasm.c b/gcc/varasm.c
index 8a533ed..641ce0c 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -1991,11 +1991,11 @@ assemble_variable (tree decl, int top_level ATTRIBUTE_UNUSED,
   align_variable (decl, dont_output_data);
 
   if (flag_asan
-      && asan_protect_global (decl)
-      && DECL_ALIGN (decl) < ASAN_RED_ZONE_SIZE * BITS_PER_UNIT)
+      && asan_protect_global (decl))
     {
       asan_protected = true;
-      DECL_ALIGN (decl) = ASAN_RED_ZONE_SIZE * BITS_PER_UNIT;
+      DECL_ALIGN (decl) = MAX (DECL_ALIGN (decl), 
+                               ASAN_RED_ZONE_SIZE * BITS_PER_UNIT);
     }
 
   set_mem_align (decl_rtl, DECL_ALIGN (decl));
-- 
1.7.11.7

* [PATCH 10/13] Make build_check_stmt accept an SSA_NAME for its base
  2012-11-01 19:53 [PATCH 00/13] Request to merge Address Sanitizer in dodji
  2012-11-01 19:53 ` [PATCH 08/13] Fix a couple of ICEs dodji
@ 2012-11-01 19:53 ` dodji
  2012-11-01 19:53 ` [PATCH 09/13] Don't forget to protect 32 bytes aligned global variables dodji
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 80+ messages in thread
From: dodji @ 2012-11-01 19:53 UTC (permalink / raw)
  To: gcc-patches; +Cc: dnovillo, jakub, wmi, davidxl, konstantin.s.serebryany

From: dodji <dodji@138bc75d-0d04-0410-961f-82ee72b054a4>

This patch makes build_check_stmt accept an SSA_NAME as its memory
access parameter (BASE).  This is useful for a subsequent patch that
will re-use the function.

Tested by running cc1 -fasan on the program below with and without the
patch and inspecting the gimple output to see that there is no change.

void
foo ()
{
  char foo[1] = {0};

  foo[0] = 1;
}

gcc/
	* asan.c (build_check_stmt): Accept the memory access to be
	represented by an SSA_NAME.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192843 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.asan |  5 +++++
 gcc/asan.c         | 36 +++++++++++++++++++++++-------------
 2 files changed, 28 insertions(+), 13 deletions(-)

diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan
index 57670f7..9159b3f 100644
--- a/gcc/ChangeLog.asan
+++ b/gcc/ChangeLog.asan
@@ -1,3 +1,8 @@
+2012-10-26  Dodji Seketeli  <dodji@redhat.com>
+
+	* asan.c (build_check_stmt): Accept the memory access to be
+	represented by an SSA_NAME.
+
 2012-10-25  Wei Mi  <wmi@google.com>
 
 	* varasm.c (assemble_variable): Set asan_protected even 
diff --git a/gcc/asan.c b/gcc/asan.c
index 6715e51..b43f03b 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -397,16 +397,18 @@ asan_init_func (void)
 #define PROB_VERY_UNLIKELY	(REG_BR_PROB_BASE / 2000 - 1)
 #define PROB_ALWAYS		(REG_BR_PROB_BASE)
 
-/* Instrument the memory access instruction BASE.
-   Insert new statements before ITER.
-   LOCATION is source code location.
-   IS_STORE is either 1 (for a store) or 0 (for a load).
+/* Instrument the memory access instruction BASE.  Insert new
+   statements before ITER.
+
+   Note that the memory access represented by BASE can be either an
+   SSA_NAME, or a non-SSA expression.  LOCATION is the source code
+   location.  IS_STORE is TRUE for a store, FALSE for a load.
    SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
 
 static void
-build_check_stmt (tree base,
-                  gimple_stmt_iterator *iter,
-                  location_t location, bool is_store, int size_in_bytes)
+build_check_stmt (tree base, gimple_stmt_iterator *iter,
+                  location_t location, bool is_store,
+		  int size_in_bytes)
 {
   gimple_stmt_iterator gsi;
   basic_block cond_bb, then_bb, else_bb;
@@ -417,6 +419,7 @@ build_check_stmt (tree base,
   tree shadow_type = TREE_TYPE (shadow_ptr_type);
   tree uintptr_type
     = build_nonstandard_integer_type (TYPE_PRECISION (TREE_TYPE (base)), 1);
+  tree base_ssa = base;
 
   /* We first need to split the current basic block, and start altering
      the CFG.  This allows us to insert the statements we're about to
@@ -462,15 +465,22 @@ build_check_stmt (tree base,
   base = unshare_expr (base);
 
   gsi = gsi_last_bb (cond_bb);
-  g = gimple_build_assign_with_ops (TREE_CODE (base),
-				    make_ssa_name (TREE_TYPE (base), NULL),
-				    base, NULL_TREE);
-  gimple_set_location (g, location);
-  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+
+  /* BASE can already be an SSA_NAME; in that case, do not create a
+     new SSA_NAME for it.  */
+  if (TREE_CODE (base) != SSA_NAME)
+    {
+      g = gimple_build_assign_with_ops (TREE_CODE (base),
+					make_ssa_name (TREE_TYPE (base), NULL),
+					base, NULL_TREE);
+      gimple_set_location (g, location);
+      gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+      base_ssa = gimple_assign_lhs (g);
+    }
 
   g = gimple_build_assign_with_ops (NOP_EXPR,
 				    make_ssa_name (uintptr_type, NULL),
-				    gimple_assign_lhs (g), NULL_TREE);
+				    base_ssa, NULL_TREE);
   gimple_set_location (g, location);
   gsi_insert_after (&gsi, g, GSI_NEW_STMT);
   base_addr = gimple_assign_lhs (g);
-- 
1.7.11.7

* [PATCH 07/13] Implement protection of global variables
  2012-11-01 19:53 [PATCH 00/13] Request to merge Address Sanitizer in dodji
                   ` (7 preceding siblings ...)
  2012-11-01 19:53 ` [PATCH 11/13] Factorize condition insertion code out of build_check_stmt dodji
@ 2012-11-01 19:53 ` dodji
  2012-11-01 19:53 ` [PATCH 05/13] Allow asan at -O0 dodji
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 80+ messages in thread
From: dodji @ 2012-11-01 19:53 UTC (permalink / raw)
  To: gcc-patches; +Cc: dnovillo, jakub, wmi, davidxl, konstantin.s.serebryany

From: jakub <jakub@138bc75d-0d04-0410-961f-82ee72b054a4>

This patch implements the protection of global variables.

The basic idea is to insert a red zone between two global variables
and install a constructor function that calls the asan runtime to do
the populating of the relevant shadow memory regions at load time.

So the patch lays out the global variables so as to insert a red zone
between them.  The red zones are sized so that each variable starts on
a 32-byte boundary.
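
The padding arithmetic can be sketched as follows.  The function
mirrors the asan_red_zone_size helper that this patch adds to asan.h;
the sketch_ prefix, the second helper and the assumption that
ASAN_RED_ZONE_SIZE is 32 are illustrative only.

```c
/* Assumed value of ASAN_RED_ZONE_SIZE; must be a power of two.  */
#define SKETCH_RED_ZONE_SIZE 32u

/* Return the number of padding bytes to append after a variable of
   SIZE bytes: at least one full red zone, rounded so that
   SIZE + padding is a multiple of the red zone size.  */
static unsigned int
sketch_red_zone_size (unsigned int size)
{
  unsigned int c = size & (SKETCH_RED_ZONE_SIZE - 1);
  return c ? 2 * SKETCH_RED_ZONE_SIZE - c : SKETCH_RED_ZONE_SIZE;
}

/* Return the total footprint of the variable plus its red zone.  */
static unsigned int
sketch_size_with_redzone (unsigned int size)
{
  return size + sketch_red_zone_size (size);
}
```

For example, a 1-byte variable gets 63 bytes of padding (64 in total),
a 32-byte variable gets 32 (64 in total), and a 40-byte variable gets
56 (96 in total).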

Then it installs a constructor function that calls the runtime asan
library function __asan_register_globals with an array holding, for
each protected global variable, an instance of this type:

    struct __asan_global
    {
      /* Address of the beginning of the global variable.  */
      const void *__beg;

      /* Initial size of the global variable.  */
      uptr __size;

      /* Size of the global variable + size of the red zone.  This
         size is 32 bytes aligned.  */
      uptr __size_with_redzone;

      /*  Name of the global variable.  */
      const void *__name;

      /* This is always set to 0 for now.  */
      uptr __has_dynamic_init;
    };

The patch also installs a destructor function that calls the
runtime asan library function __asan_unregister_globals.
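
What the generated constructor and destructor amount to can be
sketched in C.  Everything here is a mock under stated assumptions:
the sketch_ names are hypothetical, the register and unregister
functions only keep counters where the real runtime would update
shadow memory, the two globals are not actually padded, and the
red zone size is assumed to be 32 bytes.

```c
#include <stdint.h>

/* Mirror of the descriptor type, with uintptr_t standing in for uptr.  */
struct sketch_asan_global
{
  const void *beg;               /* start of the variable */
  uintptr_t size;                /* size of the variable itself */
  uintptr_t size_with_redzone;   /* size plus padding, multiple of 32 */
  const void *name;              /* "name (file)" description string */
  uintptr_t has_dynamic_init;    /* always 0 for now */
};

/* Mock runtime state.  */
static uintptr_t registered_count;
static uintptr_t registered_redzone_bytes;

/* Mock of the registration entry point: takes the array and its length.  */
static void
sketch_register_globals (const struct sketch_asan_global *globals,
                         uintptr_t n)
{
  for (uintptr_t i = 0; i < n; i++)
    {
      registered_count++;
      registered_redzone_bytes
        += globals[i].size_with_redzone - globals[i].size;
    }
}

/* Mock of the unregistration entry point.  */
static void
sketch_unregister_globals (const struct sketch_asan_global *globals,
                           uintptr_t n)
{
  for (uintptr_t i = 0; i < n; i++)
    {
      registered_count--;
      registered_redzone_bytes
        -= globals[i].size_with_redzone - globals[i].size;
    }
}

/* Two example globals and the per-file descriptor table.  */
static char g_buf[10];
static int g_counter;

static const struct sketch_asan_global sketch_table[] = {
  { g_buf, sizeof g_buf, 64, "g_buf (sketch.c)", 0 },
  { &g_counter, sizeof g_counter, 64, "g_counter (sketch.c)", 0 },
};

/* Stand-ins for the compiler-generated constructor and destructor.  */
static void
sketch_module_ctor (void)
{
  sketch_register_globals (sketch_table, 2);
}

static void
sketch_module_dtor (void)
{
  sketch_unregister_globals (sketch_table, 2);
}
```

The point of the design is that a single call per translation unit
hands the runtime the whole table, rather than one call per variable.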

	* varasm.c: Include asan.h.
	(assemble_noswitch_variable): Grow size by asan_red_zone_size
	if decl is asan protected.
	(place_block_symbol): Likewise.
	(assemble_variable): If decl is asan protected, increase
	DECL_ALIGN if needed, and for decls emitted using
	assemble_variable_contents append padding zeros after it.
	* Makefile.in (varasm.o): Depend on asan.h.
	* asan.c: Include output.h.
	(asan_pp, asan_pp_initialized): New variables.
	(asan_pp_initialize, asan_pp_string): New functions.
	(asan_emit_stack_protection): Use asan_pp{,_initialized}
	instead of local pp{,_initialized} vars, use asan_pp_initialize
	and asan_pp_string helpers.
	(asan_needs_local_alias, asan_protect_global,
	asan_global_struct, asan_add_global): New functions.
	(asan_finish_file): Protect global vars that can be protected.
	* asan.h (asan_protect_global): New prototype.
	(asan_red_zone_size): New inline function.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192541 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.asan |  20 ++++
 gcc/Makefile.in    |   2 +-
 gcc/asan.c         | 299 +++++++++++++++++++++++++++++++++++++++++++++++------
 gcc/asan.h         |  11 ++
 gcc/varasm.c       |  22 ++++
 5 files changed, 319 insertions(+), 35 deletions(-)

diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan
index 23454f3..971de42 100644
--- a/gcc/ChangeLog.asan
+++ b/gcc/ChangeLog.asan
@@ -1,5 +1,25 @@
 2012-10-17  Jakub Jelinek  <jakub@redhat.com>
 
+	* varasm.c: Include asan.h.
+	(assemble_noswitch_variable): Grow size by asan_red_zone_size
+	if decl is asan protected.
+	(place_block_symbol): Likewise.
+	(assemble_variable): If decl is asan protected, increase
+	DECL_ALIGN if needed, and for decls emitted using
+	assemble_variable_contents append padding zeros after it.
+	* Makefile.in (varasm.o): Depend on asan.h.
+	* asan.c: Include output.h.
+	(asan_pp, asan_pp_initialized): New variables.
+	(asan_pp_initialize, asan_pp_string): New functions.
+	(asan_emit_stack_protection): Use asan_pp{,_initialized}
+	instead of local pp{,_initialized} vars, use asan_pp_initialize
+	and asan_pp_string helpers.
+	(asan_needs_local_alias, asan_protect_global,
+	asan_global_struct, asan_add_global): New functions.
+	(asan_finish_file): Protect global vars that can be protected.
+	* asan.h (asan_protect_global): New prototype.
+	(asan_red_zone_size): New inline function.
+
 	* Makefile.in (asan.o): Depend on $(EXPR_H) $(OPTABS_H).
 	(cfgexpand.o): Depend on asan.h.
 	* asan.c: Include expr.h and optabs.h.
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 2743e24..c6a9825 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2721,7 +2721,7 @@ varasm.o : varasm.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \
    output.h $(DIAGNOSTIC_CORE_H) xcoffout.h debug.h $(GGC_H) $(TM_P_H) \
    $(HASHTAB_H) $(TARGET_H) langhooks.h gt-varasm.h $(BASIC_BLOCK_H) \
    $(CGRAPH_H) $(TARGET_DEF_H) tree-mudflap.h \
-   pointer-set.h $(COMMON_TARGET_H)
+   pointer-set.h $(COMMON_TARGET_H) asan.h
 function.o : function.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_ERROR_H) \
    $(TREE_H) $(GIMPLE_H) $(FLAGS_H) $(FUNCTION_H) $(EXPR_H) \
    $(OPTABS_H) $(LIBFUNCS_H) $(REGS_H) hard-reg-set.h insn-config.h $(RECOG_H) \
diff --git a/gcc/asan.c b/gcc/asan.c
index fe0e9a8..c435d35 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -45,6 +45,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "target.h"
 #include "expr.h"
 #include "optabs.h"
+#include "output.h"
 
 /*
  AddressSanitizer finds out-of-bounds and use-after-free bugs 
@@ -87,6 +88,34 @@ alias_set_type asan_shadow_set = -1;
    alias set is used for all shadow memory accesses.  */
 static GTY(()) tree shadow_ptr_types[2];
 
+/* Asan pretty-printer, used for building of the description STRING_CSTs.  */
+static pretty_printer asan_pp;
+static bool asan_pp_initialized;
+
+/* Initialize asan_pp.  */
+
+static void
+asan_pp_initialize (void)
+{
+  pp_construct (&asan_pp, /* prefix */NULL, /* line-width */0);
+  asan_pp_initialized = true;
+}
+
+/* Create ADDR_EXPR of STRING_CST with asan_pp text.  */
+
+static tree
+asan_pp_string (void)
+{
+  const char *buf = pp_base_formatted_text (&asan_pp);
+  size_t len = strlen (buf);
+  tree ret = build_string (len + 1, buf);
+  TREE_TYPE (ret)
+    = build_array_type (char_type_node, build_index_type (size_int (len)));
+  TREE_READONLY (ret) = 1;
+  TREE_STATIC (ret) = 1;
+  return build1 (ADDR_EXPR, build_pointer_type (char_type_node), ret);
+}
+
 /* Return a CONST_INT representing 4 subsequent shadow memory bytes.  */
 
 static rtx
@@ -121,51 +150,38 @@ asan_emit_stack_protection (rtx base, HOST_WIDE_INT *offsets, tree *decls,
   HOST_WIDE_INT last_offset, last_size;
   int l;
   unsigned char cur_shadow_byte = ASAN_STACK_MAGIC_LEFT;
-  static pretty_printer pp;
-  static bool pp_initialized;
-  const char *buf;
-  size_t len;
   tree str_cst;
 
   /* First of all, prepare the description string.  */
-  if (!pp_initialized)
-    {
-      pp_construct (&pp, /* prefix */NULL, /* line-width */0);
-      pp_initialized = true;
-    }
-  pp_clear_output_area (&pp);
+  if (!asan_pp_initialized)
+    asan_pp_initialize ();
+
+  pp_clear_output_area (&asan_pp);
   if (DECL_NAME (current_function_decl))
-    pp_base_tree_identifier (&pp, DECL_NAME (current_function_decl));
+    pp_base_tree_identifier (&asan_pp, DECL_NAME (current_function_decl));
   else
-    pp_string (&pp, "<unknown>");
-  pp_space (&pp);
-  pp_decimal_int (&pp, length / 2 - 1);
-  pp_space (&pp);
+    pp_string (&asan_pp, "<unknown>");
+  pp_space (&asan_pp);
+  pp_decimal_int (&asan_pp, length / 2 - 1);
+  pp_space (&asan_pp);
   for (l = length - 2; l; l -= 2)
     {
       tree decl = decls[l / 2 - 1];
-      pp_wide_integer (&pp, offsets[l] - base_offset);
-      pp_space (&pp);
-      pp_wide_integer (&pp, offsets[l - 1] - offsets[l]);
-      pp_space (&pp);
+      pp_wide_integer (&asan_pp, offsets[l] - base_offset);
+      pp_space (&asan_pp);
+      pp_wide_integer (&asan_pp, offsets[l - 1] - offsets[l]);
+      pp_space (&asan_pp);
       if (DECL_P (decl) && DECL_NAME (decl))
 	{
-	  pp_decimal_int (&pp, IDENTIFIER_LENGTH (DECL_NAME (decl)));
-	  pp_space (&pp);
-	  pp_base_tree_identifier (&pp, DECL_NAME (decl));
+	  pp_decimal_int (&asan_pp, IDENTIFIER_LENGTH (DECL_NAME (decl)));
+	  pp_space (&asan_pp);
+	  pp_base_tree_identifier (&asan_pp, DECL_NAME (decl));
 	}
       else
-	pp_string (&pp, "9 <unknown>");
-      pp_space (&pp);
+	pp_string (&asan_pp, "9 <unknown>");
+      pp_space (&asan_pp);
     }
-  buf = pp_base_formatted_text (&pp);
-  len = strlen (buf);
-  str_cst = build_string (len + 1, buf);
-  TREE_TYPE (str_cst)
-    = build_array_type (char_type_node, build_index_type (size_int (len)));
-  TREE_READONLY (str_cst) = 1;
-  TREE_STATIC (str_cst) = 1;
-  str_cst = build1 (ADDR_EXPR, build_pointer_type (char_type_node), str_cst);
+  str_cst = asan_pp_string ();
 
   /* Emit the prologue sequence.  */
   base = expand_binop (Pmode, add_optab, base, GEN_INT (base_offset),
@@ -270,6 +286,75 @@ asan_emit_stack_protection (rtx base, HOST_WIDE_INT *offsets, tree *decls,
   return ret;
 }
 
+/* Return true if DECL, a global var, might be overridden and needs
+   therefore a local alias.  */
+
+static bool
+asan_needs_local_alias (tree decl)
+{
+  return DECL_WEAK (decl) || !targetm.binds_local_p (decl);
+}
+
+/* Return true if DECL is a VAR_DECL that should be protected
+   by Address Sanitizer, by appending a red zone with protected
+   shadow memory after it and aligning it to at least
+   ASAN_RED_ZONE_SIZE bytes.  */
+
+bool
+asan_protect_global (tree decl)
+{
+  rtx rtl, symbol;
+  section *sect;
+
+  if (TREE_CODE (decl) != VAR_DECL
+      /* TLS vars aren't statically protectable.  */
+      || DECL_THREAD_LOCAL_P (decl)
+      /* Externs will be protected elsewhere.  */
+      || DECL_EXTERNAL (decl)
+      || !TREE_ASM_WRITTEN (decl)
+      || !DECL_RTL_SET_P (decl)
+      /* Comdat vars pose an ABI problem, we can't know if
+	 the var that is selected by the linker will have
+	 padding or not.  */
+      || DECL_ONE_ONLY (decl)
+      /* Similarly for common vars.  People can use -fno-common.  */
+      || DECL_COMMON (decl)
+      /* Don't protect if using user section, often vars placed
+	 into user section from multiple TUs are then assumed
+	 to be an array of such vars, putting padding in there
+	 breaks this assumption.  */
+      || (DECL_SECTION_NAME (decl) != NULL_TREE
+	  && !DECL_HAS_IMPLICIT_SECTION_NAME_P (decl))
+      || DECL_SIZE (decl) == 0
+      || ASAN_RED_ZONE_SIZE * BITS_PER_UNIT > MAX_OFILE_ALIGNMENT
+      || !valid_constant_size_p (DECL_SIZE_UNIT (decl))
+      || DECL_ALIGN_UNIT (decl) > 2 * ASAN_RED_ZONE_SIZE)
+    return false;
+
+  rtl = DECL_RTL (decl);
+  if (!MEM_P (rtl) || GET_CODE (XEXP (rtl, 0)) != SYMBOL_REF)
+    return false;
+  symbol = XEXP (rtl, 0);
+
+  if (CONSTANT_POOL_ADDRESS_P (symbol)
+      || TREE_CONSTANT_POOL_ADDRESS_P (symbol))
+    return false;
+
+  sect = get_variable_section (decl, false);
+  if (sect->common.flags & SECTION_COMMON)
+    return false;
+
+  if (lookup_attribute ("weakref", DECL_ATTRIBUTES (decl)))
+    return false;
+
+#ifndef ASM_OUTPUT_DEF
+  if (asan_needs_local_alias (decl))
+    return false;
+#endif
+
+  return true;    
+}
+
 /* Construct a function tree for __asan_report_{load,store}{1,2,4,8,16}.
    IS_STORE is either 1 (for a store) or 0 (for a load).
    SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
@@ -568,6 +653,101 @@ transform_statements (void)
     }
 }
 
+/* Build
+   struct __asan_global
+   {
+     const void *__beg;
+     uptr __size;
+     uptr __size_with_redzone;
+     const void *__name;
+     uptr __has_dynamic_init;
+   } type.  */
+
+static tree
+asan_global_struct (void)
+{
+  static const char *field_names[5]
+    = { "__beg", "__size", "__size_with_redzone",
+	"__name", "__has_dynamic_init" };
+  tree fields[5], ret;
+  int i;
+
+  ret = make_node (RECORD_TYPE);
+  for (i = 0; i < 5; i++)
+    {
+      fields[i]
+	= build_decl (UNKNOWN_LOCATION, FIELD_DECL,
+		      get_identifier (field_names[i]),
+		      (i == 0 || i == 3) ? const_ptr_type_node
+		      : build_nonstandard_integer_type (POINTER_SIZE, 1));
+      DECL_CONTEXT (fields[i]) = ret;
+      if (i)
+	DECL_CHAIN (fields[i - 1]) = fields[i];
+    }
+  TYPE_FIELDS (ret) = fields[0];
+  TYPE_NAME (ret) = get_identifier ("__asan_global");
+  layout_type (ret);
+  return ret;
+}
+
+/* Append description of a single global DECL into vector V.
+   TYPE is __asan_global struct type as returned by asan_global_struct.  */
+
+static void
+asan_add_global (tree decl, tree type, VEC(constructor_elt, gc) *v)
+{
+  tree init, uptr = TREE_TYPE (DECL_CHAIN (TYPE_FIELDS (type)));
+  unsigned HOST_WIDE_INT size;
+  tree str_cst, refdecl = decl;
+  VEC(constructor_elt, gc) *vinner = NULL;
+
+  if (!asan_pp_initialized)
+    asan_pp_initialize ();
+
+  pp_clear_output_area (&asan_pp);
+  if (DECL_NAME (decl))
+    pp_base_tree_identifier (&asan_pp, DECL_NAME (decl));
+  else
+    pp_string (&asan_pp, "<unknown>");
+  pp_space (&asan_pp);
+  pp_left_paren (&asan_pp);
+  pp_string (&asan_pp, main_input_filename);
+  pp_right_paren (&asan_pp);
+  str_cst = asan_pp_string ();
+
+  if (asan_needs_local_alias (decl))
+    {
+      char buf[20];
+      ASM_GENERATE_INTERNAL_LABEL (buf, "LASAN",
+				   VEC_length (constructor_elt, v) + 1);
+      refdecl = build_decl (DECL_SOURCE_LOCATION (decl),
+			    VAR_DECL, get_identifier (buf), TREE_TYPE (decl));
+      TREE_ADDRESSABLE (refdecl) = TREE_ADDRESSABLE (decl);
+      TREE_READONLY (refdecl) = TREE_READONLY (decl);
+      TREE_THIS_VOLATILE (refdecl) = TREE_THIS_VOLATILE (decl);
+      DECL_GIMPLE_REG_P (refdecl) = DECL_GIMPLE_REG_P (decl);
+      DECL_ARTIFICIAL (refdecl) = DECL_ARTIFICIAL (decl);
+      DECL_IGNORED_P (refdecl) = DECL_IGNORED_P (decl);
+      TREE_STATIC (refdecl) = 1;
+      TREE_PUBLIC (refdecl) = 0;
+      TREE_USED (refdecl) = 1;
+      assemble_alias (refdecl, DECL_ASSEMBLER_NAME (decl));
+    }
+
+  CONSTRUCTOR_APPEND_ELT (vinner, NULL_TREE,
+			  fold_convert (const_ptr_type_node,
+					build_fold_addr_expr (refdecl)));
+  size = tree_low_cst (DECL_SIZE_UNIT (decl), 1);
+  CONSTRUCTOR_APPEND_ELT (vinner, NULL_TREE, build_int_cst (uptr, size));
+  size += asan_red_zone_size (size);
+  CONSTRUCTOR_APPEND_ELT (vinner, NULL_TREE, build_int_cst (uptr, size));
+  CONSTRUCTOR_APPEND_ELT (vinner, NULL_TREE,
+			  fold_convert (const_ptr_type_node, str_cst));
+  CONSTRUCTOR_APPEND_ELT (vinner, NULL_TREE, build_int_cst (uptr, 0));
+  init = build_constructor (type, vinner);
+  CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, init);
+}
+
 /* Module-level instrumentation.
    - Insert __asan_init() into the list of CTORs.
    - TODO: insert redzones around globals.
@@ -577,10 +757,61 @@ void
 asan_finish_file (void)
 {
   tree ctor_statements = NULL_TREE;
+  struct varpool_node *vnode;
+  unsigned HOST_WIDE_INT gcount = 0;
+
   append_to_statement_list (build_call_expr (asan_init_func (), 0),
-                            &ctor_statements);
+			    &ctor_statements);
+  FOR_EACH_DEFINED_VARIABLE (vnode)
+    if (asan_protect_global (vnode->symbol.decl))
+      ++gcount;
+  if (gcount)
+    {
+      tree type = asan_global_struct (), var, ctor, decl;
+      tree uptr = build_nonstandard_integer_type (POINTER_SIZE, 1);
+      tree dtor_statements = NULL_TREE;
+      VEC(constructor_elt, gc) *v;
+      char buf[20];
+
+      type = build_array_type_nelts (type, gcount);
+      ASM_GENERATE_INTERNAL_LABEL (buf, "LASAN", 0);
+      var = build_decl (UNKNOWN_LOCATION, VAR_DECL, get_identifier (buf),
+			type);
+      TREE_STATIC (var) = 1;
+      TREE_PUBLIC (var) = 0;
+      DECL_ARTIFICIAL (var) = 1;
+      DECL_IGNORED_P (var) = 1;
+      v = VEC_alloc (constructor_elt, gc, gcount);
+      FOR_EACH_DEFINED_VARIABLE (vnode)
+	if (asan_protect_global (vnode->symbol.decl))
+	  asan_add_global (vnode->symbol.decl, TREE_TYPE (type), v);
+      ctor = build_constructor (type, v);
+      TREE_CONSTANT (ctor) = 1;
+      TREE_STATIC (ctor) = 1;
+      DECL_INITIAL (var) = ctor;
+      varpool_assemble_decl (varpool_node (var));
+
+      type = build_function_type_list (void_type_node,
+				       build_pointer_type (TREE_TYPE (type)),
+				       uptr, NULL_TREE);
+      decl = build_fn_decl ("__asan_register_globals", type);
+      TREE_NOTHROW (decl) = 1;
+      append_to_statement_list (build_call_expr (decl, 2,
+						 build_fold_addr_expr (var),
+						 build_int_cst (uptr, gcount)),
+				&ctor_statements);
+
+      decl = build_fn_decl ("__asan_unregister_globals", type);
+      TREE_NOTHROW (decl) = 1;
+      append_to_statement_list (build_call_expr (decl, 2,
+						 build_fold_addr_expr (var),
+						 build_int_cst (uptr, gcount)),
+				&dtor_statements);
+      cgraph_build_static_cdtor ('D', dtor_statements,
+				 MAX_RESERVED_INIT_PRIORITY - 1);
+    }
   cgraph_build_static_cdtor ('I', ctor_statements,
-                             MAX_RESERVED_INIT_PRIORITY - 1);
+			     MAX_RESERVED_INIT_PRIORITY - 1);
 }
 
 /* Initialize shadow_ptr_types array.  */
diff --git a/gcc/asan.h b/gcc/asan.h
index 6f0edbf..d9368a8 100644
--- a/gcc/asan.h
+++ b/gcc/asan.h
@@ -23,6 +23,7 @@ along with GCC; see the file COPYING3.  If not see
 
 extern void asan_finish_file (void);
 extern rtx asan_emit_stack_protection (rtx, HOST_WIDE_INT *, tree *, int);
+extern bool asan_protect_global (tree);
 
 /* Alias set for accessing the shadow memory.  */
 extern alias_set_type asan_shadow_set;
@@ -56,4 +57,14 @@ asan_protect_stack_decl (tree decl)
   return DECL_P (decl) && !DECL_ARTIFICIAL (decl);
 }
 
+/* Return the size of padding needed to insert after a protected
+   decl of SIZE.  */
+
+static inline unsigned int
+asan_red_zone_size (unsigned int size)
+{
+  unsigned int c = size & (ASAN_RED_ZONE_SIZE - 1);
+  return c ? 2 * ASAN_RED_ZONE_SIZE - c : ASAN_RED_ZONE_SIZE;
+}
+
 #endif /* TREE_ASAN */
diff --git a/gcc/varasm.c b/gcc/varasm.c
index b300348..8a533ed 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -51,6 +51,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-mudflap.h"
 #include "cgraph.h"
 #include "pointer-set.h"
+#include "asan.h"
 
 #ifdef XCOFF_DEBUGGING_INFO
 #include "xcoffout.h"		/* Needed for external data
@@ -1831,6 +1832,9 @@ assemble_noswitch_variable (tree decl, const char *name, section *sect)
   size = tree_low_cst (DECL_SIZE_UNIT (decl), 1);
   rounded = size;
 
+  if (flag_asan && asan_protect_global (decl))
+    size += asan_red_zone_size (size);
+
   /* Don't allocate zero bytes of common,
      since that means "undefined external" in the linker.  */
   if (size == 0)
@@ -1897,6 +1901,7 @@ assemble_variable (tree decl, int top_level ATTRIBUTE_UNUSED,
   const char *name;
   rtx decl_rtl, symbol;
   section *sect;
+  bool asan_protected = false;
 
   /* This function is supposed to handle VARIABLES.  Ensure we have one.  */
   gcc_assert (TREE_CODE (decl) == VAR_DECL);
@@ -1984,6 +1989,15 @@ assemble_variable (tree decl, int top_level ATTRIBUTE_UNUSED,
   /* Compute the alignment of this data.  */
 
   align_variable (decl, dont_output_data);
+
+  if (flag_asan
+      && asan_protect_global (decl)
+      && DECL_ALIGN (decl) < ASAN_RED_ZONE_SIZE * BITS_PER_UNIT)
+    {
+      asan_protected = true;
+      DECL_ALIGN (decl) = ASAN_RED_ZONE_SIZE * BITS_PER_UNIT;
+    }
+
   set_mem_align (decl_rtl, DECL_ALIGN (decl));
 
   if (TREE_PUBLIC (decl))
@@ -2022,6 +2036,12 @@ assemble_variable (tree decl, int top_level ATTRIBUTE_UNUSED,
       if (DECL_ALIGN (decl) > BITS_PER_UNIT)
 	ASM_OUTPUT_ALIGN (asm_out_file, floor_log2 (DECL_ALIGN_UNIT (decl)));
       assemble_variable_contents (decl, name, dont_output_data);
+      if (asan_protected)
+	{
+	  unsigned HOST_WIDE_INT int size
+	    = tree_low_cst (DECL_SIZE_UNIT (decl), 1);
+	  assemble_zeros (asan_red_zone_size (size));
+	}
     }
 }
 
@@ -6926,6 +6946,8 @@ place_block_symbol (rtx symbol)
       decl = SYMBOL_REF_DECL (symbol);
       alignment = DECL_ALIGN (decl);
       size = tree_low_cst (DECL_SIZE_UNIT (decl), 1);
+      if (flag_asan && asan_protect_global (decl))
+	size += asan_red_zone_size (size);
     }
 
   /* Calculate the object's offset from the start of the block.  */
-- 
1.7.11.7
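
For anyone who wants to sanity-check the padding arithmetic of
asan_red_zone_size in the asan.h hunk above, here is a minimal
standalone sketch.  It assumes a red zone size of 32 bytes; the real
value comes from ASAN_RED_ZONE_SIZE, which is defined outside the hunks
shown here, so treat that number as an assumption.  The name
toy_red_zone_size is hypothetical; its body copies the patch's function.

```c
/* Assumption: the red zone granule is 32 bytes.  The real value is
   ASAN_RED_ZONE_SIZE from asan.h.  */
#define TOY_RED_ZONE_SIZE 32u

/* Same arithmetic as asan_red_zone_size in the patch: return the
   number of padding bytes to place after an object of SIZE bytes so
   that SIZE plus padding is a multiple of the red zone size and the
   padding is at least one full red zone.  */
static unsigned int
toy_red_zone_size (unsigned int size)
{
  unsigned int c = size & (TOY_RED_ZONE_SIZE - 1);
  return c ? 2 * TOY_RED_ZONE_SIZE - c : TOY_RED_ZONE_SIZE;
}
```

Under that assumption a 40-byte global gets 56 bytes of padding (96
bytes in total) and a 64-byte global gets exactly 32, so the padded
size is always a multiple of the red zone size.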


* [PATCH 08/13] Fix a couple of ICEs.
  2012-11-01 19:53 [PATCH 00/13] Request to merge Address Sanitizer in dodji
@ 2012-11-01 19:53 ` dodji
  2012-11-01 19:53 ` [PATCH 10/13] Make build_check_stmt accept an SSA_NAME for its base dodji
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 80+ messages in thread
From: dodji @ 2012-11-01 19:53 UTC (permalink / raw)
  To: gcc-patches; +Cc: dnovillo, jakub, wmi, davidxl, konstantin.s.serebryany

From: jakub <jakub@138bc75d-0d04-0410-961f-82ee72b054a4>

The previous patches uncovered the fact that a NOTE_INSN_BASIC_BLOCK
could show up in the middle of a basic block, thus violating an
important invariant.  The cfgexpand.c hunk fixes that.

Then it appeared that we could get a tree sharing violation if
build_check_stmt doesn't unshare its base memory parameter before
building an ssa name for it.

The last hunk fixes a crash that happens because
cgraph_build_static_cdtor can call ggc_collect, so holding trees
around in automatic variables (which are not visited by the gc marker
routines) could lead to these trees being freed underneath us.  So the
patch adds new gc roots for these trees.

	* asan.c (build_check_stmt): Unshare base.

	* asan.c (asan_ctor_statements): New variable.
	(asan_finish_file): Use asan_ctor_statements instead
	of ctor_statements.

	* cfgexpand.c (gimple_expand_cfg): If return_label is
	followed by NOTE_INSN_BASIC_BLOCK, emit var_ret_seq
	after the note instead of before it.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192567 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.asan | 12 ++++++++++++
 gcc/asan.c         | 13 +++++++++----
 gcc/cfgexpand.c    |  8 +++++++-
 3 files changed, 28 insertions(+), 5 deletions(-)

diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan
index 971de42..3da0a0b 100644
--- a/gcc/ChangeLog.asan
+++ b/gcc/ChangeLog.asan
@@ -1,3 +1,15 @@
+2012-10-18  Jakub Jelinek  <jakub@redhat.com>
+
+	* asan.c (build_check_stmt): Unshare base.
+
+	* asan.c (asan_ctor_statements): New variable.
+	(asan_finish_file): Use asan_ctor_statements instead
+	of ctor_statements.
+
+	* cfgexpand.c (gimple_expand_cfg): If return_label is
+	followed by NOTE_INSN_BASIC_BLOCK, emit var_ret_seq
+	after the note instead of before it.
+
 2012-10-17  Jakub Jelinek  <jakub@redhat.com>
 
 	* varasm.c: Include asan.h.
diff --git a/gcc/asan.c b/gcc/asan.c
index c435d35..6715e51 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -459,6 +459,8 @@ build_check_stmt (tree base,
       set_immediate_dominator (CDI_DOMINATORS, else_bb, cond_bb);
     }
 
+  base = unshare_expr (base);
+
   gsi = gsi_last_bb (cond_bb);
   g = gimple_build_assign_with_ops (TREE_CODE (base),
 				    make_ssa_name (TREE_TYPE (base), NULL),
@@ -748,6 +750,10 @@ asan_add_global (tree decl, tree type, VEC(constructor_elt, gc) *v)
   CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, init);
 }
 
+/* Needs to be GTY(()), because cgraph_build_static_cdtor may
+   invoke ggc_collect.  */
+static GTY(()) tree asan_ctor_statements;
+
 /* Module-level instrumentation.
    - Insert __asan_init() into the list of CTORs.
    - TODO: insert redzones around globals.
@@ -756,12 +762,11 @@ asan_add_global (tree decl, tree type, VEC(constructor_elt, gc) *v)
 void
 asan_finish_file (void)
 {
-  tree ctor_statements = NULL_TREE;
   struct varpool_node *vnode;
   unsigned HOST_WIDE_INT gcount = 0;
 
   append_to_statement_list (build_call_expr (asan_init_func (), 0),
-			    &ctor_statements);
+			    &asan_ctor_statements);
   FOR_EACH_DEFINED_VARIABLE (vnode)
     if (asan_protect_global (vnode->symbol.decl))
       ++gcount;
@@ -799,7 +804,7 @@ asan_finish_file (void)
       append_to_statement_list (build_call_expr (decl, 2,
 						 build_fold_addr_expr (var),
 						 build_int_cst (uptr, gcount)),
-				&ctor_statements);
+				&asan_ctor_statements);
 
       decl = build_fn_decl ("__asan_unregister_globals", type);
       TREE_NOTHROW (decl) = 1;
@@ -810,7 +815,7 @@ asan_finish_file (void)
       cgraph_build_static_cdtor ('D', dtor_statements,
 				 MAX_RESERVED_INIT_PRIORITY - 1);
     }
-  cgraph_build_static_cdtor ('I', ctor_statements,
+  cgraph_build_static_cdtor ('I', asan_ctor_statements,
 			     MAX_RESERVED_INIT_PRIORITY - 1);
 }
 
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 67cf902..16fd0fb 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -4638,7 +4638,13 @@ gimple_expand_cfg (void)
   insn_locations_finalize ();
 
   if (var_ret_seq)
-    emit_insn_after (var_ret_seq, return_label);
+    {
+      rtx after = return_label;
+      rtx next = NEXT_INSN (after);
+      if (next && NOTE_INSN_BASIC_BLOCK_P (next))
+	after = next;
+      emit_insn_after (var_ret_seq, after);
+    }
 
   /* Zap the tree EH table.  */
   set_eh_throw_stmt_table (cfun, NULL);
-- 
1.7.11.7


* [PATCH 11/13] Factorize condition insertion code out of build_check_stmt
  2012-11-01 19:53 [PATCH 00/13] Request to merge Address Sanitizer in dodji
                   ` (6 preceding siblings ...)
  2012-11-01 19:53 ` [PATCH 03/13] Initial asan cleanups dodji
@ 2012-11-01 19:53 ` dodji
  2012-11-01 19:53 ` [PATCH 07/13] Implement protection of global variables dodji
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 80+ messages in thread
From: dodji @ 2012-11-01 19:53 UTC (permalink / raw)
  To: gcc-patches; +Cc: dnovillo, jakub, wmi, davidxl, konstantin.s.serebryany

From: dodji <dodji@138bc75d-0d04-0410-961f-82ee72b054a4>

This patch splits a new create_cond_insert_point_before_iter function
out of build_check_stmt, to be used by a later patch.
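
As a rough illustration of how the new function splits the branch
probability between the two edges leaving the condition block, the
sketch below mirrors that arithmetic outside of GCC.  It assumes
REG_BR_PROB_BASE is 10000; struct toy_edge_probs and
toy_split_probabilities are hypothetical names, not part of the patch.

```c
/* Assumption: GCC's branch probability base is 10000.  */
#define REG_BR_PROB_BASE	10000
#define PROB_VERY_UNLIKELY	(REG_BR_PROB_BASE / 2000 - 1)
#define PROB_ALWAYS		(REG_BR_PROB_BASE)

/* Hypothetical holder for the two edge probabilities.  */
struct toy_edge_probs
{
  int then_edge;
  int fallthrough_edge;
};

/* Mirror the probability computation of
   create_cond_insert_point_before_iter: the two edges always sum to
   PROB_ALWAYS, and THEN_MORE_LIKELY_P picks which one is the very
   unlikely one.  */
static struct toy_edge_probs
toy_split_probabilities (int then_more_likely_p)
{
  struct toy_edge_probs p;
  p.fallthrough_edge = then_more_likely_p
		       ? PROB_VERY_UNLIKELY
		       : PROB_ALWAYS - PROB_VERY_UNLIKELY;
  p.then_edge = PROB_ALWAYS - p.fallthrough_edge;
  return p;
}
```

With then_more_likely_p false, as build_check_stmt passes it, the
'then' edge (the error-reporting path) gets PROB_VERY_UNLIKELY and the
fallthrough edge gets the rest, which matches the values the old
inline code assigned.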

Tested by running cc1 -fasan on the test program below with and
without the patch and by inspecting the gimple output to see that
there is no change.

void
foo ()
{
  char foo[1] = {0};

  foo[0] = 1;
}

gcc/

	* asan.c (create_cond_insert_point_before_iter): Factorize out of ...
	(build_check_stmt): ... here.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192844 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.asan |   3 ++
 gcc/asan.c         | 120 +++++++++++++++++++++++++++++++++--------------------
 2 files changed, 79 insertions(+), 44 deletions(-)

diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan
index 9159b3f..0e0b9b8 100644
--- a/gcc/ChangeLog.asan
+++ b/gcc/ChangeLog.asan
@@ -1,5 +1,8 @@
 2012-10-26  Dodji Seketeli  <dodji@redhat.com>
 
+	* asan.c (create_cond_insert_point_before_iter): Factorize out of ...
+	(build_check_stmt): ... here.
+
 	* asan.c (build_check_stmt): Accept the memory access to be
 	represented by an SSA_NAME.
 
diff --git a/gcc/asan.c b/gcc/asan.c
index b43f03b..736286e 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -397,6 +397,75 @@ asan_init_func (void)
 #define PROB_VERY_UNLIKELY	(REG_BR_PROB_BASE / 2000 - 1)
 #define PROB_ALWAYS		(REG_BR_PROB_BASE)
 
+/* Split the current basic block and create a condition statement
+   insertion point right before the statement pointed to by ITER.
+   Return an iterator to the point at which the caller might safely
+   insert the condition statement.
+
+   THEN_BLOCK must be set to the address of an uninitialized instance
+   of basic_block.  The function will then set *THEN_BLOCK to the
+   'then block' of the condition statement to be inserted by the
+   caller.
+
+   Similarly, the function will set *FALLTHROUGH_BLOCK to the 'else
+   block' of the condition statement to be inserted by the caller.
+
+   Note that *FALLTHROUGH_BLOCK is a new block that contains the
+   statements starting from *ITER, and *THEN_BLOCK is a new empty
+   block.
+
+   *ITER is adjusted to still point to the same statement it was
+   pointing to initially.  */
+
+static gimple_stmt_iterator
+create_cond_insert_point_before_iter (gimple_stmt_iterator *iter,
+				      bool then_more_likely_p,
+				      basic_block *then_block,
+				      basic_block *fallthrough_block)
+{
+  gimple_stmt_iterator gsi = *iter;
+
+  if (!gsi_end_p (gsi))
+    gsi_prev (&gsi);
+
+  basic_block cur_bb = gsi_bb (*iter);
+
+  edge e = split_block (cur_bb, gsi_stmt (gsi));
+
+  /* Get a hold on the 'condition block', the 'then block' and the
+     'else block'.  */
+  basic_block cond_bb = e->src;
+  basic_block fallthru_bb = e->dest;
+  basic_block then_bb = create_empty_bb (cond_bb);
+
+  /* Set up the newly created 'then block'.  */
+  e = make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE);
+  int fallthrough_probability =
+    then_more_likely_p
+    ? PROB_VERY_UNLIKELY
+    : PROB_ALWAYS - PROB_VERY_UNLIKELY;
+  e->probability = PROB_ALWAYS - fallthrough_probability;
+  make_single_succ_edge (then_bb, fallthru_bb, EDGE_FALLTHRU);
+
+  /* Set up the fallthrough basic block.  */
+  e = find_edge (cond_bb, fallthru_bb);
+  e->flags = EDGE_FALSE_VALUE;
+  e->count = cond_bb->count;
+  e->probability = fallthrough_probability;
+
+  /* Update dominance info for the newly created then_bb; note that
+     fallthru_bb's dominance info has already been updated by
+     split_block.  */
+  if (dom_info_available_p (CDI_DOMINATORS))
+    set_immediate_dominator (CDI_DOMINATORS, then_bb, cond_bb);
+
+  *then_block = then_bb;
+  *fallthrough_block = fallthru_bb;
+  *iter = gsi_start_bb (fallthru_bb);
+
+  return gsi_last_bb (cond_bb);
+}
+
 /* Instrument the memory access instruction BASE.  Insert new
    statements before ITER.
 
@@ -411,8 +480,7 @@ build_check_stmt (tree base, gimple_stmt_iterator *iter,
 		  int size_in_bytes)
 {
   gimple_stmt_iterator gsi;
-  basic_block cond_bb, then_bb, else_bb;
-  edge e;
+  basic_block then_bb, else_bb;
   tree t, base_addr, shadow;
   gimple g;
   tree shadow_ptr_type = shadow_ptr_types[size_in_bytes == 16 ? 1 : 0];
@@ -421,51 +489,15 @@ build_check_stmt (tree base, gimple_stmt_iterator *iter,
     = build_nonstandard_integer_type (TYPE_PRECISION (TREE_TYPE (base)), 1);
   tree base_ssa = base;
 
-  /* We first need to split the current basic block, and start altering
-     the CFG.  This allows us to insert the statements we're about to
-     construct into the right basic blocks.  */
-
-  cond_bb = gimple_bb (gsi_stmt (*iter));
-  gsi = *iter;
-  gsi_prev (&gsi);
-  if (!gsi_end_p (gsi))
-    e = split_block (cond_bb, gsi_stmt (gsi));
-  else
-    e = split_block_after_labels (cond_bb);
-  cond_bb = e->src;
-  else_bb = e->dest;
-
-  /* A recap at this point: else_bb is the basic block at whose head
-     is the gimple statement for which this check expression is being
-     built.  cond_bb is the (possibly new, synthetic) basic block the
-     end of which will contain the cache-lookup code, and a
-     conditional that jumps to the cache-miss code or, much more
-     likely, over to else_bb.  */
-
-  /* Create the bb that contains the crash block.  */
-  then_bb = create_empty_bb (cond_bb);
-  e = make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE);
-  e->probability = PROB_VERY_UNLIKELY;
-  make_single_succ_edge (then_bb, else_bb, EDGE_FALLTHRU);
-
-  /* Mark the pseudo-fallthrough edge from cond_bb to else_bb.  */
-  e = find_edge (cond_bb, else_bb);
-  e->flags = EDGE_FALSE_VALUE;
-  e->count = cond_bb->count;
-  e->probability = PROB_ALWAYS - PROB_VERY_UNLIKELY;
-
-  /* Update dominance info.  Note that bb_join's data was
-     updated by split_block.  */
-  if (dom_info_available_p (CDI_DOMINATORS))
-    {
-      set_immediate_dominator (CDI_DOMINATORS, then_bb, cond_bb);
-      set_immediate_dominator (CDI_DOMINATORS, else_bb, cond_bb);
-    }
+  /* Get an iterator on the point where we can add the condition
+     statement for the instrumentation.  */
+  gsi = create_cond_insert_point_before_iter (iter,
+					      /*then_more_likely_p=*/false,
+					      &then_bb,
+					      &else_bb);
 
   base = unshare_expr (base);
 
-  gsi = gsi_last_bb (cond_bb);
-
   /* BASE can already be an SSA_NAME; in that case, do not create a
      new SSA_NAME for it.  */
   if (TREE_CODE (base) != SSA_NAME)
-- 
1.7.11.7


* [PATCH 00/13] Request to merge Address Sanitizer in
@ 2012-11-01 19:53 dodji
  2012-11-01 19:53 ` [PATCH 08/13] Fix a couple of ICEs dodji
                   ` (14 more replies)
  0 siblings, 15 replies; 80+ messages in thread
From: dodji @ 2012-11-01 19:53 UTC (permalink / raw)
  To: gcc-patches
  Cc: dnovillo, jakub, wmi, davidxl, konstantin.s.serebryany, Dodji Seketeli

From: Dodji Seketeli <dodji@seketeli.org>

Hello,

The set of patches following this message represents the work that
happened on the asan branch to build up the Address Sanitizer work
started in the Google branch.

Address Sanitizer (aka asan) is a memory error detector.  It finds
use-after-free and {heap,stack,global}-buffer overflow bugs in C/C++
programs.

One can learn about the way it works by reading the pdf slides at [1],
or by reading the documentation on the wiki page of the project at [2].

To make a long story short, it works by associating each memory region
of eight consecutive bytes with a shadow byte that tells whether each
byte of the memory region is addressable or not.  So, conceptually,
there is a function 'MemToShadow' which, for each set of eight
contiguous bytes of memory, returns a shadow byte that tells whether
each byte is accessible or not.
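
Conceptually, MemToShadow boils down to a shift and an add.  The
sketch below is only an illustration: the shift of 3 matches the
eight-bytes-per-shadow-byte granularity described above, the shadow
offset is a target-specific constant supplied by the caller, and
toy_mem_to_shadow is a hypothetical name.

```c
#include <stdint.h>

/* Eight bytes of application memory per shadow byte.  */
#define TOY_SHADOW_SHIFT 3

/* Return the address of the shadow byte that describes ADDR.
   SHADOW_OFFSET is the target-specific base of the shadow region;
   it is a parameter here because its value is an assumption.  */
static uintptr_t
toy_mem_to_shadow (uintptr_t addr, uintptr_t shadow_offset)
{
  return (addr >> TOY_SHADOW_SHIFT) + shadow_offset;
}
```

For instance, with a made-up offset of 0x20000000, addresses 0x1000
through 0x1007 all map to shadow address 0x20000200, and 0x1008 maps
to 0x20000201.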

Then, each memory access is instrumented by the asan pass to retrieve
the shadow byte of the accessed memory; if the access is to a memory
address that is deemed non-accessible, a call to an asan runtime
library function is issued to report a meaningful error to the user,
and the access is performed, letting the user program proceed despite
the error.
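
The check emitted before each access can be modelled roughly as
follows.  This is a sketch under the shadow encoding described in the
algorithm documentation at [2]: 0 means all eight bytes are
addressable, a value k from 1 to 7 means only the first k bytes are,
and a negative value means none are.  toy_access_is_bad is a
hypothetical name, and accesses that straddle two eight-byte regions
are not modelled.

```c
#include <stdint.h>

/* Return nonzero if an access of SIZE bytes (SIZE >= 1) at ADDR is
   invalid according to SHADOW_VALUE, the shadow byte of the
   eight-byte region containing ADDR.  */
static int
toy_access_is_bad (int8_t shadow_value, uintptr_t addr, unsigned int size)
{
  if (shadow_value == 0)
    return 0;			/* Whole region is addressable.  */
  if (shadow_value < 0)
    return 1;			/* Whole region is poisoned.  */

  /* Only the first SHADOW_VALUE bytes are addressable: the access
     is bad if its last byte falls at or beyond that limit.  */
  uintptr_t last = (addr & 7) + size - 1;
  return last >= (uintptr_t) shadow_value;
}
```

With a shadow value of 5, a 4-byte access at offset 0 of the region is
fine (its last byte is at offset 3), while the same access at offset 2
is reported (its last byte is at offset 5).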

The advantage of this approach, compared to, say, Valgrind [4], is the
lower time and space overhead.  Eventually, when this tool becomes
more solid, it'll become complementary to Valgrind.

Apart from the compiler components, asan needs a runtime library to
function.  We share that library with the LLVM implementation of asan
that is described at [3].  The last patch of the set imports this
library in its pristine form into our tree.  The plan is to regularly
synchronize it with its LLVM upstream repository.

On behalf of the GCC asan developers listed below, I am thus proposing
these patches for inclusion into trunk.  I chose to follow the
chronological commits that happened on the [asan] branch, to ease the
authorship propagation.  With a few exceptions, each of these commits
is reasonably logically atomic, so they hopefully shouldn't be too
hard to review.

The first patch is the initial import of the asan state from the
Google branch into the [asan] branch.  Subsequent patches clean the
code up, add features like protection of stack and global variables,
instrumentation of memory access through built-in functions, and, last
but not least, the import of the runtime library.

Please note that the ChangeLog.asan is meant to disappear at commit
time, as its content will be updated (for the dates) and prepended to
the normal ChangeLog file.

One noticeable shortcoming that we have at the moment is the lack of a
DejaGNU test harness for this.  This is planned to be addressed as
soon as possible.

Please find below a summary of the patches of the set.

Thanks.

[1]: http://gcc.gnu.org/wiki/cauldron2012?action=AttachFile&do=get&target=kcc.pdf
[2]: http://code.google.com/p/address-sanitizer/wiki/AddressSanitizerAlgorithm
[3]: http://code.google.com/p/address-sanitizer/w/list
[4]: http://www.valgrind.org

Diego Novillo (2):
  Initial import of asan from the Google branch
  Rename tree-asan.[ch] to asan.[ch]

Dodji Seketeli (3):
  Make build_check_stmt accept an SSA_NAME for its base
  Factorize condition insertion code out of build_check_stmt
  Instrument built-in memory access function calls

Jakub Jelinek (6):
  Initial asan cleanups
  Emit GIMPLE directly instead of gimplifying GENERIC.
  Allow asan at -O0
  Implement protection of stack variables
  Implement protection of global variables
  Fix a couple of ICEs.

Wei Mi (2):
  Don't forget to protect 32 bytes aligned global variables.
  Import the asan runtime library into GCC tree

 ChangeLog.asan                                     |     7 +
 Makefile.def                                       |     2 +
 Makefile.in                                        |   487 +-
 configure                                          |     1 +
 configure.ac                                       |     1 +
 gcc/ChangeLog.asan                                 |   175 +
 gcc/Makefile.in                                    |    10 +-
 gcc/asan.c                                         |  1495 ++
 gcc/asan.h                                         |    70 +
 gcc/cfgexpand.c                                    |   165 +-
 gcc/common.opt                                     |     4 +
 gcc/config/i386/i386.c                             |    11 +
 gcc/doc/invoke.texi                                |     8 +-
 gcc/doc/tm.texi                                    |     6 +
 gcc/doc/tm.texi.in                                 |     2 +
 gcc/gcc.c                                          |     1 +
 gcc/passes.c                                       |     2 +
 gcc/target.def                                     |    11 +
 gcc/toplev.c                                       |    14 +
 gcc/tree-pass.h                                    |     2 +
 gcc/varasm.c                                       |    22 +
 libasan/ChangeLog.asan                             |     3 +
 libasan/LICENSE.TXT                                |    97 +
 libasan/Makefile.am                                |    98 +
 libasan/Makefile.in                                |   992 ++
 libasan/README.gcc                                 |     4 +
 libasan/aclocal.m4                                 |  9645 ++++++++++
 libasan/asan_allocator.cc                          |  1045 ++
 libasan/asan_allocator.h                           |   177 +
 libasan/asan_flags.h                               |   103 +
 libasan/asan_globals.cc                            |   206 +
 libasan/asan_intercepted_functions.h               |   217 +
 libasan/asan_interceptors.cc                       |   704 +
 libasan/asan_interceptors.h                        |    39 +
 libasan/asan_internal.h                            |   169 +
 libasan/asan_linux.cc                              |   150 +
 libasan/asan_lock.h                                |    40 +
 libasan/asan_mac.cc                                |   526 +
 libasan/asan_mac.h                                 |    54 +
 libasan/asan_malloc_linux.cc                       |   142 +
 libasan/asan_malloc_mac.cc                         |   427 +
 libasan/asan_malloc_win.cc                         |   140 +
 libasan/asan_mapping.h                             |   120 +
 libasan/asan_new_delete.cc                         |    54 +
 libasan/asan_poisoning.cc                          |   151 +
 libasan/asan_posix.cc                              |   118 +
 libasan/asan_report.cc                             |   492 +
 libasan/asan_report.h                              |    51 +
 libasan/asan_rtl.cc                                |   404 +
 libasan/asan_stack.cc                              |    35 +
 libasan/asan_stack.h                               |    52 +
 libasan/asan_stats.cc                              |    86 +
 libasan/asan_stats.h                               |    65 +
 libasan/asan_thread.cc                             |   153 +
 libasan/asan_thread.h                              |   103 +
 libasan/asan_thread_registry.cc                    |   188 +
 libasan/asan_thread_registry.h                     |    83 +
 libasan/asan_win.cc                                |   190 +
 libasan/config.guess                               |  1530 ++
 libasan/config.sub                                 |  1773 ++
 libasan/configure                                  | 17515 +++++++++++++++++++
 libasan/configure.ac                               |    67 +
 libasan/depcomp                                    |   630 +
 libasan/include/sanitizer/asan_interface.h         |   197 +
 libasan/include/sanitizer/common_interface_defs.h  |    66 +
 libasan/install-sh                                 |   527 +
 libasan/interception/interception.h                |   195 +
 libasan/interception/interception_linux.cc         |    28 +
 libasan/interception/interception_linux.h          |    35 +
 libasan/interception/interception_mac.cc           |    29 +
 libasan/interception/interception_mac.h            |    47 +
 libasan/interception/interception_win.cc           |   149 +
 libasan/interception/interception_win.h            |    43 +
 libasan/libtool-version                            |     6 +
 libasan/ltmain.sh                                  |  9661 ++++++++++
 libasan/missing                                    |   376 +
 libasan/sanitizer_common/sanitizer_allocator.cc    |    83 +
 libasan/sanitizer_common/sanitizer_allocator64.h   |   573 +
 libasan/sanitizer_common/sanitizer_atomic.h        |    63 +
 libasan/sanitizer_common/sanitizer_atomic_clang.h  |   120 +
 libasan/sanitizer_common/sanitizer_atomic_msvc.h   |   134 +
 libasan/sanitizer_common/sanitizer_common.cc       |   151 +
 libasan/sanitizer_common/sanitizer_common.h        |   181 +
 libasan/sanitizer_common/sanitizer_flags.cc        |    95 +
 libasan/sanitizer_common/sanitizer_flags.h         |    25 +
 libasan/sanitizer_common/sanitizer_internal_defs.h |   186 +
 libasan/sanitizer_common/sanitizer_libc.cc         |   189 +
 libasan/sanitizer_common/sanitizer_libc.h          |    69 +
 libasan/sanitizer_common/sanitizer_linux.cc        |   296 +
 libasan/sanitizer_common/sanitizer_list.h          |   118 +
 libasan/sanitizer_common/sanitizer_mac.cc          |   249 +
 libasan/sanitizer_common/sanitizer_mutex.h         |   106 +
 libasan/sanitizer_common/sanitizer_placement_new.h |    31 +
 libasan/sanitizer_common/sanitizer_posix.cc        |   187 +
 libasan/sanitizer_common/sanitizer_printf.cc       |   196 +
 libasan/sanitizer_common/sanitizer_procmaps.h      |    95 +
 libasan/sanitizer_common/sanitizer_stackdepot.cc   |   194 +
 libasan/sanitizer_common/sanitizer_stackdepot.h    |    27 +
 libasan/sanitizer_common/sanitizer_stacktrace.cc   |   245 +
 libasan/sanitizer_common/sanitizer_stacktrace.h    |    73 +
 libasan/sanitizer_common/sanitizer_symbolizer.cc   |   311 +
 libasan/sanitizer_common/sanitizer_symbolizer.h    |    97 +
 .../sanitizer_common/sanitizer_symbolizer_linux.cc |   162 +
 .../sanitizer_common/sanitizer_symbolizer_mac.cc   |    31 +
 .../sanitizer_common/sanitizer_symbolizer_win.cc   |    33 +
 libasan/sanitizer_common/sanitizer_win.cc          |   205 +
 106 files changed, 57193 insertions(+), 25 deletions(-)
 create mode 100644 ChangeLog.asan
 create mode 100644 gcc/ChangeLog.asan
 create mode 100644 gcc/asan.c
 create mode 100644 gcc/asan.h
 create mode 100644 libasan/ChangeLog.asan
 create mode 100644 libasan/LICENSE.TXT
 create mode 100644 libasan/Makefile.am
 create mode 100644 libasan/Makefile.in
 create mode 100644 libasan/README.gcc
 create mode 100644 libasan/aclocal.m4
 create mode 100644 libasan/asan_allocator.cc
 create mode 100644 libasan/asan_allocator.h
 create mode 100644 libasan/asan_flags.h
 create mode 100644 libasan/asan_globals.cc
 create mode 100644 libasan/asan_intercepted_functions.h
 create mode 100644 libasan/asan_interceptors.cc
 create mode 100644 libasan/asan_interceptors.h
 create mode 100644 libasan/asan_internal.h
 create mode 100644 libasan/asan_linux.cc
 create mode 100644 libasan/asan_lock.h
 create mode 100644 libasan/asan_mac.cc
 create mode 100644 libasan/asan_mac.h
 create mode 100644 libasan/asan_malloc_linux.cc
 create mode 100644 libasan/asan_malloc_mac.cc
 create mode 100644 libasan/asan_malloc_win.cc
 create mode 100644 libasan/asan_mapping.h
 create mode 100644 libasan/asan_new_delete.cc
 create mode 100644 libasan/asan_poisoning.cc
 create mode 100644 libasan/asan_posix.cc
 create mode 100644 libasan/asan_report.cc
 create mode 100644 libasan/asan_report.h
 create mode 100644 libasan/asan_rtl.cc
 create mode 100644 libasan/asan_stack.cc
 create mode 100644 libasan/asan_stack.h
 create mode 100644 libasan/asan_stats.cc
 create mode 100644 libasan/asan_stats.h
 create mode 100644 libasan/asan_thread.cc
 create mode 100644 libasan/asan_thread.h
 create mode 100644 libasan/asan_thread_registry.cc
 create mode 100644 libasan/asan_thread_registry.h
 create mode 100644 libasan/asan_win.cc
 create mode 100644 libasan/config.guess
 create mode 100644 libasan/config.sub
 create mode 100644 libasan/configure
 create mode 100644 libasan/configure.ac
 create mode 100644 libasan/depcomp
 create mode 100644 libasan/include/sanitizer/asan_interface.h
 create mode 100644 libasan/include/sanitizer/common_interface_defs.h
 create mode 100644 libasan/install-sh
 create mode 100644 libasan/interception/interception.h
 create mode 100644 libasan/interception/interception_linux.cc
 create mode 100644 libasan/interception/interception_linux.h
 create mode 100644 libasan/interception/interception_mac.cc
 create mode 100644 libasan/interception/interception_mac.h
 create mode 100644 libasan/interception/interception_win.cc
 create mode 100644 libasan/interception/interception_win.h
 create mode 100644 libasan/libtool-version
 create mode 100644 libasan/ltmain.sh
 create mode 100644 libasan/missing
 create mode 100644 libasan/sanitizer_common/sanitizer_allocator.cc
 create mode 100644 libasan/sanitizer_common/sanitizer_allocator64.h
 create mode 100644 libasan/sanitizer_common/sanitizer_atomic.h
 create mode 100644 libasan/sanitizer_common/sanitizer_atomic_clang.h
 create mode 100644 libasan/sanitizer_common/sanitizer_atomic_msvc.h
 create mode 100644 libasan/sanitizer_common/sanitizer_common.cc
 create mode 100644 libasan/sanitizer_common/sanitizer_common.h
 create mode 100644 libasan/sanitizer_common/sanitizer_flags.cc
 create mode 100644 libasan/sanitizer_common/sanitizer_flags.h
 create mode 100644 libasan/sanitizer_common/sanitizer_internal_defs.h
 create mode 100644 libasan/sanitizer_common/sanitizer_libc.cc
 create mode 100644 libasan/sanitizer_common/sanitizer_libc.h
 create mode 100644 libasan/sanitizer_common/sanitizer_linux.cc
 create mode 100644 libasan/sanitizer_common/sanitizer_list.h
 create mode 100644 libasan/sanitizer_common/sanitizer_mac.cc
 create mode 100644 libasan/sanitizer_common/sanitizer_mutex.h
 create mode 100644 libasan/sanitizer_common/sanitizer_placement_new.h
 create mode 100644 libasan/sanitizer_common/sanitizer_posix.cc
 create mode 100644 libasan/sanitizer_common/sanitizer_printf.cc
 create mode 100644 libasan/sanitizer_common/sanitizer_procmaps.h
 create mode 100644 libasan/sanitizer_common/sanitizer_stackdepot.cc
 create mode 100644 libasan/sanitizer_common/sanitizer_stackdepot.h
 create mode 100644 libasan/sanitizer_common/sanitizer_stacktrace.cc
 create mode 100644 libasan/sanitizer_common/sanitizer_stacktrace.h
 create mode 100644 libasan/sanitizer_common/sanitizer_symbolizer.cc
 create mode 100644 libasan/sanitizer_common/sanitizer_symbolizer.h
 create mode 100644 libasan/sanitizer_common/sanitizer_symbolizer_linux.cc
 create mode 100644 libasan/sanitizer_common/sanitizer_symbolizer_mac.cc
 create mode 100644 libasan/sanitizer_common/sanitizer_symbolizer_win.cc
 create mode 100644 libasan/sanitizer_common/sanitizer_win.cc



* [PATCH 04/13] Emit GIMPLE directly instead of gimplifying GENERIC.
  2012-11-01 19:53 [PATCH 00/13] Request to merge Address Sanitizer in dodji
                   ` (10 preceding siblings ...)
  2012-11-01 19:53 ` [PATCH 12/13] Instrument built-in memory access function calls dodji
@ 2012-11-01 19:54 ` dodji
  2012-11-02 22:53 ` [PATCH 00/13] Request to merge Address Sanitizer in Dodji Seketeli
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 80+ messages in thread
From: dodji @ 2012-11-01 19:54 UTC (permalink / raw)
  To: gcc-patches; +Cc: dnovillo, jakub, wmi, davidxl, konstantin.s.serebryany

From: jakub <jakub@138bc75d-0d04-0410-961f-82ee72b054a4>

This patch cleans up the instrumentation code generation by emitting
GIMPLE directly, as opposed to emitting GENERIC trees and then
gimplifying them.  It also does some minor cleanups here and there.

	* Makefile.in (GTFILES): Add $(srcdir)/asan.c.
	* asan.c (shadow_ptr_types): New variable.
	(report_error_func): Change is_store argument to bool, don't append
	newline to function name.
	(PROB_VERY_UNLIKELY, PROB_ALWAYS): Define.
	(build_check_stmt): Change is_store argument to bool.  Emit GIMPLE
	directly instead of creating trees and gimplifying them.  Mark
	the error reporting function as very unlikely.
	(instrument_derefs): Change is_store argument to bool.  Use
	int_size_in_bytes to compute size_in_bytes, simplify size check.
	Use build_fold_addr_expr instead of build_addr.
	(transform_statements): Adjust instrument_derefs caller.
	Use gimple_assign_single_p as stmt test.  Don't look at MEM refs
	in rhs2.
	(asan_init_shadow_ptr_types): New function.
	(asan_instrument): Don't push/pop gimplify context.
	Call asan_init_shadow_ptr_types if not yet initialized.
	* asan.h (ASAN_SHADOW_SHIFT): Adjust comment.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192375 138bc75d-0d04-0410-961f-82ee72b054a4
---
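For readers following the check that build_check_stmt now emits as
GIMPLE, here is a minimal stand-alone sketch of the same arithmetic in
plain C.  It is only an illustration, not code from the patch: the
helper names shadow_address and access_is_reported are hypothetical,
the shadow offset is passed as a parameter because the real value comes
from targetm.asan_shadow_offset (), and the shadow value is treated as
unsigned, as the shadow types are in this revision.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same value as in asan.h: one shadow byte describes 8 bytes.  */
#define ASAN_SHADOW_SHIFT 3

/* Hypothetical helper: address of the shadow byte for ADDR.
   SHADOW_OFFSET stands in for targetm.asan_shadow_offset ().  */
static uintptr_t
shadow_address (uintptr_t addr, uintptr_t shadow_offset)
{
  return (addr >> ASAN_SHADOW_SHIFT) + shadow_offset;
}

/* Hypothetical helper: return true if an access of SIZE_IN_BYTES at
   ADDR would branch to the error reporting call, given the SHADOW
   value already loaded for that address.  */
static bool
access_is_reported (uintptr_t addr, int size_in_bytes, unsigned int shadow)
{
  assert (size_in_bytes == 1 || size_in_bytes == 2 || size_in_bytes == 4
	  || size_in_bytes == 8 || size_in_bytes == 16);

  /* Fast path for 8 and 16 byte accesses: any non-zero shadow.  */
  if (size_in_bytes >= 8)
    return shadow != 0;

  /* Slow path for 1, 2 and 4 byte accesses:
     (shadow != 0) & ((base_addr & 7) + (size_in_bytes - 1)) >= shadow).  */
  unsigned int last = (unsigned int) (addr & 7) + (size_in_bytes - 1);
  return (shadow != 0) & (last >= shadow);
}
```

With a shadow value of 5 (only the first 5 bytes of the 8-byte granule
are addressable), a 1-byte access at offset 4 within the granule passes,
while a 1-byte access at offset 5 or a 4-byte access at offset 2 is
reported.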
 gcc/ChangeLog.asan |  19 ++++
 gcc/Makefile.in    |   1 +
 gcc/asan.c         | 268 ++++++++++++++++++++++++++++++++---------------------
 gcc/asan.h         |   2 +-
 4 files changed, 185 insertions(+), 105 deletions(-)

diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan
index 0bc9420..9bfccd7 100644
--- a/gcc/ChangeLog.asan
+++ b/gcc/ChangeLog.asan
@@ -1,5 +1,24 @@
 2012-10-11  Jakub Jelinek  <jakub@redhat.com>
 
+	* Makefile.in (GTFILES): Add $(srcdir)/asan.c.
+	* asan.c (shadow_ptr_types): New variable.
+	(report_error_func): Change is_store argument to bool, don't append
+	newline to function name.
+	(PROB_VERY_UNLIKELY, PROB_ALWAYS): Define.
+	(build_check_stmt): Change is_store argument to bool.  Emit GIMPLE
+	directly instead of creating trees and gimplifying them.  Mark
+	the error reporting function as very unlikely.
+	(instrument_derefs): Change is_store argument to bool.  Use
+	int_size_in_bytes to compute size_in_bytes, simplify size check.
+	Use build_fold_addr_expr instead of build_addr.
+	(transform_statements): Adjust instrument_derefs caller.
+	Use gimple_assign_single_p as stmt test.  Don't look at MEM refs
+	in rhs2.
+	(asan_init_shadow_ptr_types): New function.
+	(asan_instrument): Don't push/pop gimplify context.
+	Call asan_init_shadow_ptr_types if not yet initialized.
+	* asan.h (ASAN_SHADOW_SHIFT): Adjust comment.
+
 	* toplev.c (process_options): Warn and turn off -fasan
 	if not supported by target.
 	* asan.c: Include target.h.
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index bdc5afb..2ab1ca9 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -3726,6 +3726,7 @@ GTFILES = $(CPP_ID_DATA_H) $(srcdir)/input.h $(srcdir)/coretypes.h \
   $(srcdir)/lto-streamer.h \
   $(srcdir)/target-globals.h \
   $(srcdir)/ipa-inline.h \
+  $(srcdir)/asan.c \
   @all_gtfiles@
 
 # Compute the list of GT header files from the corresponding C sources,
diff --git a/gcc/asan.c b/gcc/asan.c
index e95be47..2e7d4d6 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -79,18 +79,22 @@ along with GCC; see the file COPYING3.  If not see
  to create redzones for stack and global object and poison them.
 */
 
+/* Pointer types to 1 resp. 2 byte integers in shadow memory.  A separate
+   alias set is used for all shadow memory accesses.  */
+static GTY(()) tree shadow_ptr_types[2];
+
 /* Construct a function tree for __asan_report_{load,store}{1,2,4,8,16}.
    IS_STORE is either 1 (for a store) or 0 (for a load).
    SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
 
 static tree
-report_error_func (int is_store, int size_in_bytes)
+report_error_func (bool is_store, int size_in_bytes)
 {
   tree fn_type;
   tree def;
   char name[100];
 
-  sprintf (name, "__asan_report_%s%d\n",
+  sprintf (name, "__asan_report_%s%d",
            is_store ? "store" : "load", size_in_bytes);
   fn_type = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE);
   def = build_fn_decl (name, fn_type);
@@ -118,6 +122,9 @@ asan_init_func (void)
 }
 
 
+#define PROB_VERY_UNLIKELY	(REG_BR_PROB_BASE / 2000 - 1)
+#define PROB_ALWAYS		(REG_BR_PROB_BASE)
+
 /* Instrument the memory access instruction BASE.
    Insert new statements before ITER.
    LOCATION is source code location.
@@ -127,21 +134,17 @@ asan_init_func (void)
 static void
 build_check_stmt (tree base,
                   gimple_stmt_iterator *iter,
-                  location_t location, int is_store, int size_in_bytes)
+                  location_t location, bool is_store, int size_in_bytes)
 {
   gimple_stmt_iterator gsi;
   basic_block cond_bb, then_bb, join_bb;
   edge e;
-  tree cond, t, u;
-  tree base_addr;
-  tree shadow_value;
+  tree t, base_addr, shadow;
   gimple g;
-  gimple_seq seq, stmts;
-  tree shadow_type = size_in_bytes == 16 ?
-      short_integer_type_node : char_type_node;
-  tree shadow_ptr_type = build_pointer_type (shadow_type);
-  tree uintptr_type = lang_hooks.types.type_for_mode (ptr_mode,
-                                                      /*unsignedp=*/true);
+  tree shadow_ptr_type = shadow_ptr_types[size_in_bytes == 16 ? 1 : 0];
+  tree shadow_type = TREE_TYPE (shadow_ptr_type);
+  tree uintptr_type
+    = build_nonstandard_integer_type (TYPE_PRECISION (TREE_TYPE (base)), 1);
 
   /* We first need to split the current basic block, and start altering
      the CFG.  This allows us to insert the statements we're about to
@@ -166,14 +169,15 @@ build_check_stmt (tree base,
 
   /* Create the bb that contains the crash block.  */
   then_bb = create_empty_bb (cond_bb);
-  make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE);
+  e = make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE);
+  e->probability = PROB_VERY_UNLIKELY;
   make_single_succ_edge (then_bb, join_bb, EDGE_FALLTHRU);
 
   /* Mark the pseudo-fallthrough edge from cond_bb to join_bb.  */
   e = find_edge (cond_bb, join_bb);
   e->flags = EDGE_FALSE_VALUE;
   e->count = cond_bb->count;
-  e->probability = REG_BR_PROB_BASE;
+  e->probability = PROB_ALWAYS - PROB_VERY_UNLIKELY;
 
   /* Update dominance info.  Note that bb_join's data was
      updated by split_block.  */
@@ -183,75 +187,123 @@ build_check_stmt (tree base,
       set_immediate_dominator (CDI_DOMINATORS, join_bb, cond_bb);
     }
 
-  base_addr = create_tmp_reg (uintptr_type, "__asan_base_addr");
+  gsi = gsi_last_bb (cond_bb);
+  g = gimple_build_assign_with_ops (TREE_CODE (base),
+				    make_ssa_name (TREE_TYPE (base), NULL),
+				    base, NULL_TREE);
+  gimple_set_location (g, location);
+  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
 
-  seq = NULL; 
-  t = fold_convert_loc (location, uintptr_type,
-                        unshare_expr (base));
-  t = force_gimple_operand (t, &stmts, false, NULL_TREE);
-  gimple_seq_add_seq (&seq, stmts);
-  g = gimple_build_assign (base_addr, t);
+  g = gimple_build_assign_with_ops (NOP_EXPR,
+				    make_ssa_name (uintptr_type, NULL),
+				    gimple_assign_lhs (g), NULL_TREE);
   gimple_set_location (g, location);
-  gimple_seq_add_stmt (&seq, g);
+  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+  base_addr = gimple_assign_lhs (g);
 
   /* Build
-     (base_addr >> ASAN_SHADOW_SHIFT) | targetm.asan_shadow_offset ().  */
-
-  t = build2 (RSHIFT_EXPR, uintptr_type, base_addr,
-	      build_int_cst (uintptr_type, ASAN_SHADOW_SHIFT));
-  t = build2 (PLUS_EXPR, uintptr_type, t,
-	      build_int_cst (uintptr_type, targetm.asan_shadow_offset ()));
-  t = build1 (INDIRECT_REF, shadow_type,
-              build1 (VIEW_CONVERT_EXPR, shadow_ptr_type, t));
-  t = force_gimple_operand (t, &stmts, false, NULL_TREE);
-  gimple_seq_add_seq (&seq, stmts);
-  shadow_value = create_tmp_reg (shadow_type, "__asan_shadow");
-  g = gimple_build_assign (shadow_value, t);
-  gimple_set_location (g, location);
-  gimple_seq_add_stmt (&seq, g);
-  t = build2 (NE_EXPR, boolean_type_node, shadow_value,
-              build_int_cst (shadow_type, 0));
-  if (size_in_bytes < 8)
-    {
+     (base_addr >> ASAN_SHADOW_SHIFT) + targetm.asan_shadow_offset ().  */
 
-      /* Slow path for 1-, 2- and 4- byte accesses.
-         Build ((base_addr & 7) + (size_in_bytes - 1)) >= shadow_value.  */
+  t = build_int_cst (uintptr_type, ASAN_SHADOW_SHIFT);
+  g = gimple_build_assign_with_ops (RSHIFT_EXPR,
+				    make_ssa_name (uintptr_type, NULL),
+				    base_addr, t);
+  gimple_set_location (g, location);
+  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
 
-      u = build2 (BIT_AND_EXPR, uintptr_type,
-                  base_addr,
-                  build_int_cst (uintptr_type, 7));
-      u = build1 (CONVERT_EXPR, shadow_type, u);
-      u = build2 (PLUS_EXPR, shadow_type, u,
-                  build_int_cst (shadow_type, size_in_bytes - 1));
-      u = build2 (GE_EXPR, uintptr_type, u, shadow_value);
-    }
-  else
-      u = build_int_cst (boolean_type_node, 1);
-  t = build2 (TRUTH_AND_EXPR, boolean_type_node, t, u);
-  t = force_gimple_operand (t, &stmts, false, NULL_TREE);
-  gimple_seq_add_seq (&seq, stmts);
-  cond = create_tmp_reg (boolean_type_node, "__asan_crash_cond");
-  g = gimple_build_assign  (cond, t);
+  t = build_int_cst (uintptr_type, targetm.asan_shadow_offset ());
+  g = gimple_build_assign_with_ops (PLUS_EXPR,
+				    make_ssa_name (uintptr_type, NULL),
+				    gimple_assign_lhs (g), t);
   gimple_set_location (g, location);
-  gimple_seq_add_stmt (&seq, g);
-  g = gimple_build_cond (NE_EXPR, cond, boolean_false_node, NULL_TREE,
-                         NULL_TREE);
+  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+
+  g = gimple_build_assign_with_ops (NOP_EXPR,
+				    make_ssa_name (shadow_ptr_type, NULL),
+				    gimple_assign_lhs (g), NULL_TREE);
   gimple_set_location (g, location);
-  gimple_seq_add_stmt (&seq, g);
+  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
 
-  /* Generate call to the run-time library (e.g. __asan_report_load8).  */
+  t = build2 (MEM_REF, shadow_type, gimple_assign_lhs (g),
+	      build_int_cst (shadow_ptr_type, 0));
+  g = gimple_build_assign_with_ops (MEM_REF,
+				    make_ssa_name (shadow_type, NULL),
+				    t, NULL_TREE);
+  gimple_set_location (g, location);
+  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+  shadow = gimple_assign_lhs (g);
 
-  gsi = gsi_last_bb (cond_bb);
-  gsi_insert_seq_after (&gsi, seq, GSI_CONTINUE_LINKING);
-  seq = NULL; 
-  g = gimple_build_call (report_error_func (is_store, size_in_bytes),
-                         1, base_addr);
-  gimple_seq_add_stmt (&seq, g);
+  if (size_in_bytes < 8)
+    {
+      /* Slow path for 1, 2 and 4 byte accesses.
+	 Test (shadow != 0)
+	      & ((base_addr & 7) + (size_in_bytes - 1)) >= shadow).  */
+      g = gimple_build_assign_with_ops (NE_EXPR,
+					make_ssa_name (boolean_type_node,
+						       NULL),
+					shadow,
+					build_int_cst (shadow_type, 0));
+      gimple_set_location (g, location);
+      gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+      t = gimple_assign_lhs (g);
+
+      g = gimple_build_assign_with_ops (BIT_AND_EXPR,
+					make_ssa_name (uintptr_type,
+						       NULL),
+					base_addr,
+					build_int_cst (uintptr_type, 7));
+      gimple_set_location (g, location);
+      gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+
+      g = gimple_build_assign_with_ops (NOP_EXPR,
+					make_ssa_name (shadow_type,
+						       NULL),
+					gimple_assign_lhs (g), NULL_TREE);
+      gimple_set_location (g, location);
+      gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+
+      if (size_in_bytes > 1)
+	{
+	  g = gimple_build_assign_with_ops (PLUS_EXPR,
+					    make_ssa_name (shadow_type,
+							   NULL),
+					    gimple_assign_lhs (g),
+					    build_int_cst (shadow_type,
+							   size_in_bytes - 1));
+	  gimple_set_location (g, location);
+	  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+	}
+
+      g = gimple_build_assign_with_ops (GE_EXPR,
+					make_ssa_name (boolean_type_node,
+						       NULL),
+					gimple_assign_lhs (g),
+					shadow);
+      gimple_set_location (g, location);
+      gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+
+      g = gimple_build_assign_with_ops (BIT_AND_EXPR,
+					make_ssa_name (boolean_type_node,
+						       NULL),
+					t, gimple_assign_lhs (g));
+      gimple_set_location (g, location);
+      gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+      t = gimple_assign_lhs (g);
+    }
+  else
+    t = shadow;
 
-  /* Insert the check code in the THEN block.  */
+  g = gimple_build_cond (NE_EXPR, t, build_int_cst (TREE_TYPE (t), 0),
+			 NULL_TREE, NULL_TREE);
+  gimple_set_location (g, location);
+  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
 
+  /* Generate call to the run-time library (e.g. __asan_report_load8).  */
   gsi = gsi_start_bb (then_bb);
-  gsi_insert_seq_after (&gsi, seq, GSI_CONTINUE_LINKING);
+  g = gimple_build_call (report_error_func (is_store, size_in_bytes),
+			 1, base_addr);
+  gimple_set_location (g, location);
+  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
 
   *iter = gsi_start_bb (join_bb);
 }
@@ -262,14 +314,12 @@ build_check_stmt (tree base,
 
 static void
 instrument_derefs (gimple_stmt_iterator *iter, tree t,
-                  location_t location, int is_store)
+                  location_t location, bool is_store)
 {
   tree type, base;
-  int size_in_bytes;
+  HOST_WIDE_INT size_in_bytes;
 
   type = TREE_TYPE (t);
-  if (type == error_mark_node)
-    return;
   switch (TREE_CODE (t))
     {
     case ARRAY_REF:
@@ -280,25 +330,25 @@ instrument_derefs (gimple_stmt_iterator *iter, tree t,
     default:
       return;
     }
-  size_in_bytes = tree_low_cst (TYPE_SIZE (type), 0) / BITS_PER_UNIT;
-  if (size_in_bytes != 1 && size_in_bytes != 2 &&
-      size_in_bytes != 4 && size_in_bytes != 8 && size_in_bytes != 16)
-      return;
-  {
-    /* For now just avoid instrumenting bit field acceses.
+
+  size_in_bytes = int_size_in_bytes (type);
+  if ((size_in_bytes & (size_in_bytes - 1)) != 0
+      || (unsigned HOST_WIDE_INT) size_in_bytes - 1 >= 16)
+    return;
+
+  /* For now just avoid instrumenting bit field acceses.
      Fixing it is doable, but expected to be messy.  */
 
-    HOST_WIDE_INT bitsize, bitpos;
-    tree offset;
-    enum machine_mode mode;
-    int volatilep = 0, unsignedp = 0;
-    get_inner_reference (t, &bitsize, &bitpos, &offset,
-                         &mode, &unsignedp, &volatilep, false);
-    if (bitpos != 0 || bitsize != size_in_bytes * BITS_PER_UNIT)
-        return;
-  }
-
-  base = build_addr (t, current_function_decl);
+  HOST_WIDE_INT bitsize, bitpos;
+  tree offset;
+  enum machine_mode mode;
+  int volatilep = 0, unsignedp = 0;
+  get_inner_reference (t, &bitsize, &bitpos, &offset,
+		       &mode, &unsignedp, &volatilep, false);
+  if (bitpos != 0 || bitsize != size_in_bytes * BITS_PER_UNIT)
+    return;
+
+  base = build_fold_addr_expr (t);
   build_check_stmt (base, iter, location, is_store, size_in_bytes);
 }
 
@@ -314,7 +364,6 @@ transform_statements (void)
   basic_block bb;
   gimple_stmt_iterator i;
   int saved_last_basic_block = last_basic_block;
-  enum gimple_rhs_class grhs_class;
 
   FOR_EACH_BB (bb)
     {
@@ -322,16 +371,12 @@ transform_statements (void)
       for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
         {
           gimple s = gsi_stmt (i);
-          if (gimple_code (s) != GIMPLE_ASSIGN)
-              continue;
+          if (!gimple_assign_single_p (s))
+	    continue;
           instrument_derefs (&i, gimple_assign_lhs (s),
-                             gimple_location (s), 1);
+                             gimple_location (s), true);
           instrument_derefs (&i, gimple_assign_rhs1 (s),
-                             gimple_location (s), 0);
-          grhs_class = get_gimple_rhs_class (gimple_assign_rhs_code (s));
-          if (grhs_class == GIMPLE_BINARY_RHS)
-            instrument_derefs (&i, gimple_assign_rhs2 (s),
-                               gimple_location (s), 0);
+                             gimple_location (s), false);
         }
     }
 }
@@ -351,15 +396,28 @@ asan_finish_file (void)
                              MAX_RESERVED_INIT_PRIORITY - 1);
 }
 
+/* Initialize shadow_ptr_types array.  */
+
+static void
+asan_init_shadow_ptr_types (void)
+{
+  alias_set_type set = new_alias_set ();
+  shadow_ptr_types[0] = build_distinct_type_copy (unsigned_char_type_node);
+  TYPE_ALIAS_SET (shadow_ptr_types[0]) = set;
+  shadow_ptr_types[0] = build_pointer_type (shadow_ptr_types[0]);
+  shadow_ptr_types[1] = build_distinct_type_copy (short_unsigned_type_node);
+  TYPE_ALIAS_SET (shadow_ptr_types[1]) = set;
+  shadow_ptr_types[1] = build_pointer_type (shadow_ptr_types[1]);
+}
+
 /* Instrument the current function.  */
 
 static unsigned int
 asan_instrument (void)
 {
-  struct gimplify_ctx gctx;
-  push_gimplify_context (&gctx);
+  if (shadow_ptr_types[0] == NULL_TREE)
+    asan_init_shadow_ptr_types ();
   transform_statements ();
-  pop_gimplify_context (NULL);
   return 0;
 }
 
@@ -385,6 +443,8 @@ struct gimple_opt_pass pass_asan =
   0,                                    /* properties_destroyed  */
   0,                                    /* todo_flags_start  */
   TODO_verify_flow | TODO_verify_stmts
-  | TODO_update_ssa    /* todo_flags_finish  */
+  | TODO_update_ssa			/* todo_flags_finish  */
  }
 };
+
+#include "gt-asan.h"
diff --git a/gcc/asan.h b/gcc/asan.h
index 699820b..0d9ab8b 100644
--- a/gcc/asan.h
+++ b/gcc/asan.h
@@ -24,7 +24,7 @@ along with GCC; see the file COPYING3.  If not see
 extern void asan_finish_file(void);
 
 /* Shadow memory is found at
-   (address >> ASAN_SHADOW_SHIFT) | targetm.asan_shadow_offset ().  */
+   (address >> ASAN_SHADOW_SHIFT) + targetm.asan_shadow_offset ().  */
 #define ASAN_SHADOW_SHIFT	3
 
 #endif /* TREE_ASAN */
-- 
1.7.11.7


* Re: [PATCH 02/13] Rename tree-asan.[ch] to asan.[ch]
  2012-11-01 19:53 ` [PATCH 02/13] Rename tree-asan.[ch] to asan.[ch] dodji
@ 2012-11-01 21:54   ` Joseph S. Myers
  2012-11-02 22:44     ` Dodji Seketeli
  0 siblings, 1 reply; 80+ messages in thread
From: Joseph S. Myers @ 2012-11-01 21:54 UTC (permalink / raw)
  To: dodji; +Cc: gcc-patches, dnovillo, jakub, wmi, davidxl, konstantin.s.serebryany

On Thu, 1 Nov 2012, dodji@redhat.com wrote:

> From: dnovillo <dnovillo@138bc75d-0d04-0410-961f-82ee72b054a4>
> 
> Following a discussion we had on this list, this patch renames the
> file tree-asan.* into asan.*.
> 
>     	* asan.c: Rename from tree-asan.c.
>     	Update all users.
>     	* asan.h: Rename from tree-asan.h
>     	Update all users.

Patch series submissions for mainline should be cleanly rebased, with each 
patch as a logical part of the intended eventual changes; they should not 
be a dump of the successive stages by which the patch was developed.

It's reasonable to have an initial patch that adds the skeleton of a 
feature, then subsequent patches that add well-defined additional features 
to it.  The following are examples of patch series structures that are not 
appropriate:

* This sort of adding a file under one name in one patch, then renaming in 
a later patch of the series.

* Introducing a known bug in one patch in the series, where a subsequent 
patch in the series is the fix, unless the fix really depends on 
intermediate patches in the series.

-- 
Joseph S. Myers
joseph@codesourcery.com


* Re: [PATCH 06/13] Implement protection of stack variables
       [not found]   ` <CAGQ9bdweH8Pn=8vLTNa8FSzAh92OYrWScxK78n9znCodADJUvw@mail.gmail.com>
@ 2012-11-02  4:35     ` Xinliang David Li
  2012-11-02 15:25       ` Dodji Seketeli
  2012-11-02 14:44     ` Dodji Seketeli
  1 sibling, 1 reply; 80+ messages in thread
From: Xinliang David Li @ 2012-11-02  4:35 UTC (permalink / raw)
  To: Konstantin Serebryany
  Cc: Dodji Seketeli, GCC Patches, Diego Novillo, Jakub Jelinek, Wei Mi

Changing the option is part of the plan.

Dodji, can you make the option change part of one of the patches (e.g.,
the first one that introduces it) -- there seems to be no need for a
separate patch for it.

thanks,

David

On Thu, Nov 1, 2012 at 9:12 PM, Konstantin Serebryany
<konstantin.s.serebryany@gmail.com> wrote:
>
>
> On Thu, Nov 1, 2012 at 11:52 PM, <dodji@redhat.com> wrote:
>>
>> From: jakub <jakub@138bc75d-0d04-0410-961f-82ee72b054a4>
>>
>> This patch implements the protection of stack variables.
>>
>> To understand how this works, let's look at this example on x86_64
>> where the stack grows downward:
>>
>>  int
>>  foo ()
>>  {
>>    char a[23] = {0};
>>    int b[2] = {0};
>>
>>    a[5] = 1;
>>    b[1] = 2;
>>
>>    return a[5] + b[1];
>>  }
>>
>> For this function, the stack protected by asan will be organized as
>> follows, from the top of the stack to the bottom:
>>
>> Slot 1/ [red zone of 32 bytes called 'RIGHT RedZone']
>>
>> Slot 2/ [24 bytes for variable 'a']
>>
>> Slot 3/ [8 bytes of red zone, that adds up to the space of 'a' to make
>>          the next slot be 32 bytes aligned; this one is called Partial
>>          Redzone; this 32 bytes alignment is an asan constraint]
>>
>> Slot 4/ [red zone of 32 bytes called 'Middle RedZone']
>>
>> Slot 5/ [8 bytes for variable 'b']
>>
>> Slot 6/ [24 bytes of Partial Red Zone (similar to slot 3)]
>>
>> Slot 7/ [32 bytes of Red Zone at the bottom of the stack, called 'LEFT
>>          RedZone']
>>
>> [A cultural question I've kept asking myself is why the address
>>  sanitizer authors called these red zones (LEFT, MIDDLE, RIGHT)
>>  instead of e.g., (BOTTOM, MIDDLE, TOP).  Maybe they can step up and
>>  educate me so that I get less confused in the future.  :-)]
>
>
> Ha! Good question. I guess that's related to the way we explained it in the
> paper,
> where the chunk of memory was typeset horizontally to save space.
>
> Btw, are we still using -fasan option, or did we change it to
> -faddress-sanitizer?
>
> --kcc
>
>
>>
>>
>> The 32 bytes of LEFT red zone at the bottom of the stack can be
>> decomposed as such:
>>
>>     1/ The first 8 bytes contain a magical asan number that is always
>>     0x41B58AB3.
>>
>>     2/ The following 8 bytes contain a pointer to a string (to be
>>     parsed at runtime by the runtime asan library), whose format is
>>     the following:
>>
>>      "<function-name> <space> <num-of-variables-on-the-stack>
>>      (<32-bytes-aligned-offset-in-bytes-of-variable> <space>
>>      <length-of-var-in-bytes> ){n} "
>>
>>         where '(...){n}' means the content inside the parentheses
>>         occurs 'n' times, with 'n' being the number of variables on
>>         the stack.
>>
>>      3/ The following 16 bytes of the red zone have no particular
>>      format.
>>
>> The shadow memory for that stack layout is going to look like this:
>>
>>     - content of shadow memory 8 bytes for slot 7: 0xFFFFFFFFF1F1F1F1.
>>       The F1 byte pattern is a magic number called
>>       ASAN_STACK_MAGIC_LEFT and is a way for the runtime to know that
>>       the memory for that shadow byte is part of the LEFT red zone
>>       intended to sit at the bottom of the variables on the stack.
>>
>>     - content of shadow memory 8 bytes for slots 6 and 5:
>>       0xFFFFFFFFF4F4F400.  The F4 byte pattern is a magic number
>>       called ASAN_STACK_MAGIC_PARTIAL.  It flags the fact that the
>>       memory region for this shadow byte is a PARTIAL red zone
>>       intended to pad a variable A, so that the slot following
>>       {A,padding} is 32 bytes aligned.
>>
>>       Note that the fact that the least significant byte of this
>>       shadow memory content is 00 means that 8 bytes of its
>>       corresponding memory (which corresponds to the memory of
>>       variable 'b') is addressable.
>>
>>     - content of shadow memory 8 bytes for slot 4: 0xFFFFFFFFF2F2F2F2.
>>       The F2 byte pattern is a magic number called
>>       ASAN_STACK_MAGIC_MIDDLE.  It flags the fact that the memory
>>       region for this shadow byte is a MIDDLE red zone intended to
>>       sit between two 32-byte aligned slots of {variable,padding}.
>>
>>     - content of shadow memory 8 bytes for slot 3 and 2:
>>       0xFFFFFFFFF4000000.  This represents the concatenation of
>>       variable 'a' and the partial red zone following it, like what we
>>       had for variable 'b'.  The least significant 3 bytes being 00
>>       means that the 24 bytes reserved for variable 'a' are addressable.
>>
>>     - content of shadow memory 8 bytes for slot 1: 0xFFFFFFFFF3F3F3F3.
>>       The F3 byte pattern is a magic number called
>>       ASAN_STACK_MAGIC_RIGHT.  It flags the fact that the memory
>>       region for this shadow byte is a RIGHT red zone intended to sit
>>       at the top of the variables of the stack.
>>
>> So, the patch lays out stack variables as well as the different red
>> zones, emits some prologue code to populate the shadow memory so as to
>> poison (mark as non-accessible) the regions of the red zones and mark
>> the regions of stack variables as accessible, and emits some epilogue
>> code to un-poison (mark as accessible) the regions of red zones right
>> before the function exits.
>>
>>         * Makefile.in (asan.o): Depend on $(EXPR_H) $(OPTABS_H).
>>         (cfgexpand.o): Depend on asan.h.
>>         * asan.c: Include expr.h and optabs.h.
>>         (asan_shadow_set): New variable.
>>         (asan_shadow_cst, asan_emit_stack_protection): New functions.
>>         (asan_init_shadow_ptr_types): Initialize also asan_shadow_set.
>>         * cfgexpand.c: Include asan.h.  Define HOST_WIDE_INT heap vector.
>>         (partition_stack_vars): If i is large alignment and j small
>>         alignment or vice versa, break out of the loop instead of
>> continue,
>>         and put the test earlier.  If flag_asan, break out of the loop
>>         if for small alignment size is different.
>>         (struct stack_vars_data): New type.
>>         (expand_stack_vars): Add DATA argument.  Change PRED type to
>>         function taking size_t argument instead of tree.  Adjust pred
>> calls.
>>         Fill DATA in and add needed padding in between variables if
>> -fasan.
>>         (defer_stack_allocation): Defer everything for flag_asan.
>>         (stack_protect_decl_phase_1, stack_protect_decl_phase_2): Take
>>         size_t index into stack_vars array instead of the decl directly.
>>         (asan_decl_phase_3): New function.
>>         (expand_used_vars): Return var destruction sequence.  Adjust
>>         expand_stack_vars calls, add another one for flag_asan.  Call
>>         asan_emit_stack_protection if expand_stack_vars added anything
>>         to the vectors.
>>         (expand_gimple_basic_block): Add disable_tail_calls argument.
>>         (gimple_expand_cfg): Pass true to it if expand_used_vars returned
>>         non-NULL.  Emit the sequence returned by expand_used_vars after
>>         return_label.
>>         * asan.h (asan_emit_stack_protection): New prototype.
>>         (asan_shadow_set): New decl.
>>         (ASAN_RED_ZONE_SIZE, ASAN_STACK_MAGIC_LEFT,
>> ASAN_STACK_MAGIC_MIDDLE,
>>         ASAN_STACK_MAGIC_RIGHT, ASAN_STACK_FRAME_MAGIC): Define.
>>         (asan_protect_stack_decl): New inline.
>>         * toplev.c (process_options): Also disable -fasan on
>>         !FRAME_GROWS_DOWNWARDS targets.
>>
>> git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192540
>> 138bc75d-0d04-0410-961f-82ee72b054a4
>> ---
>>  gcc/ChangeLog.asan |  37 ++++++++++
>>  gcc/Makefile.in    |   4 +-
>>  gcc/asan.c         | 193
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>  gcc/asan.h         |  31 ++++++++-
>>  gcc/cfgexpand.c    | 159 +++++++++++++++++++++++++++++++++++++------
>>  gcc/toplev.c       |   4 +-
>>  6 files changed, 400 insertions(+), 28 deletions(-)
>>
>> diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan
>> index 505bce9..23454f3 100644
>> --- a/gcc/ChangeLog.asan
>> +++ b/gcc/ChangeLog.asan
>> @@ -1,3 +1,40 @@
>> +2012-10-17  Jakub Jelinek  <jakub@redhat.com>
>> +
>> +       * Makefile.in (asan.o): Depend on $(EXPR_H) $(OPTABS_H).
>> +       (cfgexpand.o): Depend on asan.h.
>> +       * asan.c: Include expr.h and optabs.h.
>> +       (asan_shadow_set): New variable.
>> +       (asan_shadow_cst, asan_emit_stack_protection): New functions.
>> +       (asan_init_shadow_ptr_types): Initialize also asan_shadow_set.
>> +       * cfgexpand.c: Include asan.h.  Define HOST_WIDE_INT heap vector.
>> +       (partition_stack_vars): If i is large alignment and j small
>> +       alignment or vice versa, break out of the loop instead of
>> continue,
>> +       and put the test earlier.  If flag_asan, break out of the loop
>> +       if for small alignment size is different.
>> +       (struct stack_vars_data): New type.
>> +       (expand_stack_vars): Add DATA argument.  Change PRED type to
>> +       function taking size_t argument instead of tree.  Adjust pred
>> calls.
>> +       Fill DATA in and add needed padding in between variables if
>> -fasan.
>> +       (defer_stack_allocation): Defer everything for flag_asan.
>> +       (stack_protect_decl_phase_1, stack_protect_decl_phase_2): Take
>> +       size_t index into stack_vars array instead of the decl directly.
>> +       (asan_decl_phase_3): New function.
>> +       (expand_used_vars): Return var destruction sequence.  Adjust
>> +       expand_stack_vars calls, add another one for flag_asan.  Call
>> +       asan_emit_stack_protection if expand_stack_vars added anything
>> +       to the vectors.
>> +       (expand_gimple_basic_block): Add disable_tail_calls argument.
>> +       (gimple_expand_cfg): Pass true to it if expand_used_vars returned
>> +       non-NULL.  Emit the sequence returned by expand_used_vars after
>> +       return_label.
>> +       * asan.h (asan_emit_stack_protection): New prototype.
>> +       (asan_shadow_set): New decl.
>> +       (ASAN_RED_ZONE_SIZE, ASAN_STACK_MAGIC_LEFT,
>> ASAN_STACK_MAGIC_MIDDLE,
>> +       ASAN_STACK_MAGIC_RIGHT, ASAN_STACK_FRAME_MAGIC): Define.
>> +       (asan_protect_stack_decl): New inline.
>> +       * toplev.c (process_options): Also disable -fasan on
>> +       !FRAME_GROWS_DOWNWARDS targets.
>> +
>>  2012-10-12  Jakub Jelinek  <jakub@redhat.com>
>>
>>         * asan.c (build_check_stmt): Rename join_bb variable to else_bb.
>> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
>> index 2ab1ca9..2743e24 100644
>> --- a/gcc/Makefile.in
>> +++ b/gcc/Makefile.in
>> @@ -2213,7 +2213,7 @@ stor-layout.o : stor-layout.c $(CONFIG_H)
>> $(SYSTEM_H) coretypes.h $(TM_H) \
>>  asan.o : asan.c asan.h $(CONFIG_H) pointer-set.h \
>>     $(SYSTEM_H) $(TREE_H) $(GIMPLE_H) \
>>     output.h $(DIAGNOSTIC_H) coretypes.h $(TREE_DUMP_H) $(FLAGS_H) \
>> -   tree-pretty-print.h $(TARGET_H)
>> +   tree-pretty-print.h $(TARGET_H) $(EXPR_H) $(OPTABS_H)
>>  tree-ssa-tail-merge.o: tree-ssa-tail-merge.c \
>>     $(SYSTEM_H) $(CONFIG_H) coretypes.h $(TM_H) $(BITMAP_H) \
>>     $(FLAGS_H) $(TM_P_H) $(BASIC_BLOCK_H) \
>> @@ -3083,7 +3083,7 @@ cfgexpand.o : cfgexpand.c $(TREE_FLOW_H) $(CONFIG_H)
>> $(SYSTEM_H) \
>>     $(DIAGNOSTIC_H) toplev.h $(DIAGNOSTIC_CORE_H) $(BASIC_BLOCK_H)
>> $(FLAGS_H) debug.h $(PARAMS_H) \
>>     value-prof.h $(TREE_INLINE_H) $(TARGET_H) $(SSAEXPAND_H) $(REGS_H) \
>>     $(GIMPLE_PRETTY_PRINT_H) $(BITMAP_H) sbitmap.h \
>> -   $(INSN_ATTR_H) $(CFGLOOP_H)
>> +   $(INSN_ATTR_H) $(CFGLOOP_H) asan.h
>>  cfgrtl.o : cfgrtl.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H)
>> $(RTL_ERROR_H) \
>>     $(FLAGS_H) insn-config.h $(BASIC_BLOCK_H) $(REGS_H) hard-reg-set.h \
>>     $(FUNCTION_H) $(EXCEPT_H) $(TM_P_H) $(INSN_ATTR_H) \
>> diff --git a/gcc/asan.c b/gcc/asan.c
>> index 66dc571..fe0e9a8 100644
>> --- a/gcc/asan.c
>> +++ b/gcc/asan.c
>> @@ -43,6 +43,8 @@ along with GCC; see the file COPYING3.  If not see
>>  #include "asan.h"
>>  #include "gimple-pretty-print.h"
>>  #include "target.h"
>> +#include "expr.h"
>> +#include "optabs.h"
>>
>>  /*
>>   AddressSanitizer finds out-of-bounds and use-after-free bugs
>> @@ -79,10 +81,195 @@ along with GCC; see the file COPYING3.  If not see
>>   to create redzones for stack and global object and poison them.
>>  */
>>
>> +alias_set_type asan_shadow_set = -1;
>> +
>>  /* Pointer types to 1 resp. 2 byte integers in shadow memory.  A separate
>>     alias set is used for all shadow memory accesses.  */
>>  static GTY(()) tree shadow_ptr_types[2];
>>
>> +/* Return a CONST_INT representing 4 subsequent shadow memory bytes.  */
>> +
>> +static rtx
>> +asan_shadow_cst (unsigned char shadow_bytes[4])
>> +{
>> +  int i;
>> +  unsigned HOST_WIDE_INT val = 0;
>> +  gcc_assert (WORDS_BIG_ENDIAN == BYTES_BIG_ENDIAN);
>> +  for (i = 0; i < 4; i++)
>> +    val |= (unsigned HOST_WIDE_INT) shadow_bytes[BYTES_BIG_ENDIAN ? 3 - i
>> : i]
>> +          << (BITS_PER_UNIT * i);
>> +  return GEN_INT (trunc_int_for_mode (val, SImode));
>> +}
>> +
>> +/* Insert code to protect stack vars.  The prologue sequence should be
>> emitted
>> +   directly, epilogue sequence returned.  BASE is the register holding
>> the
>> +   stack base, against which OFFSETS array offsets are relative to,
>> OFFSETS
>> +   array contains pairs of offsets in reverse order, always the end
>> offset
>> +   of some gap that needs protection followed by starting offset,
>> +   and DECLS is an array of representative decls for each var partition.
>> +   LENGTH is the length of the OFFSETS array, DECLS array is LENGTH / 2 -
>> 1
>> +   elements long (OFFSETS include gap before the first variable as well
>> +   as gaps after each stack variable).  */
>> +
>> +rtx
>> +asan_emit_stack_protection (rtx base, HOST_WIDE_INT *offsets, tree
>> *decls,
>> +                           int length)
>> +{
>> +  rtx shadow_base, shadow_mem, ret, mem;
>> +  unsigned char shadow_bytes[4];
>> +  HOST_WIDE_INT base_offset = offsets[length - 1], offset, prev_offset;
>> +  HOST_WIDE_INT last_offset, last_size;
>> +  int l;
>> +  unsigned char cur_shadow_byte = ASAN_STACK_MAGIC_LEFT;
>> +  static pretty_printer pp;
>> +  static bool pp_initialized;
>> +  const char *buf;
>> +  size_t len;
>> +  tree str_cst;
>> +
>> +  /* First of all, prepare the description string.  */
>> +  if (!pp_initialized)
>> +    {
>> +      pp_construct (&pp, /* prefix */NULL, /* line-width */0);
>> +      pp_initialized = true;
>> +    }
>> +  pp_clear_output_area (&pp);
>> +  if (DECL_NAME (current_function_decl))
>> +    pp_base_tree_identifier (&pp, DECL_NAME (current_function_decl));
>> +  else
>> +    pp_string (&pp, "<unknown>");
>> +  pp_space (&pp);
>> +  pp_decimal_int (&pp, length / 2 - 1);
>> +  pp_space (&pp);
>> +  for (l = length - 2; l; l -= 2)
>> +    {
>> +      tree decl = decls[l / 2 - 1];
>> +      pp_wide_integer (&pp, offsets[l] - base_offset);
>> +      pp_space (&pp);
>> +      pp_wide_integer (&pp, offsets[l - 1] - offsets[l]);
>> +      pp_space (&pp);
>> +      if (DECL_P (decl) && DECL_NAME (decl))
>> +       {
>> +         pp_decimal_int (&pp, IDENTIFIER_LENGTH (DECL_NAME (decl)));
>> +         pp_space (&pp);
>> +         pp_base_tree_identifier (&pp, DECL_NAME (decl));
>> +       }
>> +      else
>> +       pp_string (&pp, "9 <unknown>");
>> +      pp_space (&pp);
>> +    }
>> +  buf = pp_base_formatted_text (&pp);
>> +  len = strlen (buf);
>> +  str_cst = build_string (len + 1, buf);
>> +  TREE_TYPE (str_cst)
>> +    = build_array_type (char_type_node, build_index_type (size_int
>> (len)));
>> +  TREE_READONLY (str_cst) = 1;
>> +  TREE_STATIC (str_cst) = 1;
>> +  str_cst = build1 (ADDR_EXPR, build_pointer_type (char_type_node),
>> str_cst);
>> +
>> +  /* Emit the prologue sequence.  */
>> +  base = expand_binop (Pmode, add_optab, base, GEN_INT (base_offset),
>> +                      NULL_RTX, 1, OPTAB_DIRECT);
>> +  mem = gen_rtx_MEM (ptr_mode, base);
>> +  emit_move_insn (mem, GEN_INT (ASAN_STACK_FRAME_MAGIC));
>> +  mem = adjust_address (mem, VOIDmode, GET_MODE_SIZE (ptr_mode));
>> +  emit_move_insn (mem, expand_normal (str_cst));
>> +  shadow_base = expand_binop (Pmode, lshr_optab, base,
>> +                             GEN_INT (ASAN_SHADOW_SHIFT),
>> +                             NULL_RTX, 1, OPTAB_DIRECT);
>> +  shadow_base = expand_binop (Pmode, add_optab, shadow_base,
>> +                             GEN_INT (targetm.asan_shadow_offset ()),
>> +                             NULL_RTX, 1, OPTAB_DIRECT);
>> +  gcc_assert (asan_shadow_set != -1
>> +             && (ASAN_RED_ZONE_SIZE >> ASAN_SHADOW_SHIFT) == 4);
>> +  shadow_mem = gen_rtx_MEM (SImode, shadow_base);
>> +  set_mem_alias_set (shadow_mem, asan_shadow_set);
>> +  prev_offset = base_offset;
>> +  for (l = length; l; l -= 2)
>> +    {
>> +      if (l == 2)
>> +       cur_shadow_byte = ASAN_STACK_MAGIC_RIGHT;
>> +      offset = offsets[l - 1];
>> +      if ((offset - base_offset) & (ASAN_RED_ZONE_SIZE - 1))
>> +       {
>> +         int i;
>> +         HOST_WIDE_INT aoff
>> +           = base_offset + ((offset - base_offset)
>> +                            & ~(ASAN_RED_ZONE_SIZE - HOST_WIDE_INT_1));
>> +         shadow_mem = adjust_address (shadow_mem, VOIDmode,
>> +                                      (aoff - prev_offset)
>> +                                      >> ASAN_SHADOW_SHIFT);
>> +         prev_offset = aoff;
>> +         for (i = 0; i < 4; i++, aoff += (1 << ASAN_SHADOW_SHIFT))
>> +           if (aoff < offset)
>> +             {
>> +               if (aoff < offset - (1 << ASAN_SHADOW_SHIFT) + 1)
>> +                 shadow_bytes[i] = 0;
>> +               else
>> +                 shadow_bytes[i] = offset - aoff;
>> +             }
>> +           else
>> +             shadow_bytes[i] = ASAN_STACK_MAGIC_PARTIAL;
>> +         emit_move_insn (shadow_mem, asan_shadow_cst (shadow_bytes));
>> +         offset = aoff;
>> +       }
>> +      while (offset <= offsets[l - 2] - ASAN_RED_ZONE_SIZE)
>> +       {
>> +         shadow_mem = adjust_address (shadow_mem, VOIDmode,
>> +                                      (offset - prev_offset)
>> +                                      >> ASAN_SHADOW_SHIFT);
>> +         prev_offset = offset;
>> +         memset (shadow_bytes, cur_shadow_byte, 4);
>> +         emit_move_insn (shadow_mem, asan_shadow_cst (shadow_bytes));
>> +         offset += ASAN_RED_ZONE_SIZE;
>> +       }
>> +      cur_shadow_byte = ASAN_STACK_MAGIC_MIDDLE;
>> +    }
>> +  do_pending_stack_adjust ();
>> +
>> +  /* Construct epilogue sequence.  */
>> +  start_sequence ();
>> +
>> +  shadow_mem = gen_rtx_MEM (BLKmode, shadow_base);
>> +  set_mem_alias_set (shadow_mem, asan_shadow_set);
>> +  prev_offset = base_offset;
>> +  last_offset = base_offset;
>> +  last_size = 0;
>> +  for (l = length; l; l -= 2)
>> +    {
>> +      offset = base_offset + ((offsets[l - 1] - base_offset)
>> +                            & ~(ASAN_RED_ZONE_SIZE - HOST_WIDE_INT_1));
>> +      if (last_offset + last_size != offset)
>> +       {
>> +         shadow_mem = adjust_address (shadow_mem, VOIDmode,
>> +                                      (last_offset - prev_offset)
>> +                                      >> ASAN_SHADOW_SHIFT);
>> +         prev_offset = last_offset;
>> +         clear_storage (shadow_mem, GEN_INT (last_size >>
>> ASAN_SHADOW_SHIFT),
>> +                        BLOCK_OP_NORMAL);
>> +         last_offset = offset;
>> +         last_size = 0;
>> +       }
>> +      last_size += base_offset + ((offsets[l - 2] - base_offset)
>> +                                 & ~(ASAN_RED_ZONE_SIZE -
>> HOST_WIDE_INT_1))
>> +                  - offset;
>> +    }
>> +  if (last_size)
>> +    {
>> +      shadow_mem = adjust_address (shadow_mem, VOIDmode,
>> +                                  (last_offset - prev_offset)
>> +                                  >> ASAN_SHADOW_SHIFT);
>> +      clear_storage (shadow_mem, GEN_INT (last_size >>
>> ASAN_SHADOW_SHIFT),
>> +                    BLOCK_OP_NORMAL);
>> +    }
>> +
>> +  do_pending_stack_adjust ();
>> +
>> +  ret = get_insns ();
>> +  end_sequence ();
>> +  return ret;
>> +}
>> +
>>  /* Construct a function tree for __asan_report_{load,store}{1,2,4,8,16}.
>>     IS_STORE is either 1 (for a store) or 0 (for a load).
>>     SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
>> @@ -401,12 +588,12 @@ asan_finish_file (void)
>>  static void
>>  asan_init_shadow_ptr_types (void)
>>  {
>> -  alias_set_type set = new_alias_set ();
>> +  asan_shadow_set = new_alias_set ();
>>    shadow_ptr_types[0] = build_distinct_type_copy
>> (unsigned_char_type_node);
>> -  TYPE_ALIAS_SET (shadow_ptr_types[0]) = set;
>> +  TYPE_ALIAS_SET (shadow_ptr_types[0]) = asan_shadow_set;
>>    shadow_ptr_types[0] = build_pointer_type (shadow_ptr_types[0]);
>>    shadow_ptr_types[1] = build_distinct_type_copy
>> (short_unsigned_type_node);
>> -  TYPE_ALIAS_SET (shadow_ptr_types[1]) = set;
>> +  TYPE_ALIAS_SET (shadow_ptr_types[1]) = asan_shadow_set;
>>    shadow_ptr_types[1] = build_pointer_type (shadow_ptr_types[1]);
>>  }
>>
>> diff --git a/gcc/asan.h b/gcc/asan.h
>> index 0d9ab8b..6f0edbf 100644
>> --- a/gcc/asan.h
>> +++ b/gcc/asan.h
>> @@ -21,10 +21,39 @@ along with GCC; see the file COPYING3.  If not see
>>  #ifndef TREE_ASAN
>>  #define TREE_ASAN
>>
>> -extern void asan_finish_file(void);
>> +extern void asan_finish_file (void);
>> +extern rtx asan_emit_stack_protection (rtx, HOST_WIDE_INT *, tree *,
>> int);
>> +
>> +/* Alias set for accessing the shadow memory.  */
>> +extern alias_set_type asan_shadow_set;
>>
>>  /* Shadow memory is found at
>>     (address >> ASAN_SHADOW_SHIFT) + targetm.asan_shadow_offset ().  */
>>  #define ASAN_SHADOW_SHIFT      3
>>
>> +/* Red zone size, stack and global variables are padded by
>> ASAN_RED_ZONE_SIZE
>> +   up to 2 * ASAN_RED_ZONE_SIZE - 1 bytes.  */
>> +#define ASAN_RED_ZONE_SIZE     32
>> +
>> +/* Shadow memory values for stack protection.  Left is below protected
>> vars,
>> +   the first pointer in stack corresponding to that offset contains
>> +   ASAN_STACK_FRAME_MAGIC word, the second pointer to a string describing
>> +   the frame.  Middle is for padding in between variables, right is
>> +   above the last protected variable and partial immediately after
>> variables
>> +   up to ASAN_RED_ZONE_SIZE alignment.  */
>> +#define ASAN_STACK_MAGIC_LEFT          0xf1
>> +#define ASAN_STACK_MAGIC_MIDDLE                0xf2
>> +#define ASAN_STACK_MAGIC_RIGHT         0xf3
>> +#define ASAN_STACK_MAGIC_PARTIAL       0xf4
>> +
>> +#define ASAN_STACK_FRAME_MAGIC 0x41b58ab3
>> +
>> +/* Return true if DECL should be guarded on the stack.  */
>> +
>> +static inline bool
>> +asan_protect_stack_decl (tree decl)
>> +{
>> +  return DECL_P (decl) && !DECL_ARTIFICIAL (decl);
>> +}
>> +
>>  #endif /* TREE_ASAN */
>> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
>> index e501b4b..67cf902 100644
>> --- a/gcc/cfgexpand.c
>> +++ b/gcc/cfgexpand.c
>> @@ -47,6 +47,7 @@ along with GCC; see the file COPYING3.  If not see
>>  #include "cfgloop.h"
>>  #include "regs.h" /* For reg_renumber.  */
>>  #include "insn-attr.h" /* For INSN_SCHEDULING.  */
>> +#include "asan.h"
>>
>>  /* This variable holds information helping the rewriting of SSA trees
>>     into RTL.  */
>> @@ -736,6 +737,7 @@ partition_stack_vars (void)
>>      {
>>        size_t i = stack_vars_sorted[si];
>>        unsigned int ialign = stack_vars[i].alignb;
>> +      HOST_WIDE_INT isize = stack_vars[i].size;
>>
>>        /* Ignore objects that aren't partition representatives. If we
>>           see a var that is not a partition representative, it must
>> @@ -747,19 +749,28 @@ partition_stack_vars (void)
>>         {
>>           size_t j = stack_vars_sorted[sj];
>>           unsigned int jalign = stack_vars[j].alignb;
>> +         HOST_WIDE_INT jsize = stack_vars[j].size;
>>
>>           /* Ignore objects that aren't partition representatives.  */
>>           if (stack_vars[j].representative != j)
>>             continue;
>>
>> -         /* Ignore conflicting objects.  */
>> -         if (stack_var_conflict_p (i, j))
>> -           continue;
>> -
>>           /* Do not mix objects of "small" (supported) alignment
>>              and "large" (unsupported) alignment.  */
>>           if ((ialign * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT)
>>               != (jalign * BITS_PER_UNIT <=
>> MAX_SUPPORTED_STACK_ALIGNMENT))
>> +           break;
>> +
>> +         /* For Address Sanitizer do not mix objects with different
>> +            sizes, as the shorter vars wouldn't be adequately protected.
>> +            Don't do that for "large" (unsupported) alignment objects,
>> +            those aren't protected anyway.  */
>> +         if (flag_asan && isize != jsize
>> +             && ialign * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT)
>> +           break;
>> +
>> +         /* Ignore conflicting objects.  */
>> +         if (stack_var_conflict_p (i, j))
>>             continue;
>>
>>           /* UNION the objects, placing J at OFFSET.  */
>> @@ -837,12 +848,26 @@ expand_one_stack_var_at (tree decl, rtx base,
>> unsigned base_align,
>>    set_rtl (decl, x);
>>  }
>>
>> +DEF_VEC_I(HOST_WIDE_INT);
>> +DEF_VEC_ALLOC_I(HOST_WIDE_INT,heap);
>> +
>> +struct stack_vars_data
>> +{
>> +  /* Vector of offset pairs, always end of some padding followed
>> +     by start of the padding that needs Address Sanitizer protection.
>> +     The vector is in reversed, highest offset pairs come first.  */
>> +  VEC(HOST_WIDE_INT, heap) *asan_vec;
>> +
>> +  /* Vector of partition representative decls in between the paddings.
>> */
>> +  VEC(tree, heap) *asan_decl_vec;
>> +};
>> +
>>  /* A subroutine of expand_used_vars.  Give each partition representative
>>     a unique location within the stack frame.  Update each partition
>> member
>>     with that location.  */
>>
>>  static void
>> -expand_stack_vars (bool (*pred) (tree))
>> +expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
>>  {
>>    size_t si, i, j, n = stack_vars_num;
>>    HOST_WIDE_INT large_size = 0, large_alloc = 0;
>> @@ -913,13 +938,45 @@ expand_stack_vars (bool (*pred) (tree))
>>
>>        /* Check the predicate to see whether this variable should be
>>          allocated in this pass.  */
>> -      if (pred && !pred (decl))
>> +      if (pred && !pred (i))
>>         continue;
>>
>>        alignb = stack_vars[i].alignb;
>>        if (alignb * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT)
>>         {
>> -         offset = alloc_stack_frame_space (stack_vars[i].size, alignb);
>> +         if (flag_asan && pred)
>> +           {
>> +             HOST_WIDE_INT prev_offset = frame_offset;
>> +             tree repr_decl = NULL_TREE;
>> +
>> +             offset
>> +               = alloc_stack_frame_space (stack_vars[i].size
>> +                                          + ASAN_RED_ZONE_SIZE,
>> +                                          MAX (alignb,
>> ASAN_RED_ZONE_SIZE));
>> +             VEC_safe_push (HOST_WIDE_INT, heap, data->asan_vec,
>> +                            prev_offset);
>> +             VEC_safe_push (HOST_WIDE_INT, heap, data->asan_vec,
>> +                            offset + stack_vars[i].size);
>> +             /* Find best representative of the partition.
>> +                Prefer those with DECL_NAME, even better
>> +                satisfying asan_protect_stack_decl predicate.  */
>> +             for (j = i; j != EOC; j = stack_vars[j].next)
>> +               if (asan_protect_stack_decl (stack_vars[j].decl)
>> +                   && DECL_NAME (stack_vars[j].decl))
>> +                 {
>> +                   repr_decl = stack_vars[j].decl;
>> +                   break;
>> +                 }
>> +               else if (repr_decl == NULL_TREE
>> +                        && DECL_P (stack_vars[j].decl)
>> +                        && DECL_NAME (stack_vars[j].decl))
>> +                 repr_decl = stack_vars[j].decl;
>> +             if (repr_decl == NULL_TREE)
>> +               repr_decl = stack_vars[i].decl;
>> +             VEC_safe_push (tree, heap, data->asan_decl_vec, repr_decl);
>> +           }
>> +         else
>> +           offset = alloc_stack_frame_space (stack_vars[i].size, alignb);
>>           base = virtual_stack_vars_rtx;
>>           base_align = crtl->max_used_stack_slot_alignment;
>>         }
>> @@ -1057,8 +1114,9 @@ static bool
>>  defer_stack_allocation (tree var, bool toplevel)
>>  {
>>    /* If stack protection is enabled, *all* stack variables must be
>> deferred,
>> -     so that we can re-order the strings to the top of the frame.  */
>> -  if (flag_stack_protect)
>> +     so that we can re-order the strings to the top of the frame.
>> +     Similarly for Address Sanitizer.  */
>> +  if (flag_stack_protect || flag_asan)
>>      return true;
>>
>>    /* We handle "large" alignment via dynamic allocation.  We want to
>> handle
>> @@ -1329,15 +1387,31 @@ stack_protect_decl_phase (tree decl)
>>     as callbacks for expand_stack_vars.  */
>>
>>  static bool
>> -stack_protect_decl_phase_1 (tree decl)
>> +stack_protect_decl_phase_1 (size_t i)
>>  {
>> -  return stack_protect_decl_phase (decl) == 1;
>> +  return stack_protect_decl_phase (stack_vars[i].decl) == 1;
>>  }
>>
>>  static bool
>> -stack_protect_decl_phase_2 (tree decl)
>> +stack_protect_decl_phase_2 (size_t i)
>>  {
>> -  return stack_protect_decl_phase (decl) == 2;
>> +  return stack_protect_decl_phase (stack_vars[i].decl) == 2;
>> +}
>> +
>> +/* And helper function that checks for asan phase (with stack protector
>> +   it is phase 3).  This is used as callback for expand_stack_vars.
>> +   Returns true if any of the vars in the partition need to be protected.
>> */
>> +
>> +static bool
>> +asan_decl_phase_3 (size_t i)
>> +{
>> +  while (i != EOC)
>> +    {
>> +      if (asan_protect_stack_decl (stack_vars[i].decl))
>> +       return true;
>> +      i = stack_vars[i].next;
>> +    }
>> +  return false;
>>  }
>>
>>  /* Ensure that variables in different stack protection phases conflict
>> @@ -1448,11 +1522,12 @@ estimated_stack_frame_size (struct cgraph_node
>> *node)
>>
>>  /* Expand all variables used in the function.  */
>>
>> -static void
>> +static rtx
>>  expand_used_vars (void)
>>  {
>>    tree var, outer_block = DECL_INITIAL (current_function_decl);
>>    VEC(tree,heap) *maybe_local_decls = NULL;
>> +  rtx var_end_seq = NULL_RTX;
>>    struct pointer_map_t *ssa_name_decls;
>>    unsigned i;
>>    unsigned len;
>> @@ -1603,6 +1678,11 @@ expand_used_vars (void)
>>    /* Assign rtl to each variable based on these partitions.  */
>>    if (stack_vars_num > 0)
>>      {
>> +      struct stack_vars_data data;
>> +
>> +      data.asan_vec = NULL;
>> +      data.asan_decl_vec = NULL;
>> +
>>        /* Reorder decls to be protected by iterating over the variables
>>          array multiple times, and allocating out of each phase in turn.
>> */
>>        /* ??? We could probably integrate this into the qsort we did
>> @@ -1611,14 +1691,41 @@ expand_used_vars (void)
>>        if (has_protected_decls)
>>         {
>>           /* Phase 1 contains only character arrays.  */
>> -         expand_stack_vars (stack_protect_decl_phase_1);
>> +         expand_stack_vars (stack_protect_decl_phase_1, &data);
>>
>>           /* Phase 2 contains other kinds of arrays.  */
>>           if (flag_stack_protect == 2)
>> -           expand_stack_vars (stack_protect_decl_phase_2);
>> +           expand_stack_vars (stack_protect_decl_phase_2, &data);
>> +       }
>> +
>> +      if (flag_asan)
>> +       /* Phase 3, any partitions that need asan protection
>> +          in addition to phase 1 and 2.  */
>> +       expand_stack_vars (asan_decl_phase_3, &data);
>> +
>> +      if (!VEC_empty (HOST_WIDE_INT, data.asan_vec))
>> +       {
>> +         HOST_WIDE_INT prev_offset = frame_offset;
>> +         HOST_WIDE_INT offset
>> +           = alloc_stack_frame_space (ASAN_RED_ZONE_SIZE,
>> +                                      ASAN_RED_ZONE_SIZE);
>> +         VEC_safe_push (HOST_WIDE_INT, heap, data.asan_vec, prev_offset);
>> +         VEC_safe_push (HOST_WIDE_INT, heap, data.asan_vec, offset);
>> +
>> +         var_end_seq
>> +           = asan_emit_stack_protection (virtual_stack_vars_rtx,
>> +                                         VEC_address (HOST_WIDE_INT,
>> +                                                      data.asan_vec),
>> +                                         VEC_address (tree,
>> +
>> data.asan_decl_vec),
>> +                                         VEC_length (HOST_WIDE_INT,
>> +                                                     data.asan_vec));
>>         }
>>
>> -      expand_stack_vars (NULL);
>> +      expand_stack_vars (NULL, &data);
>> +
>> +      VEC_free (HOST_WIDE_INT, heap, data.asan_vec);
>> +      VEC_free (tree, heap, data.asan_decl_vec);
>>      }
>>
>>    fini_vars_expansion ();
>> @@ -1645,6 +1752,8 @@ expand_used_vars (void)
>>         frame_offset += align - 1;
>>        frame_offset &= -align;
>>      }
>> +
>> +  return var_end_seq;
>>  }
>>
>>
>> @@ -3662,7 +3771,7 @@ expand_debug_locations (void)
>>  /* Expand basic block BB from GIMPLE trees to RTL.  */
>>
>>  static basic_block
>> -expand_gimple_basic_block (basic_block bb)
>> +expand_gimple_basic_block (basic_block bb, bool disable_tail_calls)
>>  {
>>    gimple_stmt_iterator gsi;
>>    gimple_seq stmts;
>> @@ -3950,6 +4059,11 @@ expand_gimple_basic_block (basic_block bb)
>>         }
>>        else
>>         {
>> +         if (is_gimple_call (stmt)
>> +             && gimple_call_tail_p (stmt)
>> +             && disable_tail_calls)
>> +           gimple_call_set_tail (stmt, false);
>> +
>>           if (is_gimple_call (stmt) && gimple_call_tail_p (stmt))
>>             {
>>               bool can_fallthru;
>> @@ -4309,7 +4423,7 @@ gimple_expand_cfg (void)
>>    sbitmap blocks;
>>    edge_iterator ei;
>>    edge e;
>> -  rtx var_seq;
>> +  rtx var_seq, var_ret_seq;
>>    unsigned i;
>>
>>    timevar_push (TV_OUT_OF_SSA);
>> @@ -4369,7 +4483,7 @@ gimple_expand_cfg (void)
>>    timevar_push (TV_VAR_EXPAND);
>>    start_sequence ();
>>
>> -  expand_used_vars ();
>> +  var_ret_seq = expand_used_vars ();
>>
>>    var_seq = get_insns ();
>>    end_sequence ();
>> @@ -4495,7 +4609,7 @@ gimple_expand_cfg (void)
>>
>>    lab_rtx_for_bb = pointer_map_create ();
>>    FOR_BB_BETWEEN (bb, init_block->next_bb, EXIT_BLOCK_PTR, next_bb)
>> -    bb = expand_gimple_basic_block (bb);
>> +    bb = expand_gimple_basic_block (bb, var_ret_seq != NULL_RTX);
>>
>>    if (MAY_HAVE_DEBUG_INSNS)
>>      expand_debug_locations ();
>> @@ -4523,6 +4637,9 @@ gimple_expand_cfg (void)
>>    construct_exit_block ();
>>    insn_locations_finalize ();
>>
>> +  if (var_ret_seq)
>> +    emit_insn_after (var_ret_seq, return_label);
>> +
>>    /* Zap the tree EH table.  */
>>    set_eh_throw_stmt_table (cfun, NULL);
>>
>> diff --git a/gcc/toplev.c b/gcc/toplev.c
>> index 68849f5..0fa8ce3 100644
>> --- a/gcc/toplev.c
>> +++ b/gcc/toplev.c
>> @@ -1542,7 +1542,9 @@ process_options (void)
>>      }
>>
>>    /* Address Sanitizer needs porting to each target architecture.  */
>> -  if (flag_asan && targetm.asan_shadow_offset == NULL)
>> +  if (flag_asan
>> +      && (targetm.asan_shadow_offset == NULL
>> +         || !FRAME_GROWS_DOWNWARD))
>>      {
>>        warning (0, "-fasan not supported for this target");
>>        flag_asan = 0;
>> --
>> 1.7.11.7
>>
>
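
A minimal sketch of the shadow encoding used in the prologue loop quoted
above, assuming a little-endian target and 8-byte shadow granules.  The
names shadow_address, partial_shadow_bytes and pack_shadow_word_le are
hypothetical helpers for illustration and are not part of the patch.  The
shadow offset is passed in as a plain parameter instead of coming from
targetm.asan_shadow_offset ().

```c
#include <assert.h>
#include <stdint.h>

#define SHADOW_SHIFT   3     /* ASAN_SHADOW_SHIFT in the patch */
#define RED_ZONE_SIZE  32    /* ASAN_RED_ZONE_SIZE in the patch */
#define MAGIC_PARTIAL  0xf4  /* ASAN_STACK_MAGIC_PARTIAL in the patch */

/* Shadow byte address for ADDR.  SHADOW_OFFSET stands in for the value
   a target would return from its asan_shadow_offset hook (assumption:
   the caller supplies it).  */
static uint64_t
shadow_address (uint64_t addr, uint64_t shadow_offset)
{
  return (addr >> SHADOW_SHIFT) + shadow_offset;
}

/* Fill OUT with the 4 shadow bytes of the 32-byte chunk in which a
   variable ends.  TAIL is the number of variable bytes in that chunk.
   A fully addressable granule gets 0, a granule with only its first K
   bytes addressable gets K, and the rest get the partial magic.  */
static void
partial_shadow_bytes (unsigned tail, unsigned char out[4])
{
  const unsigned granule = 1u << SHADOW_SHIFT;
  unsigned i;

  assert (tail > 0 && tail < RED_ZONE_SIZE);
  for (i = 0; i < 4; i++)
    {
      unsigned start = i * granule;
      if (start + granule <= tail)
        out[i] = 0;
      else if (start < tail)
        out[i] = (unsigned char) (tail - start);
      else
        out[i] = MAGIC_PARTIAL;
    }
}

/* Pack 4 shadow bytes into one 32-bit store value, lowest address in
   the least significant byte (little-endian only in this sketch).  */
static uint32_t
pack_shadow_word_le (const unsigned char bytes[4])
{
  uint32_t val = 0;
  int i;

  for (i = 0; i < 4; i++)
    val |= (uint32_t) bytes[i] << (8 * i);
  return val;
}
```

With 10 variable bytes in the final chunk, the shadow bytes come out as
00 02 f4 f4, which this sketch packs into the single word 0xf4f40200.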

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 06/13] Implement protection of stack variables
       [not found]   ` <CAGQ9bdweH8Pn=8vLTNa8FSzAh92OYrWScxK78n9znCodADJUvw@mail.gmail.com>
  2012-11-02  4:35     ` Xinliang David Li
@ 2012-11-02 14:44     ` Dodji Seketeli
       [not found]       ` <CAGQ9bdxQG3i=BrSYmaN-ssdv4omW6F5VTg50viskKNcYrF-8BQ@mail.gmail.com>
  1 sibling, 1 reply; 80+ messages in thread
From: Dodji Seketeli @ 2012-11-02 14:44 UTC (permalink / raw)
  To: Konstantin Serebryany; +Cc: gcc-patches, dnovillo, jakub, wmi, davidxl

Konstantin Serebryany <konstantin.s.serebryany@gmail.com> writes:

>> [A cultural question I've kept asking myself is Why has address
>>  sanitizer authors called these red zones (LEFT, MIDDLE, RIGHT)
>>  instead of e.g, (BOTTOM, MIDDLE, TOP).  Maybe they can step up and
>>  educate me so that I get less confused in the future.  :-)]
>>
>
> Ha! Good question. I guess that's related to the way we explained it in the
> paper,
> where the chunk of memory was typeset horizontally to save space.

Ah, which paper?  The only 'paper' I have seen is the PDF of the talk
you gave at GNU Cauldron this summer[1], and it didn't explain the stack
protection scheme in those terms or in that detail.

[1]: http://gcc.gnu.org/wiki/cauldron2012?action=AttachFile&do=get&target=kcc.pdf

> Btw, are we still using -fasan option, or did we change it to
> -faddress-sanitizer?

The latter.  As I said in my reply to David, I am going to resubmit a
patch that exposes that change as part of the initial import patch of
the series.

Cheers.

-- 
		Dodji

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 06/13] Implement protection of stack variables
  2012-11-02  4:35     ` Xinliang David Li
@ 2012-11-02 15:25       ` Dodji Seketeli
  0 siblings, 0 replies; 80+ messages in thread
From: Dodji Seketeli @ 2012-11-02 15:25 UTC (permalink / raw)
  To: Xinliang David Li
  Cc: Konstantin Serebryany, GCC Patches, Diego Novillo, Jakub Jelinek, Wei Mi

Xinliang David Li <davidxl@google.com> writes:

> Changing the option is part of the plan.

Indeed.

> Dodji, can you make the option change part of one the patches (e.g,
> the first one that introduces it) -- there seems no need for a
> separate patch for it.

Sure thing.  I have done the change on my local tree.  I'll re-submit
the patch a bit later.  I am doing a bit of patch merging along with
that one, in reply to the comments made by Joseph in another subthread.

Cheers.

-- 
		Dodji

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 06/13] Implement protection of stack variables
       [not found]       ` <CAGQ9bdxQG3i=BrSYmaN-ssdv4omW6F5VTg50viskKNcYrF-8BQ@mail.gmail.com>
@ 2012-11-02 16:02         ` Dodji Seketeli
  0 siblings, 0 replies; 80+ messages in thread
From: Dodji Seketeli @ 2012-11-02 16:02 UTC (permalink / raw)
  To: Konstantin Serebryany; +Cc: gcc-patches, dnovillo, jakub, wmi, davidxl

Konstantin Serebryany <konstantin.s.serebryany@gmail.com> writes:

> http://research.google.com/pubs/archive/37752.pdf
> The horizontal drawing is given in section 3.3 and hence the redzones there
> are called left/right.
> The stack poisoning is only explained using an example in C.

Great, thanks.  This makes it easier to understand the whole thing than
staring at source code and asm dumps of asan@{llvm,gcc}.  :)
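
A minimal sketch of the left/middle/right layout discussed here, assuming
the magic values and the 32-byte red zone from the stack protection patch,
and listing variables from the lowest address upwards.  frame_shadow is a
hypothetical helper, not a function from asan.c.  It only models the
shadow bytes and ignores the frame magic word and the description string
stored in the left red zone.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum
{
  SHIFT = 3,            /* ASAN_SHADOW_SHIFT */
  RED_ZONE = 32,        /* ASAN_RED_ZONE_SIZE */
  MAGIC_LEFT = 0xf1,    /* below the protected variables */
  MAGIC_MIDDLE = 0xf2,  /* between two variables */
  MAGIC_RIGHT = 0xf3,   /* above the last protected variable */
  MAGIC_PARTIAL = 0xf4  /* after a variable, up to red zone alignment */
};

/* Write into SHADOW the shadow bytes of a frame holding NVARS variables
   of SIZES[] bytes each.  Returns the number of shadow bytes written,
   or 0 if CAP is too small.  */
static size_t
frame_shadow (const size_t *sizes, size_t nvars,
              unsigned char *shadow, size_t cap)
{
  const size_t granule = (size_t) 1 << SHIFT;
  const size_t per_zone = RED_ZONE >> SHIFT;
  size_t need = per_zone, n = 0, v, off;

  assert (sizes != NULL && shadow != NULL);

  /* Each variable gets a slot of size + RED_ZONE, rounded up to a
     multiple of RED_ZONE.  */
  for (v = 0; v < nvars; v++)
    need += (sizes[v] + 2 * RED_ZONE - 1) / RED_ZONE * per_zone;
  if (need > cap)
    return 0;

  memset (shadow, MAGIC_LEFT, per_zone);
  n = per_zone;

  for (v = 0; v < nvars; v++)
    {
      size_t size = sizes[v];
      size_t slot = (size + 2 * RED_ZONE - 1) / RED_ZONE * RED_ZONE;
      size_t aligned_end = (size + RED_ZONE - 1) / RED_ZONE * RED_ZONE;
      unsigned char zone = v + 1 == nvars ? MAGIC_RIGHT : MAGIC_MIDDLE;

      for (off = 0; off < slot; off += granule)
        {
          if (off + granule <= size)
            shadow[n++] = 0;                      /* fully addressable */
          else if (off < size)
            shadow[n++] = (unsigned char) (size - off);
          else if (off < aligned_end)
            shadow[n++] = MAGIC_PARTIAL;
          else
            shadow[n++] = zone;
        }
    }
  return n;
}
```

Typeset on one line from low to high addresses, a frame with a 10-byte and
a 32-byte variable reads as left red zone, variable, partial padding,
middle red zone, variable, right red zone, which matches the horizontal
picture behind the LEFT/RIGHT naming.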

Cheers.

-- 
		Dodji

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 02/13] Rename tree-asan.[ch] to asan.[ch]
  2012-11-01 21:54   ` Joseph S. Myers
@ 2012-11-02 22:44     ` Dodji Seketeli
  0 siblings, 0 replies; 80+ messages in thread
From: Dodji Seketeli @ 2012-11-02 22:44 UTC (permalink / raw)
  To: Joseph S. Myers
  Cc: gcc-patches, dnovillo, jakub, wmi, davidxl, konstantin.s.serebryany

"Joseph S. Myers" <joseph@codesourcery.com> writes:

> On Thu, 1 Nov 2012, dodji@redhat.com wrote:
>
>> From: dnovillo <dnovillo@138bc75d-0d04-0410-961f-82ee72b054a4>
>> 
>> Following a discussion we had on this list, this patch renames the
>> file tree-asan.* into asan.*.
>> 
>>     	* asan.c: Rename from tree-asan.c.
>>     	Update all users.
>>     	* asan.h: Rename from tree-asan.h
>>     	Update all users.
>
> Patch series submissions for mainline should be cleanly rebased, with each 
> patch as a logical part of the intended eventual changes; they should not 
> be a dump of the successive stages by which the patch was developed.
>
> It's reasonable to have an initial patch that adds the skeleton of a 
> feature, then subsequent patches that add well-defined additional features 
> to it.  The following are examples of patch series structures that are not 
> appropriate:
>
> * This sort of adding a file under one name in one patch, then renaming in 
> a later patch of the series.
>
> * Introducing a known bug in one patch in the series, where a subsequent 
> patch in the series is the fix, unless the fix really depends on 
> intermediate patches in the series

I agree with this line of reasoning;  I tried to squash and split
the patches of the set to comply with your request.  I'll be
posting a new patch set accordingly.

Sorry for the nuisance.

-- 
		Dodji

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 00/13] Request to merge Address Sanitizer in
  2012-11-01 19:53 [PATCH 00/13] Request to merge Address Sanitizer in dodji
                   ` (11 preceding siblings ...)
  2012-11-01 19:54 ` [PATCH 04/13] Emit GIMPLE directly instead of gimplifying GENERIC dodji
@ 2012-11-02 22:53 ` Dodji Seketeli
  2012-11-02 22:56   ` [PATCH 01/10] Initial import of asan from the Google branch into trunk Dodji Seketeli
                     ` (11 more replies)
  2012-11-12 20:39 ` H.J. Lu
  2012-11-15 19:42 ` Jack Howarth
  14 siblings, 12 replies; 80+ messages in thread
From: Dodji Seketeli @ 2012-11-02 22:53 UTC (permalink / raw)
  To: gcc-patches; +Cc: dnovillo, jakub, wmi, davidxl, konstantin.s.serebryany

dodji@redhat.com writes:

> The first patch is the initial import of the asan state from the
> Google branch into the [asan] branch.  Subsequent patches clean the
> code up, add features like protection of stack and global variables,
> instrumentation of memory access through built-in functions, and, last
> but not least, the import of the runtime library.

In reply to requests in this thread, I am going to post another patch
set that follows the same grouping as the one above, but that avoids
being a dump of the different commits that happened on the branch.

I made some changes that were requested by some reviewers, like changing
the -fasan flag into -faddress-sanitizer, incorporating the last changes
(to the runtime library directory layout) from Wei that got
committed to the asan branch, and removing the uselessly included
headers from the asan.c file.  I also rebased the patches on top of
today's trunk.

Below is the new summary of the patch set.  It has been bootstrapped and
passed regression testing on x86_64-unknown-linux-gnu against trunk.

Diego Novillo (1):
  Initial import of asan from the Google branch

Dodji Seketeli (3):
  Make build_check_stmt accept an SSA_NAME for its base
  Factorize condition insertion code out of build_check_stmt
  Instrument built-in memory access function calls

Jakub Jelinek (5):
  Initial asan cleanups
  Emit GIMPLE directly instead of gimplifying GENERIC.
  Allow asan at -O0
  Implement protection of stack variables
  Implement protection of global variables

Wei Mi (1):
  Import the asan runtime library into GCC tree

 ChangeLog.asan                                     |    16 +
 Makefile.def                                       |     2 +
 Makefile.in                                        |   487 +-
 configure                                          |     1 +
 configure.ac                                       |     1 +
 gcc/ChangeLog.asan                                 |   159 +
 gcc/Makefile.in                                    |    10 +-
 gcc/asan.c                                         |  1483 ++
 gcc/asan.h                                         |    70 +
 gcc/cfgexpand.c                                    |   165 +-
 gcc/common.opt                                     |     4 +
 gcc/config/i386/i386.c                             |    11 +
 gcc/doc/invoke.texi                                |    13 +-
 gcc/doc/tm.texi                                    |     6 +
 gcc/doc/tm.texi.in                                 |     2 +
 gcc/gcc.c                                          |     1 +
 gcc/passes.c                                       |     2 +
 gcc/target.def                                     |    11 +
 gcc/toplev.c                                       |    14 +
 gcc/tree-pass.h                                    |     2 +
 gcc/varasm.c                                       |    22 +
 libsanitizer/ChangeLog.asan                        |     3 +
 libsanitizer/LICENSE.TXT                           |    97 +
 libsanitizer/Makefile.am                           |    46 +
 libsanitizer/Makefile.in                           |   773 +
 libsanitizer/README.gcc                            |     4 +
 libsanitizer/aclocal.m4                            |  9599 ++++++++++
 libsanitizer/asan/Makefile.am                      |    76 +
 libsanitizer/asan/Makefile.in                      |   631 +
 libsanitizer/asan/asan_allocator.cc                |  1045 ++
 libsanitizer/asan/asan_allocator.h                 |   177 +
 libsanitizer/asan/asan_flags.h                     |   103 +
 libsanitizer/asan/asan_globals.cc                  |   206 +
 libsanitizer/asan/asan_intercepted_functions.h     |   217 +
 libsanitizer/asan/asan_interceptors.cc             |   704 +
 libsanitizer/asan/asan_interceptors.h              |    39 +
 libsanitizer/asan/asan_internal.h                  |   169 +
 libsanitizer/asan/asan_linux.cc                    |   150 +
 libsanitizer/asan/asan_lock.h                      |    40 +
 libsanitizer/asan/asan_mac.cc                      |   526 +
 libsanitizer/asan/asan_mac.h                       |    54 +
 libsanitizer/asan/asan_malloc_linux.cc             |   142 +
 libsanitizer/asan/asan_malloc_mac.cc               |   427 +
 libsanitizer/asan/asan_malloc_win.cc               |   140 +
 libsanitizer/asan/asan_mapping.h                   |   120 +
 libsanitizer/asan/asan_new_delete.cc               |    54 +
 libsanitizer/asan/asan_poisoning.cc                |   151 +
 libsanitizer/asan/asan_posix.cc                    |   118 +
 libsanitizer/asan/asan_report.cc                   |   492 +
 libsanitizer/asan/asan_report.h                    |    51 +
 libsanitizer/asan/asan_rtl.cc                      |   404 +
 libsanitizer/asan/asan_stack.cc                    |    35 +
 libsanitizer/asan/asan_stack.h                     |    52 +
 libsanitizer/asan/asan_stats.cc                    |    86 +
 libsanitizer/asan/asan_stats.h                     |    65 +
 libsanitizer/asan/asan_thread.cc                   |   153 +
 libsanitizer/asan/asan_thread.h                    |   103 +
 libsanitizer/asan/asan_thread_registry.cc          |   188 +
 libsanitizer/asan/asan_thread_registry.h           |    83 +
 libsanitizer/asan/asan_win.cc                      |   190 +
 libsanitizer/asan/libtool-version                  |     6 +
 libsanitizer/config.guess                          |  1530 ++
 libsanitizer/config.sub                            |  1773 ++
 libsanitizer/configure                             | 17589 +++++++++++++++++++
 libsanitizer/configure.ac                          |    42 +
 libsanitizer/depcomp                               |   630 +
 libsanitizer/include/sanitizer/asan_interface.h    |   197 +
 .../include/sanitizer/common_interface_defs.h      |    66 +
 libsanitizer/install-sh                            |   527 +
 libsanitizer/interception/Makefile.am              |    59 +
 libsanitizer/interception/Makefile.in              |   535 +
 libsanitizer/interception/interception.h           |   195 +
 libsanitizer/interception/interception_linux.cc    |    28 +
 libsanitizer/interception/interception_linux.h     |    35 +
 libsanitizer/interception/interception_mac.cc      |    29 +
 libsanitizer/interception/interception_mac.h       |    47 +
 libsanitizer/interception/interception_win.cc      |   149 +
 libsanitizer/interception/interception_win.h       |    43 +
 libsanitizer/libtool-version                       |     6 +
 libsanitizer/ltmain.sh                             |  9661 ++++++++++
 libsanitizer/missing                               |   376 +
 libsanitizer/sanitizer_common/Makefile.am          |    71 +
 libsanitizer/sanitizer_common/Makefile.in          |   564 +
 .../sanitizer_common/sanitizer_allocator.cc        |    83 +
 .../sanitizer_common/sanitizer_allocator64.h       |   573 +
 libsanitizer/sanitizer_common/sanitizer_atomic.h   |    63 +
 .../sanitizer_common/sanitizer_atomic_clang.h      |   120 +
 .../sanitizer_common/sanitizer_atomic_msvc.h       |   134 +
 libsanitizer/sanitizer_common/sanitizer_common.cc  |   151 +
 libsanitizer/sanitizer_common/sanitizer_common.h   |   181 +
 libsanitizer/sanitizer_common/sanitizer_flags.cc   |    95 +
 libsanitizer/sanitizer_common/sanitizer_flags.h    |    25 +
 .../sanitizer_common/sanitizer_internal_defs.h     |   186 +
 libsanitizer/sanitizer_common/sanitizer_libc.cc    |   189 +
 libsanitizer/sanitizer_common/sanitizer_libc.h     |    69 +
 libsanitizer/sanitizer_common/sanitizer_linux.cc   |   296 +
 libsanitizer/sanitizer_common/sanitizer_list.h     |   118 +
 libsanitizer/sanitizer_common/sanitizer_mac.cc     |   249 +
 libsanitizer/sanitizer_common/sanitizer_mutex.h    |   106 +
 .../sanitizer_common/sanitizer_placement_new.h     |    31 +
 libsanitizer/sanitizer_common/sanitizer_posix.cc   |   187 +
 libsanitizer/sanitizer_common/sanitizer_printf.cc  |   196 +
 libsanitizer/sanitizer_common/sanitizer_procmaps.h |    95 +
 .../sanitizer_common/sanitizer_stackdepot.cc       |   194 +
 .../sanitizer_common/sanitizer_stackdepot.h        |    27 +
 .../sanitizer_common/sanitizer_stacktrace.cc       |   245 +
 .../sanitizer_common/sanitizer_stacktrace.h        |    73 +
 .../sanitizer_common/sanitizer_symbolizer.cc       |   311 +
 .../sanitizer_common/sanitizer_symbolizer.h        |    97 +
 .../sanitizer_common/sanitizer_symbolizer_linux.cc |   162 +
 .../sanitizer_common/sanitizer_symbolizer_mac.cc   |    31 +
 .../sanitizer_common/sanitizer_symbolizer_win.cc   |    33 +
 libsanitizer/sanitizer_common/sanitizer_win.cc     |   205 +
 113 files changed, 58851 insertions(+), 27 deletions(-)
 create mode 100644 ChangeLog.asan
 create mode 100644 gcc/ChangeLog.asan
 create mode 100644 gcc/asan.c
 create mode 100644 gcc/asan.h
 create mode 100644 libsanitizer/ChangeLog.asan
 create mode 100644 libsanitizer/LICENSE.TXT
 create mode 100644 libsanitizer/Makefile.am
 create mode 100644 libsanitizer/Makefile.in
 create mode 100644 libsanitizer/README.gcc
 create mode 100644 libsanitizer/aclocal.m4
 create mode 100644 libsanitizer/asan/Makefile.am
 create mode 100644 libsanitizer/asan/Makefile.in
 create mode 100644 libsanitizer/asan/asan_allocator.cc
 create mode 100644 libsanitizer/asan/asan_allocator.h
 create mode 100644 libsanitizer/asan/asan_flags.h
 create mode 100644 libsanitizer/asan/asan_globals.cc
 create mode 100644 libsanitizer/asan/asan_intercepted_functions.h
 create mode 100644 libsanitizer/asan/asan_interceptors.cc
 create mode 100644 libsanitizer/asan/asan_interceptors.h
 create mode 100644 libsanitizer/asan/asan_internal.h
 create mode 100644 libsanitizer/asan/asan_linux.cc
 create mode 100644 libsanitizer/asan/asan_lock.h
 create mode 100644 libsanitizer/asan/asan_mac.cc
 create mode 100644 libsanitizer/asan/asan_mac.h
 create mode 100644 libsanitizer/asan/asan_malloc_linux.cc
 create mode 100644 libsanitizer/asan/asan_malloc_mac.cc
 create mode 100644 libsanitizer/asan/asan_malloc_win.cc
 create mode 100644 libsanitizer/asan/asan_mapping.h
 create mode 100644 libsanitizer/asan/asan_new_delete.cc
 create mode 100644 libsanitizer/asan/asan_poisoning.cc
 create mode 100644 libsanitizer/asan/asan_posix.cc
 create mode 100644 libsanitizer/asan/asan_report.cc
 create mode 100644 libsanitizer/asan/asan_report.h
 create mode 100644 libsanitizer/asan/asan_rtl.cc
 create mode 100644 libsanitizer/asan/asan_stack.cc
 create mode 100644 libsanitizer/asan/asan_stack.h
 create mode 100644 libsanitizer/asan/asan_stats.cc
 create mode 100644 libsanitizer/asan/asan_stats.h
 create mode 100644 libsanitizer/asan/asan_thread.cc
 create mode 100644 libsanitizer/asan/asan_thread.h
 create mode 100644 libsanitizer/asan/asan_thread_registry.cc
 create mode 100644 libsanitizer/asan/asan_thread_registry.h
 create mode 100644 libsanitizer/asan/asan_win.cc
 create mode 100644 libsanitizer/asan/libtool-version
 create mode 100644 libsanitizer/config.guess
 create mode 100644 libsanitizer/config.sub
 create mode 100755 libsanitizer/configure
 create mode 100644 libsanitizer/configure.ac
 create mode 100644 libsanitizer/depcomp
 create mode 100644 libsanitizer/include/sanitizer/asan_interface.h
 create mode 100644 libsanitizer/include/sanitizer/common_interface_defs.h
 create mode 100644 libsanitizer/install-sh
 create mode 100644 libsanitizer/interception/Makefile.am
 create mode 100644 libsanitizer/interception/Makefile.in
 create mode 100644 libsanitizer/interception/interception.h
 create mode 100644 libsanitizer/interception/interception_linux.cc
 create mode 100644 libsanitizer/interception/interception_linux.h
 create mode 100644 libsanitizer/interception/interception_mac.cc
 create mode 100644 libsanitizer/interception/interception_mac.h
 create mode 100644 libsanitizer/interception/interception_win.cc
 create mode 100644 libsanitizer/interception/interception_win.h
 create mode 100644 libsanitizer/libtool-version
 create mode 100644 libsanitizer/ltmain.sh
 create mode 100644 libsanitizer/missing
 create mode 100644 libsanitizer/sanitizer_common/Makefile.am
 create mode 100644 libsanitizer/sanitizer_common/Makefile.in
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_allocator.cc
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_allocator64.h
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_atomic.h
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_atomic_clang.h
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_atomic_msvc.h
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_common.cc
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_common.h
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_flags.cc
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_flags.h
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_internal_defs.h
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_libc.cc
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_libc.h
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_linux.cc
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_list.h
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_mac.cc
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_mutex.h
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_placement_new.h
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_posix.cc
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_printf.cc
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_procmaps.h
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_stackdepot.cc
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_stackdepot.h
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_stacktrace.cc
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_stacktrace.h
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_symbolizer.cc
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_symbolizer.h
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_symbolizer_linux.cc
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_symbolizer_mac.cc
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_symbolizer_win.cc
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_win.cc

-- 
		Dodji

^ permalink raw reply	[flat|nested] 80+ messages in thread

* [PATCH 01/10] Initial import of asan from the Google branch into trunk
  2012-11-02 22:53 ` [PATCH 00/13] Request to merge Address Sanitizer in Dodji Seketeli
@ 2012-11-02 22:56   ` Dodji Seketeli
  2012-11-06 17:04     ` Diego Novillo
  2012-11-09 13:14     ` Tobias Burnus
  2012-11-02 22:57   ` [PATCH 02/10] Initial asan cleanups Dodji Seketeli
                     ` (10 subsequent siblings)
  11 siblings, 2 replies; 80+ messages in thread
From: Dodji Seketeli @ 2012-11-02 22:56 UTC (permalink / raw)
  To: gcc-patches; +Cc: dnovillo, jakub, wmi, davidxl, konstantin.s.serebryany

This patch imports the initial state of asan as it was in the
Google branch.

It provides basic infrastructure for asan to instrument memory
accesses on the heap, at -O3.  Note that it supports neither stack nor
global variable protection.

The remaining patches in the set are intended to further improve this
base.
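
For readers unfamiliar with the check that the pass emits, here is a
minimal sketch in plain C of the shadow-memory test described in the
asan.c header comment.  It is an illustration only, not the code the
pass generates: the names toy_shadow_addr, toy_poison_after and
toy_access_is_bad, and the 64-byte toy region backed by an array, are
assumptions made for this example.  The shift of 3 and the offsets
1 << 29 and 1 << 44 are the values used in asan.c.  The 16-byte case
(which loads a short from the shadow) is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model (hypothetical): "addresses" are offsets into a 64-byte
   region, and the shadow lives in a small array rather than at
   (X >> 3) + Offset in the process address space.  */
enum { SHADOW_SHIFT = 3, APP_BYTES = 64 };

static signed char shadow[APP_BYTES >> SHADOW_SHIFT];

/* Shadow address as computed in asan.c: (X >> 3) + (1 << offset_log),
   with offset_log 44 for 64-bit pointers and 29 for 32-bit ones.  */
static uint64_t
toy_shadow_addr (uint64_t x, int is_64)
{
  return (x >> SHADOW_SHIFT) + ((uint64_t) 1 << (is_64 ? 44 : 29));
}

/* Mark the first SIZE bytes of the region addressable and poison the
   rest.  A shadow byte of 0 means all 8 bytes are addressable, k > 0
   means only the first k bytes are, negative means none are.  */
static void
toy_poison_after (unsigned size)
{
  unsigned i;

  for (i = 0; i < sizeof shadow; i++)
    {
      unsigned start = i << SHADOW_SHIFT;

      if (size >= start + 8)
        shadow[i] = 0;
      else if (size > start)
        shadow[i] = (signed char) (size - start);
      else
        shadow[i] = -1;
    }
}

/* Return 1 if an N-byte access at offset X would be reported.  */
static int
toy_access_is_bad (unsigned x, unsigned n)
{
  signed char shadow_value;

  assert (n == 1 || n == 2 || n == 4 || n == 8);
  assert (x + n <= APP_BYTES);

  shadow_value = shadow[x >> SHADOW_SHIFT];
  if (shadow_value == 0)
    return 0;                   /* Fast path: whole granule is valid.  */
  if (n == 8)
    return 1;                   /* Any non-zero shadow is an error.  */
  /* Slow path for 1-, 2- and 4-byte accesses.  */
  return (signed char) ((x & 7) + n - 1) >= shadow_value;
}
```

With toy_poison_after (13), a 4-byte access at offset 8 passes, while a
4-byte access at offset 10 is flagged because its last byte (offset 13)
lies past the addressable part of the granule.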

	* Makefile.in: Add asan.c and its dependencies.
	* common.opt: Add -faddress-sanitizer option.
	* invoke.texi: Document the new flag.
	* passes.c: Add the asan pass.
	* toplev.c (compile_file): Call asan_finish_file.
	* asan.c: New file.
	* asan.h: New file.
	* tree-pass.h: Declare pass_asan.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192360 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.asan  |  12 ++
 gcc/Makefile.in     |   5 +
 gcc/asan.c          | 404 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 gcc/asan.h          |  26 ++++
 gcc/common.opt      |   4 +
 gcc/doc/invoke.texi |  13 +-
 gcc/passes.c        |   1 +
 gcc/toplev.c        |   5 +
 gcc/tree-pass.h     |   1 +
 9 files changed, 468 insertions(+), 3 deletions(-)
 create mode 100644 gcc/ChangeLog.asan
 create mode 100644 gcc/asan.c
 create mode 100644 gcc/asan.h

diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan
new file mode 100644
index 0000000..704aa61
--- /dev/null
+++ b/gcc/ChangeLog.asan
@@ -0,0 +1,12 @@
+2012-10-10  Wei Mi <wmi@google.com>
+	    Diego Novillo <dnovillo@google.com>
+	    Dodji Seketeli <dodji@redhat.com>
+
+	* Makefile.in: Add asan.c and its dependencies.
+	* common.opt: Add -faddress-sanitizer option.
+	* invoke.texi: Document the new flag.
+	* passes.c: Add the asan pass.
+	* toplev.c (compile_file): Call asan_finish_file.
+	* asan.c: New file.
+	* asan.h: New file.
+	* tree-pass.h: Declare pass_asan.
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 9aea03d..3bade7f 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1348,6 +1348,7 @@ OBJS = \
 	tracer.o \
 	trans-mem.o \
 	tree-affine.o \
+	asan.o \
 	tree-call-cdce.o \
 	tree-cfg.o \
 	tree-cfgcleanup.o \
@@ -2206,6 +2207,10 @@ stor-layout.o : stor-layout.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
    $(TREE_H) $(PARAMS_H) $(FLAGS_H) $(FUNCTION_H) $(EXPR_H) $(RTL_H) \
    $(GGC_H) $(TM_P_H) $(TARGET_H) langhooks.h $(REGS_H) gt-stor-layout.h \
    $(DIAGNOSTIC_CORE_H) $(CGRAPH_H) $(TREE_INLINE_H) $(TREE_DUMP_H) $(GIMPLE_H)
+asan.o : asan.c asan.h $(CONFIG_H) pointer-set.h \
+   $(SYSTEM_H) $(TREE_H) $(GIMPLE_H) \
+   output.h $(DIAGNOSTIC_H) coretypes.h $(TREE_DUMP_H) $(FLAGS_H) \
+   tree-pretty-print.h
 tree-ssa-tail-merge.o: tree-ssa-tail-merge.c \
    $(SYSTEM_H) $(CONFIG_H) coretypes.h $(TM_H) $(BITMAP_H) \
    $(FLAGS_H) $(TM_P_H) $(BASIC_BLOCK_H) \
diff --git a/gcc/asan.c b/gcc/asan.c
new file mode 100644
index 0000000..4b07c96
--- /dev/null
+++ b/gcc/asan.c
@@ -0,0 +1,404 @@
+/* AddressSanitizer, a fast memory error detector.
+   Copyright (C) 2011 Free Software Foundation, Inc.
+   Contributed by Kostya Serebryany <kcc@google.com>
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "tree.h"
+#include "tm_p.h"
+#include "basic-block.h"
+#include "flags.h"
+#include "function.h"
+#include "tree-inline.h"
+#include "gimple.h"
+#include "tree-iterator.h"
+#include "tree-flow.h"
+#include "tree-dump.h"
+#include "tree-pass.h"
+#include "diagnostic.h"
+#include "demangle.h"
+#include "langhooks.h"
+#include "ggc.h"
+#include "cgraph.h"
+#include "gimple.h"
+#include "asan.h"
+#include "gimple-pretty-print.h"
+
+/*
+ AddressSanitizer finds out-of-bounds and use-after-free bugs 
+ with <2x slowdown on average.
+
+ The tool consists of two parts:
+ instrumentation module (this file) and a run-time library.
+ The instrumentation module adds a run-time check before every memory insn.
+   For an 8- or 16-byte load accessing address X:
+     ShadowAddr = (X >> 3) + Offset
+     ShadowValue = *(char*)ShadowAddr;  // *(short*) for 16-byte access.
+     if (ShadowValue)
+       __asan_report_load8(X);
+   For a load of N bytes (N=1, 2 or 4) from address X:
+     ShadowAddr = (X >> 3) + Offset
+     ShadowValue = *(char*)ShadowAddr;
+     if (ShadowValue)
+       if ((X & 7) + N - 1 >= ShadowValue)
+         __asan_report_loadN(X);
+ Stores are instrumented similarly, but using __asan_report_storeN functions.
+ A call to __asan_init() is inserted into the list of module CTORs.
+
+ The run-time library redefines malloc (so that redzones are inserted around
+ the allocated memory) and free (so that reuse of freed memory is delayed),
+ and provides the __asan_report* and __asan_init functions.
+
+ Read more:
+ http://code.google.com/p/address-sanitizer/wiki/AddressSanitizerAlgorithm
+
+ Future work:
+ The current implementation supports only detection of out-of-bounds and
+ use-after-free bugs in heap.
+ In order to support out-of-bounds for stack and globals we will need
+ to create redzones for stack and global object and poison them.
+*/
+
+/* The shadow address is computed as (X>>asan_scale) + (1<<asan_offset_log).
+ We may want to add command line flags to change these values.  */
+
+static const int asan_scale = 3;
+static const int asan_offset_log_32 = 29;
+static const int asan_offset_log_64 = 44;
+static int asan_offset_log;
+
+
+/* Construct a function tree for __asan_report_{load,store}{1,2,4,8,16}.
+   IS_STORE is either 1 (for a store) or 0 (for a load).
+   SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
+
+static tree
+report_error_func (int is_store, int size_in_bytes)
+{
+  tree fn_type;
+  tree def;
+  char name[100];
+
+  sprintf (name, "__asan_report_%s%d",
+           is_store ? "store" : "load", size_in_bytes);
+  fn_type = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE);
+  def = build_fn_decl (name, fn_type);
+  TREE_NOTHROW (def) = 1;
+  TREE_THIS_VOLATILE (def) = 1;  /* Attribute noreturn. Surprise!  */
+  DECL_ATTRIBUTES (def) = tree_cons (get_identifier ("leaf"), 
+                                     NULL, DECL_ATTRIBUTES (def));
+  DECL_ASSEMBLER_NAME (def);
+  return def;
+}
+
+/* Construct a function tree for __asan_init().  */
+
+static tree
+asan_init_func (void)
+{
+  tree fn_type;
+  tree def;
+
+  fn_type = build_function_type_list (void_type_node, NULL_TREE);
+  def = build_fn_decl ("__asan_init", fn_type);
+  TREE_NOTHROW (def) = 1;
+  DECL_ASSEMBLER_NAME (def);
+  return def;
+}
+
+
+/* Instrument the memory access instruction BASE.
+   Insert new statements before ITER.
+   LOCATION is source code location.
+   IS_STORE is either 1 (for a store) or 0 (for a load).
+   SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
+
+static void
+build_check_stmt (tree base,
+                  gimple_stmt_iterator *iter,
+                  location_t location, int is_store, int size_in_bytes)
+{
+  gimple_stmt_iterator gsi;
+  basic_block cond_bb, then_bb, join_bb;
+  edge e;
+  tree cond, t, u;
+  tree base_addr;
+  tree shadow_value;
+  gimple g;
+  gimple_seq seq, stmts;
+  tree shadow_type = size_in_bytes == 16 ?
+      short_integer_type_node : char_type_node;
+  tree shadow_ptr_type = build_pointer_type (shadow_type);
+  tree uintptr_type = lang_hooks.types.type_for_mode (ptr_mode,
+                                                      /*unsignedp=*/true);
+
+  /* We first need to split the current basic block, and start altering
+     the CFG.  This allows us to insert the statements we're about to
+     construct into the right basic blocks.  */
+
+  cond_bb = gimple_bb (gsi_stmt (*iter));
+  gsi = *iter;
+  gsi_prev (&gsi);
+  if (!gsi_end_p (gsi))
+    e = split_block (cond_bb, gsi_stmt (gsi));
+  else
+    e = split_block_after_labels (cond_bb);
+  cond_bb = e->src;
+  join_bb = e->dest;
+
+  /* A recap at this point: join_bb is the basic block at whose head
+     is the gimple statement for which this check expression is being
+     built.  cond_bb is the (possibly new, synthetic) basic block the
+     end of which will contain the shadow-memory check code, and a
+     conditional that jumps to the error-reporting code or, much more
+     likely, over to join_bb.  */
+
+  /* Create the bb that contains the crash block.  */
+  then_bb = create_empty_bb (cond_bb);
+  make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE);
+  make_single_succ_edge (then_bb, join_bb, EDGE_FALLTHRU);
+
+  /* Mark the pseudo-fallthrough edge from cond_bb to join_bb.  */
+  e = find_edge (cond_bb, join_bb);
+  e->flags = EDGE_FALSE_VALUE;
+  e->count = cond_bb->count;
+  e->probability = REG_BR_PROB_BASE;
+
+  /* Update dominance info.  Note that join_bb's data was
+     updated by split_block.  */
+  if (dom_info_available_p (CDI_DOMINATORS))
+    {
+      set_immediate_dominator (CDI_DOMINATORS, then_bb, cond_bb);
+      set_immediate_dominator (CDI_DOMINATORS, join_bb, cond_bb);
+    }
+
+  base_addr = create_tmp_reg (uintptr_type, "__asan_base_addr");
+
+  seq = NULL; 
+  t = fold_convert_loc (location, uintptr_type,
+                        unshare_expr (base));
+  t = force_gimple_operand (t, &stmts, false, NULL_TREE);
+  gimple_seq_add_seq (&seq, stmts);
+  g = gimple_build_assign (base_addr, t);
+  gimple_set_location (g, location);
+  gimple_seq_add_stmt (&seq, g);
+
+  /* Build (base_addr >> asan_scale) + (1 << asan_offset_log).  */
+
+  t = build2 (RSHIFT_EXPR, uintptr_type, base_addr,
+              build_int_cst (uintptr_type, asan_scale));
+  t = build2 (PLUS_EXPR, uintptr_type, t,
+              build2 (LSHIFT_EXPR, uintptr_type,
+                      build_int_cst (uintptr_type, 1),
+                      build_int_cst (uintptr_type, asan_offset_log)
+                     ));
+  t = build1 (INDIRECT_REF, shadow_type,
+              build1 (VIEW_CONVERT_EXPR, shadow_ptr_type, t));
+  t = force_gimple_operand (t, &stmts, false, NULL_TREE);
+  gimple_seq_add_seq (&seq, stmts);
+  shadow_value = create_tmp_reg (shadow_type, "__asan_shadow");
+  g = gimple_build_assign (shadow_value, t);
+  gimple_set_location (g, location);
+  gimple_seq_add_stmt (&seq, g);
+  t = build2 (NE_EXPR, boolean_type_node, shadow_value,
+              build_int_cst (shadow_type, 0));
+  if (size_in_bytes < 8)
+    {
+
+      /* Slow path for 1-, 2- and 4- byte accesses.
+         Build ((base_addr & 7) + (size_in_bytes - 1)) >= shadow_value.  */
+
+      u = build2 (BIT_AND_EXPR, uintptr_type,
+                  base_addr,
+                  build_int_cst (uintptr_type, 7));
+      u = build1 (CONVERT_EXPR, shadow_type, u);
+      u = build2 (PLUS_EXPR, shadow_type, u,
+                  build_int_cst (shadow_type, size_in_bytes - 1));
+      u = build2 (GE_EXPR, uintptr_type, u, shadow_value);
+    }
+  else
+      u = build_int_cst (boolean_type_node, 1);
+  t = build2 (TRUTH_AND_EXPR, boolean_type_node, t, u);
+  t = force_gimple_operand (t, &stmts, false, NULL_TREE);
+  gimple_seq_add_seq (&seq, stmts);
+  cond = create_tmp_reg (boolean_type_node, "__asan_crash_cond");
+  g = gimple_build_assign  (cond, t);
+  gimple_set_location (g, location);
+  gimple_seq_add_stmt (&seq, g);
+  g = gimple_build_cond (NE_EXPR, cond, boolean_false_node, NULL_TREE,
+                         NULL_TREE);
+  gimple_set_location (g, location);
+  gimple_seq_add_stmt (&seq, g);
+
+  /* Generate call to the run-time library (e.g. __asan_report_load8).  */
+
+  gsi = gsi_last_bb (cond_bb);
+  gsi_insert_seq_after (&gsi, seq, GSI_CONTINUE_LINKING);
+  seq = NULL; 
+  g = gimple_build_call (report_error_func (is_store, size_in_bytes),
+                         1, base_addr);
+  gimple_seq_add_stmt (&seq, g);
+
+  /* Insert the report call in the THEN block.  */
+
+  gsi = gsi_start_bb (then_bb);
+  gsi_insert_seq_after (&gsi, seq, GSI_CONTINUE_LINKING);
+
+  *iter = gsi_start_bb (join_bb);
+}
+
+/* If T represents a memory access, add instrumentation code before ITER.
+   LOCATION is source code location.
+   IS_STORE is either 1 (for a store) or 0 (for a load).  */
+
+static void
+instrument_derefs (gimple_stmt_iterator *iter, tree t,
+                  location_t location, int is_store)
+{
+  tree type, base;
+  int size_in_bytes;
+
+  type = TREE_TYPE (t);
+  if (type == error_mark_node)
+    return;
+  switch (TREE_CODE (t))
+    {
+    case ARRAY_REF:
+    case COMPONENT_REF:
+    case INDIRECT_REF:
+    case MEM_REF:
+      break;
+    default:
+      return;
+    }
+  size_in_bytes = tree_low_cst (TYPE_SIZE (type), 0) / BITS_PER_UNIT;
+  if (size_in_bytes != 1 && size_in_bytes != 2 &&
+      size_in_bytes != 4 && size_in_bytes != 8 && size_in_bytes != 16)
+      return;
+  {
+    /* For now just avoid instrumenting bit-field accesses.
+     Fixing it is doable, but expected to be messy.  */
+
+    HOST_WIDE_INT bitsize, bitpos;
+    tree offset;
+    enum machine_mode mode;
+    int volatilep = 0, unsignedp = 0;
+    get_inner_reference (t, &bitsize, &bitpos, &offset,
+                         &mode, &unsignedp, &volatilep, false);
+    if (bitpos != 0 || bitsize != size_in_bytes * BITS_PER_UNIT)
+        return;
+  }
+
+  base = build_addr (t, current_function_decl);
+  build_check_stmt (base, iter, location, is_store, size_in_bytes);
+}
+
+/* asan: this looks too complex. Can this be done simpler? */
+/* Transform
+   1) Memory references.
+   2) BUILTIN_ALLOCA calls.
+*/
+
+static void
+transform_statements (void)
+{
+  basic_block bb;
+  gimple_stmt_iterator i;
+  int saved_last_basic_block = last_basic_block;
+  enum gimple_rhs_class grhs_class;
+
+  FOR_EACH_BB (bb)
+    {
+      if (bb->index >= saved_last_basic_block) continue;
+      for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
+        {
+          gimple s = gsi_stmt (i);
+          if (gimple_code (s) != GIMPLE_ASSIGN)
+              continue;
+          instrument_derefs (&i, gimple_assign_lhs (s),
+                             gimple_location (s), 1);
+          instrument_derefs (&i, gimple_assign_rhs1 (s),
+                             gimple_location (s), 0);
+          grhs_class = get_gimple_rhs_class (gimple_assign_rhs_code (s));
+          if (grhs_class == GIMPLE_BINARY_RHS)
+            instrument_derefs (&i, gimple_assign_rhs2 (s),
+                               gimple_location (s), 0);
+        }
+    }
+}
+
+/* Module-level instrumentation.
+   - Insert __asan_init() into the list of CTORs.
+   - TODO: insert redzones around globals.
+ */
+
+void
+asan_finish_file (void)
+{
+  tree ctor_statements = NULL_TREE;
+  append_to_statement_list (build_call_expr (asan_init_func (), 0),
+                            &ctor_statements);
+  cgraph_build_static_cdtor ('I', ctor_statements,
+                             MAX_RESERVED_INIT_PRIORITY - 1);
+}
+
+/* Instrument the current function.  */
+
+static unsigned int
+asan_instrument (void)
+{
+  struct gimplify_ctx gctx;
+  tree uintptr_type = lang_hooks.types.type_for_mode (ptr_mode, true);
+  int is_64 = tree_low_cst (TYPE_SIZE (uintptr_type), 0) == 64;
+  asan_offset_log = is_64 ? asan_offset_log_64 : asan_offset_log_32;
+  push_gimplify_context (&gctx);
+  transform_statements ();
+  pop_gimplify_context (NULL);
+  return 0;
+}
+
+static bool
+gate_asan (void)
+{
+  return flag_asan != 0;
+}
+
+struct gimple_opt_pass pass_asan =
+{
+ {
+  GIMPLE_PASS,
+  "asan",                               /* name  */
+  OPTGROUP_NONE,                        /* optinfo_flags */
+  gate_asan,                            /* gate  */
+  asan_instrument,                      /* execute  */
+  NULL,                                 /* sub  */
+  NULL,                                 /* next  */
+  0,                                    /* static_pass_number  */
+  TV_NONE,                              /* tv_id  */
+  PROP_ssa | PROP_cfg | PROP_gimple_leh,/* properties_required  */
+  0,                                    /* properties_provided  */
+  0,                                    /* properties_destroyed  */
+  0,                                    /* todo_flags_start  */
+  TODO_verify_flow | TODO_verify_stmts
+  | TODO_update_ssa    /* todo_flags_finish  */
+ }
+};
diff --git a/gcc/asan.h b/gcc/asan.h
new file mode 100644
index 0000000..590cf35
--- /dev/null
+++ b/gcc/asan.h
@@ -0,0 +1,26 @@
+/* AddressSanitizer, a fast memory error detector.
+   Copyright (C) 2011 Free Software Foundation, Inc.
+   Contributed by Kostya Serebryany <kcc@google.com>
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef TREE_ASAN
+#define TREE_ASAN
+
+extern void asan_finish_file (void);
+
+#endif /* TREE_ASAN */
diff --git a/gcc/common.opt b/gcc/common.opt
index 5b69aff..789f61c 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -849,6 +849,10 @@ fargument-noalias-anything
 Common Ignore
 Does nothing. Preserved for backward compatibility.
 
+faddress-sanitizer
+Common RejectNegative Report Var(flag_asan)
+Enable AddressSanitizer, a memory error detector
+
 fasynchronous-unwind-tables
 Common Report Var(flag_asynchronous_unwind_tables) Optimization
 Generate unwind tables that are exact at each instruction boundary
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index ff0c87d..16ab69a 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -353,9 +353,10 @@ Objective-C and Objective-C++ Dialects}.
 @item Optimization Options
 @xref{Optimize Options,,Options that Control Optimization}.
 @gccoptlist{-falign-functions[=@var{n}] -falign-jumps[=@var{n}] @gol
--falign-labels[=@var{n}] -falign-loops[=@var{n}] -fassociative-math @gol
--fauto-inc-dec -fbranch-probabilities -fbranch-target-load-optimize @gol
--fbranch-target-load-optimize2 -fbtr-bb-exclusive -fcaller-saves @gol
+-falign-labels[=@var{n}] -falign-loops[=@var{n}] -faddress-sanitizer @gol
+-fassociative-math -fauto-inc-dec -fbranch-probabilities @gol
+-fbranch-target-load-optimize -fbranch-target-load-optimize2 @gol
+-fbtr-bb-exclusive -fcaller-saves @gol
 -fcheck-data-deps -fcombine-stack-adjustments -fconserve-stack @gol
 -fcompare-elim -fcprop-registers -fcrossjumping @gol
 -fcse-follow-jumps -fcse-skip-blocks -fcx-fortran-rules @gol
@@ -6822,6 +6823,12 @@ assumptions based on that.
 
 The default is @option{-fzero-initialized-in-bss}.
 
+@item -faddress-sanitizer
+Enable AddressSanitizer, a fast memory error detector.
+Memory access instructions will be instrumented to detect
+out-of-bounds and use-after-free bugs. So far only heap bugs will be detected.
+See @uref{http://code.google.com/p/address-sanitizer/} for more details.
+
 @item -fmudflap -fmudflapth -fmudflapir
 @opindex fmudflap
 @opindex fmudflapth
diff --git a/gcc/passes.c b/gcc/passes.c
index 67aae52..66a2f74 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -1456,6 +1456,7 @@ init_optimization_passes (void)
       NEXT_PASS (pass_split_crit_edges);
       NEXT_PASS (pass_pre);
       NEXT_PASS (pass_sink_code);
+      NEXT_PASS (pass_asan);
       NEXT_PASS (pass_tree_loop);
 	{
 	  struct opt_pass **p = &pass_tree_loop.pass.sub;
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 5cbb364..3ca0736 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -72,6 +72,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "value-prof.h"
 #include "alloc-pool.h"
 #include "tree-mudflap.h"
+#include "asan.h"
 #include "gimple.h"
 #include "tree-ssa-alias.h"
 #include "plugin.h"
@@ -570,6 +571,10 @@ compile_file (void)
       if (flag_mudflap)
 	mudflap_finish_file ();
 
+      /* File-scope initialization for AddressSanitizer.  */
+      if (flag_asan)
+        asan_finish_file ();
+
       output_shared_constant_pool ();
       output_object_blocks ();
       finish_tm_clone_pairs ();
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 8ed2d98..73c5886 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -259,6 +259,7 @@ struct register_pass_info
 
 extern struct gimple_opt_pass pass_mudflap_1;
 extern struct gimple_opt_pass pass_mudflap_2;
+extern struct gimple_opt_pass pass_asan;
 extern struct gimple_opt_pass pass_lower_cf;
 extern struct gimple_opt_pass pass_refactor_eh;
 extern struct gimple_opt_pass pass_lower_eh;
-- 
		Dodji


* [PATCH 02/10] Initial asan cleanups
  2012-11-02 22:53 ` [PATCH 00/13] Request to merge Address Sanitizer in Dodji Seketeli
  2012-11-02 22:56   ` [PATCH 01/10] Initial import of asan from the Google branch into trunk Dodji Seketeli
@ 2012-11-02 22:57   ` Dodji Seketeli
  2012-11-06 17:04     ` Diego Novillo
  2012-11-02 22:58   ` [PATCH 03/10] Emit GIMPLE directly instead of gimplifying GENERIC Dodji Seketeli
                     ` (9 subsequent siblings)
  11 siblings, 1 reply; 80+ messages in thread
From: Dodji Seketeli @ 2012-11-02 22:57 UTC (permalink / raw)
  To: gcc-patches; +Cc: dnovillo, jakub, wmi, davidxl, konstantin.s.serebryany

This patch defines a new asan_shadow_offset target hook, replacing the
constants that were hard-coded in asan.c.  This makes it cleaner to
define the shadow offset for each target that supports asan, which is
only x86 for now.  ASAN_SHADOW_SHIFT (which, along with the value
returned by asan_shadow_offset, is used to compute the address of the
shadow memory byte for a given memory address) is defined in asan.h.
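
As an illustration only (not part of the patch), the address computation
described above can be modelled in plain C.  The model_* names are
hypothetical stand-ins; the shift value and the two x86 offsets are the
ones this patch puts in asan.h and ix86_asan_shadow_offset, and the
addition matches the PLUS_EXPR that build_check_stmt builds.

```c
#include <stdint.h>

/* Same value as the ASAN_SHADOW_SHIFT macro this patch adds to asan.h.  */
#define ASAN_SHADOW_SHIFT 3

/* Hypothetical stand-in for targetm.asan_shadow_offset () on x86:
   1 << 44 for LP64 and 1 << 29 otherwise, as in ix86_asan_shadow_offset.  */
static uint64_t
model_shadow_offset (int lp64)
{
  return (uint64_t) 1 << (lp64 ? 44 : 29);
}

/* Address of the shadow byte that describes application address ADDR.  */
static uint64_t
model_shadow_address (uint64_t addr, int lp64)
{
  return (addr >> ASAN_SHADOW_SHIFT) + model_shadow_offset (lp64);
}
```

With a shift of 3, every 8 bytes of application memory map to one shadow
byte, so addresses 0x1000 through 0x1007 all share the same shadow
address in this model.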

	* toplev.c (process_options): Warn and turn off
	-faddress-sanitizer if not supported by target.
	* asan.c: Include target.h.
	(asan_scale, asan_offset_log_32, asan_offset_log_64,
	asan_offset_log): Removed.
	(build_check_stmt): Use ASAN_SHADOW_SHIFT and
	targetm.asan_shadow_offset ().
	(asan_instrument): Don't initialize asan_offset_log.
	* asan.h (ASAN_SHADOW_SHIFT): Define.
	* target.def (TARGET_ASAN_SHADOW_OFFSET): New hook.
	* doc/tm.texi.in (TARGET_ASAN_SHADOW_OFFSET): Add it.
	* doc/tm.texi: Regenerated.
	* Makefile.in (asan.o): Depend on $(TARGET_H).
	* config/i386/i386.c (ix86_asan_shadow_offset): New function.
	(TARGET_ASAN_SHADOW_OFFSET): Define.
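
For readers unfamiliar with target hooks, here is a minimal sketch of the
"hook left at NULL means unsupported" convention that the toplev.c change
relies on.  The struct and the model_* functions are hypothetical and
much simpler than the real targetm vector; only the NULL test mirrors
the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down model of the target hook vector: a hook left at
   NULL means the target has not been ported to Address Sanitizer.  */
struct model_target
{
  uint64_t (*asan_shadow_offset) (void);
};

/* Example hook implementation, using the LP64 x86 offset.  */
static uint64_t
model_x86_64_shadow_offset (void)
{
  return (uint64_t) 1 << 44;
}

/* Mirrors the check added to process_options: returns the effective value
   of the flag, clearing it when the hook is missing.  *WARNED is set to 1
   where the real code would issue a warning.  */
static int
model_process_asan_flag (const struct model_target *t, int flag_asan,
			 int *warned)
{
  *warned = 0;
  if (flag_asan && t->asan_shadow_offset == NULL)
    {
      *warned = 1;
      return 0;
    }
  return flag_asan;
}
```

Because the test is against the hook pointer and not its return value, a
target can legitimately return any offset, including zero.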

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192372 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.asan     | 18 ++++++++++++++++++
 gcc/Makefile.in        |  2 +-
 gcc/asan.c             | 25 ++++++-------------------
 gcc/asan.h             |  6 +++++-
 gcc/config/i386/i386.c | 11 +++++++++++
 gcc/doc/tm.texi        |  6 ++++++
 gcc/doc/tm.texi.in     |  2 ++
 gcc/target.def         | 11 +++++++++++
 gcc/toplev.c           |  7 +++++++
 9 files changed, 67 insertions(+), 21 deletions(-)

diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan
index 704aa61..d13a584 100644
--- a/gcc/ChangeLog.asan
+++ b/gcc/ChangeLog.asan
@@ -1,3 +1,21 @@
+2012-10-11  Jakub Jelinek  <jakub@redhat.com>
+
+	* toplev.c (process_options): Warn and turn off
+	-faddress-sanitizer if not supported by target.
+	* asan.c: Include target.h.
+	(asan_scale, asan_offset_log_32, asan_offset_log_64,
+	asan_offset_log): Removed.
+	(build_check_stmt): Use ASAN_SHADOW_SHIFT and
+	targetm.asan_shadow_offset ().
+	(asan_instrument): Don't initialize asan_offset_log.
+	* asan.h (ASAN_SHADOW_SHIFT): Define.
+	* target.def (TARGET_ASAN_SHADOW_OFFSET): New hook.
+	* doc/tm.texi.in (TARGET_ASAN_SHADOW_OFFSET): Add it.
+	* doc/tm.texi: Regenerated.
+	* Makefile.in (asan.o): Depend on $(TARGET_H).
+	* config/i386/i386.c (ix86_asan_shadow_offset): New function.
+	(TARGET_ASAN_SHADOW_OFFSET): Define.
+
 2012-10-10  Wei Mi <wmi@google.com>
 	    Diego Novillo <dnovillo@google.com>
 	    Dodji Seketeli <dodji@redhat.com>
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 3bade7f..c86bbeb 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2210,7 +2210,7 @@ stor-layout.o : stor-layout.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
 asan.o : asan.c asan.h $(CONFIG_H) pointer-set.h \
    $(SYSTEM_H) $(TREE_H) $(GIMPLE_H) \
    output.h $(DIAGNOSTIC_H) coretypes.h $(TREE_DUMP_H) $(FLAGS_H) \
-   tree-pretty-print.h
+   tree-pretty-print.h $(TARGET_H)
 tree-ssa-tail-merge.o: tree-ssa-tail-merge.c \
    $(SYSTEM_H) $(CONFIG_H) coretypes.h $(TM_H) $(BITMAP_H) \
    $(FLAGS_H) $(TM_P_H) $(BASIC_BLOCK_H) \
diff --git a/gcc/asan.c b/gcc/asan.c
index 4b07c96..6cd9625 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -1,5 +1,5 @@
 /* AddressSanitizer, a fast memory error detector.
-   Copyright (C) 2011 Free Software Foundation, Inc.
+   Copyright (C) 2011, 2012 Free Software Foundation, Inc.
    Contributed by Kostya Serebryany <kcc@google.com>
 
 This file is part of GCC.
@@ -42,6 +42,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple.h"
 #include "asan.h"
 #include "gimple-pretty-print.h"
+#include "target.h"
 
 /*
  AddressSanitizer finds out-of-bounds and use-after-free bugs 
@@ -78,15 +79,6 @@ along with GCC; see the file COPYING3.  If not see
  to create redzones for stack and global object and poison them.
 */
 
-/* The shadow address is computed as (X>>asan_scale) + (1<<asan_offset_log).
- We may want to add command line flags to change these values.  */
-
-static const int asan_scale = 3;
-static const int asan_offset_log_32 = 29;
-static const int asan_offset_log_64 = 44;
-static int asan_offset_log;
-
-
 /* Construct a function tree for __asan_report_{load,store}{1,2,4,8,16}.
    IS_STORE is either 1 (for a store) or 0 (for a load).
    SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
@@ -202,15 +194,13 @@ build_check_stmt (tree base,
   gimple_set_location (g, location);
   gimple_seq_add_stmt (&seq, g);
 
-  /* Build (base_addr >> asan_scale) + (1 << asan_offset_log).  */
+  /* Build
+     (base_addr >> ASAN_SHADOW_SHIFT) | targetm.asan_shadow_offset ().  */
 
   t = build2 (RSHIFT_EXPR, uintptr_type, base_addr,
-              build_int_cst (uintptr_type, asan_scale));
+	      build_int_cst (uintptr_type, ASAN_SHADOW_SHIFT));
   t = build2 (PLUS_EXPR, uintptr_type, t,
-              build2 (LSHIFT_EXPR, uintptr_type,
-                      build_int_cst (uintptr_type, 1),
-                      build_int_cst (uintptr_type, asan_offset_log)
-                     ));
+	      build_int_cst (uintptr_type, targetm.asan_shadow_offset ()));
   t = build1 (INDIRECT_REF, shadow_type,
               build1 (VIEW_CONVERT_EXPR, shadow_ptr_type, t));
   t = force_gimple_operand (t, &stmts, false, NULL_TREE);
@@ -367,9 +357,6 @@ static unsigned int
 asan_instrument (void)
 {
   struct gimplify_ctx gctx;
-  tree uintptr_type = lang_hooks.types.type_for_mode (ptr_mode, true);
-  int is_64 = tree_low_cst (TYPE_SIZE (uintptr_type), 0) == 64;
-  asan_offset_log = is_64 ? asan_offset_log_64 : asan_offset_log_32;
   push_gimplify_context (&gctx);
   transform_statements ();
   pop_gimplify_context (NULL);
diff --git a/gcc/asan.h b/gcc/asan.h
index 590cf35..699820b 100644
--- a/gcc/asan.h
+++ b/gcc/asan.h
@@ -1,5 +1,5 @@
 /* AddressSanitizer, a fast memory error detector.
-   Copyright (C) 2011 Free Software Foundation, Inc.
+   Copyright (C) 2011, 2012 Free Software Foundation, Inc.
    Contributed by Kostya Serebryany <kcc@google.com>
 
 This file is part of GCC.
@@ -23,4 +23,8 @@ along with GCC; see the file COPYING3.  If not see
 
 extern void asan_finish_file(void);
 
+/* Shadow memory is found at
+   (address >> ASAN_SHADOW_SHIFT) | targetm.asan_shadow_offset ().  */
+#define ASAN_SHADOW_SHIFT	3
+
 #endif /* TREE_ASAN */
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index fae300e..881c955 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5647,6 +5647,14 @@ ix86_legitimate_combined_insn (rtx insn)
   return true;
 }
 \f
+/* Implement the TARGET_ASAN_SHADOW_OFFSET hook.  */
+
+static unsigned HOST_WIDE_INT
+ix86_asan_shadow_offset (void)
+{
+  return (unsigned HOST_WIDE_INT) 1 << (TARGET_LP64 ? 44 : 29);
+}
+\f
 /* Argument support functions.  */
 
 /* Return true when register may be used to pass function parameters.  */
@@ -41379,6 +41387,9 @@ ix86_memmodel_check (unsigned HOST_WIDE_INT val)
 #undef TARGET_LEGITIMATE_COMBINED_INSN
 #define TARGET_LEGITIMATE_COMBINED_INSN ix86_legitimate_combined_insn
 
+#undef TARGET_ASAN_SHADOW_OFFSET
+#define TARGET_ASAN_SHADOW_OFFSET ix86_asan_shadow_offset
+
 #undef TARGET_GIMPLIFY_VA_ARG_EXPR
 #define TARGET_GIMPLIFY_VA_ARG_EXPR ix86_gimplify_va_arg
 
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 665c5b1..908ddbf 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -11326,6 +11326,12 @@ MIPS, where add-immediate takes a 16-bit signed value,
 is zero, which disables this optimization.
 @end deftypevr
 
+@deftypefn {Target Hook} {unsigned HOST_WIDE_INT} TARGET_ASAN_SHADOW_OFFSET (void)
+Return the offset bitwise ored into shifted address to get corresponding
+Address Sanitizer shadow memory address.  NULL if Address Sanitizer is not
+supported by the target.
+@end deftypefn
+
 @deftypefn {Target Hook} {unsigned HOST_WIDE_INT} TARGET_MEMMODEL_CHECK (unsigned HOST_WIDE_INT @var{val})
 Validate target specific memory model mask bits. When NULL no target specific
 memory model bits are allowed.
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 289934b..0786691 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -11168,6 +11168,8 @@ MIPS, where add-immediate takes a 16-bit signed value,
 is zero, which disables this optimization.
 @end deftypevr
 
+@hook TARGET_ASAN_SHADOW_OFFSET
+
 @hook TARGET_MEMMODEL_CHECK
 Validate target specific memory model mask bits. When NULL no target specific
 memory model bits are allowed.
diff --git a/gcc/target.def b/gcc/target.def
index 5865224..f8781a8 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -2025,6 +2025,17 @@ DEFHOOK
  "",
  unsigned HOST_WIDE_INT, (unsigned HOST_WIDE_INT val), NULL)
 
+/* Defines an offset bitwise ored into shifted address to get corresponding
+   Address Sanitizer shadow address, or NULL if Address Sanitizer is not
+   supported by the target.  */
+DEFHOOK
+(asan_shadow_offset,
+ "Return the offset bitwise ored into shifted address to get corresponding\n\
+Address Sanitizer shadow memory address.  NULL if Address Sanitizer is not\n\
+supported by the target.",
+ unsigned HOST_WIDE_INT, (void),
+ NULL)
+
 /* Functions relating to calls - argument passing, returns, etc.  */
 /* Members of struct call have no special macro prefix.  */
 HOOK_VECTOR (TARGET_CALLS, calls)
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 3ca0736..68849f5 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1541,6 +1541,13 @@ process_options (void)
       flag_omit_frame_pointer = 0;
     }
 
+  /* Address Sanitizer needs porting to each target architecture.  */
+  if (flag_asan && targetm.asan_shadow_offset == NULL)
+    {
+      warning (0, "-faddress-sanitizer not supported for this target");
+      flag_asan = 0;
+    }
+
   /* Enable -Werror=coverage-mismatch when -Werror and -Wno-error
      have not been set.  */
   if (!global_options_set.x_warnings_are_errors
-- 
		Dodji


* [PATCH 03/10] Emit GIMPLE directly instead of gimplifying GENERIC.
  2012-11-02 22:53 ` [PATCH 00/13] Request to merge Address Sanitizer in Dodji Seketeli
  2012-11-02 22:56   ` [PATCH 01/10] Initial import of asan from the Google branch into trunk Dodji Seketeli
  2012-11-02 22:57   ` [PATCH 02/10] Initial asan cleanups Dodji Seketeli
@ 2012-11-02 22:58   ` Dodji Seketeli
  2012-11-06 17:08     ` Diego Novillo
  2012-11-02 22:59   ` [PATCH 04/10] Allow asan at -O0 Dodji Seketeli
                     ` (8 subsequent siblings)
  11 siblings, 1 reply; 80+ messages in thread
From: Dodji Seketeli @ 2012-11-02 22:58 UTC (permalink / raw)
  To: gcc-patches; +Cc: dnovillo, jakub, wmi, davidxl, konstantin.s.serebryany

This patch cleans up the instrumentation code generation by emitting
GIMPLE directly, as opposed to emitting GENERIC trees and then
gimplifying them.  It also does some cleanups here and there.
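
To make the emitted statements easier to follow, here is a plain C model
of the condition that the new build_check_stmt computes before branching
to the error-reporting call.  It is a sketch, not the pass itself:
model_access_is_bad is a hypothetical name, and SHADOW is assumed to be
the value already loaded from shadow memory for the access.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of the condition build_check_stmt emits.  BASE_ADDR is the
   address being accessed, SIZE_IN_BYTES is 1, 2, 4, 8 or 16, and SHADOW
   is the value read from the corresponding shadow memory location.  */
static bool
model_access_is_bad (uint64_t base_addr, int size_in_bytes, unsigned shadow)
{
  if (size_in_bytes < 8)
    /* Slow path for 1, 2 and 4 byte accesses:
       (shadow != 0)
       & (((base_addr & 7) + (size_in_bytes - 1)) >= shadow).  */
    return (shadow != 0)
	   & (((base_addr & 7) + (unsigned) (size_in_bytes - 1)) >= shadow);

  /* 8 and 16 byte accesses: any nonzero shadow value is an error.  */
  return shadow != 0;
}
```

For example, with a shadow value of 4, a 4-byte access at offset 0 of the
8-byte granule passes the check, while the same access at offset 2 fails
it.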

	* Makefile.in (GTFILES): Add $(srcdir)/asan.c.
	(asan.o): Update the dependencies of asan.o.
	* asan.c (tm.h, tree.h, tm_p.h, basic-block.h, flags.h,
	function.h, tree-inline.h, tree-dump.h, diagnostic.h, demangle.h,
	langhooks.h, ggc.h, cgraph.h, gimple.h): Remove these unused but
	included headers.
	(shadow_ptr_types): New variable.
	(report_error_func): Change is_store argument to bool, don't append
	newline to function name.
	(PROB_VERY_UNLIKELY, PROB_ALWAYS): Define.
	(build_check_stmt): Change is_store argument to bool.  Emit GIMPLE
	directly instead of creating trees and gimplifying them.  Mark
	the error reporting function as very unlikely.
	(instrument_derefs): Change is_store argument to bool.  Use
	int_size_in_bytes to compute size_in_bytes, simplify size check.
	Use build_fold_addr_expr instead of build_addr.
	(transform_statements): Adjust instrument_derefs caller.
	Use gimple_assign_single_p as stmt test.  Don't look at MEM refs
	in rhs2.
	(asan_init_shadow_ptr_types): New function.
	(asan_instrument): Don't push/pop gimplify context.
	Call asan_init_shadow_ptr_types if not yet initialized.
	* asan.h (ASAN_SHADOW_SHIFT): Adjust comment.
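
The simplified size check in instrument_derefs is compact enough to be
worth spelling out.  The sketch below reproduces the same expression in
standalone C, with long long standing in for HOST_WIDE_INT (an
assumption made only for this illustration); model_size_is_instrumented
is a hypothetical name.

```c
#include <stdbool.h>

/* Return true for the access sizes the pass instruments.  The first
   test rejects anything that is not a power of two; the second rejects
   zero and everything above 16 by relying on unsigned wrap-around.  */
static bool
model_size_is_instrumented (long long size_in_bytes)
{
  if ((size_in_bytes & (size_in_bytes - 1)) != 0
      || (unsigned long long) size_in_bytes - 1 >= 16)
    return false;
  return true;
}
```

Only 1, 2, 4, 8 and 16 survive both tests, which is the same set the old
chain of != comparisons accepted.  A negative size is also rejected by
the first test.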
---
 gcc/ChangeLog.asan |  27 +++++
 gcc/Makefile.in    |   9 +-
 gcc/asan.c         | 284 +++++++++++++++++++++++++++++++----------------------
 gcc/asan.h         |   2 +-
 4 files changed, 199 insertions(+), 123 deletions(-)

diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan
index d13a584..973ee6b 100644
--- a/gcc/ChangeLog.asan
+++ b/gcc/ChangeLog.asan
@@ -1,4 +1,31 @@
 2012-10-11  Jakub Jelinek  <jakub@redhat.com>
+	    Dodji Seketeli <dodji@redhat.com>
+
+	* Makefile.in (GTFILES): Add $(srcdir)/asan.c.
+	(asan.o): Update the dependencies of asan.o.
+	* asan.c (tm.h, tree.h, tm_p.h, basic-block.h, flags.h,
+	function.h, tree-inline.h, tree-dump.h, diagnostic.h, demangle.h,
+	langhooks.h, ggc.h, cgraph.h, gimple.h): Remove these unused but
+	included headers.
+	(shadow_ptr_types): New variable.
+	(report_error_func): Change is_store argument to bool, don't append
+	newline to function name.
+	(PROB_VERY_UNLIKELY, PROB_ALWAYS): Define.
+	(build_check_stmt): Change is_store argument to bool.  Emit GIMPLE
+	directly instead of creating trees and gimplifying them.  Mark
+	the error reporting function as very unlikely.
+	(instrument_derefs): Change is_store argument to bool.  Use
+	int_size_in_bytes to compute size_in_bytes, simplify size check.
+	Use build_fold_addr_expr instead of build_addr.
+	(transform_statements): Adjust instrument_derefs caller.
+	Use gimple_assign_single_p as stmt test.  Don't look at MEM refs
+	in rhs2.
+	(asan_init_shadow_ptr_types): New function.
+	(asan_instrument): Don't push/pop gimplify context.
+	Call asan_init_shadow_ptr_types if not yet initialized.
+	* asan.h (ASAN_SHADOW_SHIFT): Adjust comment.
+
+2012-10-11  Jakub Jelinek  <jakub@redhat.com>
 
 	* toplev.c (process_options): Warn and turn off
 	-faddress-sanitizer if not supported by target.
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index c86bbeb..1536800 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2207,10 +2207,10 @@ stor-layout.o : stor-layout.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
    $(TREE_H) $(PARAMS_H) $(FLAGS_H) $(FUNCTION_H) $(EXPR_H) $(RTL_H) \
    $(GGC_H) $(TM_P_H) $(TARGET_H) langhooks.h $(REGS_H) gt-stor-layout.h \
    $(DIAGNOSTIC_CORE_H) $(CGRAPH_H) $(TREE_INLINE_H) $(TREE_DUMP_H) $(GIMPLE_H)
-asan.o : asan.c asan.h $(CONFIG_H) pointer-set.h \
-   $(SYSTEM_H) $(TREE_H) $(GIMPLE_H) \
-   output.h $(DIAGNOSTIC_H) coretypes.h $(TREE_DUMP_H) $(FLAGS_H) \
-   tree-pretty-print.h $(TARGET_H)
+asan.o : asan.c asan.h $(CONFIG_H) $(SYSTEM_H) $(GIMPLE_H) \
+   output.h coretypes.h $(GIMPLE_PRETTY_PRINT_H) \
+   tree-iterator.h $(TREE_FLOW_H) $(TREE_PASS_H) \
+   $(TARGET_H)
 tree-ssa-tail-merge.o: tree-ssa-tail-merge.c \
    $(SYSTEM_H) $(CONFIG_H) coretypes.h $(TM_H) $(BITMAP_H) \
    $(FLAGS_H) $(TM_P_H) $(BASIC_BLOCK_H) \
@@ -3723,6 +3723,7 @@ GTFILES = $(CPP_ID_DATA_H) $(srcdir)/input.h $(srcdir)/coretypes.h \
   $(srcdir)/lto-streamer.h \
   $(srcdir)/target-globals.h \
   $(srcdir)/ipa-inline.h \
+  $(srcdir)/asan.c \
   @all_gtfiles@
 
 # Compute the list of GT header files from the corresponding C sources,
diff --git a/gcc/asan.c b/gcc/asan.c
index 6cd9625..baaec0f 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -22,24 +22,10 @@ along with GCC; see the file COPYING3.  If not see
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
-#include "tm.h"
-#include "tree.h"
-#include "tm_p.h"
-#include "basic-block.h"
-#include "flags.h"
-#include "function.h"
-#include "tree-inline.h"
 #include "gimple.h"
 #include "tree-iterator.h"
 #include "tree-flow.h"
-#include "tree-dump.h"
 #include "tree-pass.h"
-#include "diagnostic.h"
-#include "demangle.h"
-#include "langhooks.h"
-#include "ggc.h"
-#include "cgraph.h"
-#include "gimple.h"
 #include "asan.h"
 #include "gimple-pretty-print.h"
 #include "target.h"
@@ -79,18 +65,22 @@ along with GCC; see the file COPYING3.  If not see
  to create redzones for stack and global object and poison them.
 */
 
+/* Pointer types to 1 resp. 2 byte integers in shadow memory.  A separate
+   alias set is used for all shadow memory accesses.  */
+static GTY(()) tree shadow_ptr_types[2];
+
 /* Construct a function tree for __asan_report_{load,store}{1,2,4,8,16}.
    IS_STORE is either 1 (for a store) or 0 (for a load).
    SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
 
 static tree
-report_error_func (int is_store, int size_in_bytes)
+report_error_func (bool is_store, int size_in_bytes)
 {
   tree fn_type;
   tree def;
   char name[100];
 
-  sprintf (name, "__asan_report_%s%d\n",
+  sprintf (name, "__asan_report_%s%d",
            is_store ? "store" : "load", size_in_bytes);
   fn_type = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE);
   def = build_fn_decl (name, fn_type);
@@ -118,6 +108,9 @@ asan_init_func (void)
 }
 
 
+#define PROB_VERY_UNLIKELY	(REG_BR_PROB_BASE / 2000 - 1)
+#define PROB_ALWAYS		(REG_BR_PROB_BASE)
+
 /* Instrument the memory access instruction BASE.
    Insert new statements before ITER.
    LOCATION is source code location.
@@ -127,21 +120,17 @@ asan_init_func (void)
 static void
 build_check_stmt (tree base,
                   gimple_stmt_iterator *iter,
-                  location_t location, int is_store, int size_in_bytes)
+                  location_t location, bool is_store, int size_in_bytes)
 {
   gimple_stmt_iterator gsi;
   basic_block cond_bb, then_bb, join_bb;
   edge e;
-  tree cond, t, u;
-  tree base_addr;
-  tree shadow_value;
+  tree t, base_addr, shadow;
   gimple g;
-  gimple_seq seq, stmts;
-  tree shadow_type = size_in_bytes == 16 ?
-      short_integer_type_node : char_type_node;
-  tree shadow_ptr_type = build_pointer_type (shadow_type);
-  tree uintptr_type = lang_hooks.types.type_for_mode (ptr_mode,
-                                                      /*unsignedp=*/true);
+  tree shadow_ptr_type = shadow_ptr_types[size_in_bytes == 16 ? 1 : 0];
+  tree shadow_type = TREE_TYPE (shadow_ptr_type);
+  tree uintptr_type
+    = build_nonstandard_integer_type (TYPE_PRECISION (TREE_TYPE (base)), 1);
 
   /* We first need to split the current basic block, and start altering
      the CFG.  This allows us to insert the statements we're about to
@@ -166,14 +155,15 @@ build_check_stmt (tree base,
 
   /* Create the bb that contains the crash block.  */
   then_bb = create_empty_bb (cond_bb);
-  make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE);
+  e = make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE);
+  e->probability = PROB_VERY_UNLIKELY;
   make_single_succ_edge (then_bb, join_bb, EDGE_FALLTHRU);
 
   /* Mark the pseudo-fallthrough edge from cond_bb to join_bb.  */
   e = find_edge (cond_bb, join_bb);
   e->flags = EDGE_FALSE_VALUE;
   e->count = cond_bb->count;
-  e->probability = REG_BR_PROB_BASE;
+  e->probability = PROB_ALWAYS - PROB_VERY_UNLIKELY;
 
   /* Update dominance info.  Note that bb_join's data was
      updated by split_block.  */
@@ -183,75 +173,125 @@ build_check_stmt (tree base,
       set_immediate_dominator (CDI_DOMINATORS, join_bb, cond_bb);
     }
 
-  base_addr = create_tmp_reg (uintptr_type, "__asan_base_addr");
+  base = unshare_expr (base);
 
-  seq = NULL; 
-  t = fold_convert_loc (location, uintptr_type,
-                        unshare_expr (base));
-  t = force_gimple_operand (t, &stmts, false, NULL_TREE);
-  gimple_seq_add_seq (&seq, stmts);
-  g = gimple_build_assign (base_addr, t);
+  gsi = gsi_last_bb (cond_bb);
+  g = gimple_build_assign_with_ops (TREE_CODE (base),
+				    make_ssa_name (TREE_TYPE (base), NULL),
+				    base, NULL_TREE);
   gimple_set_location (g, location);
-  gimple_seq_add_stmt (&seq, g);
+  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
 
-  /* Build
-     (base_addr >> ASAN_SHADOW_SHIFT) | targetm.asan_shadow_offset ().  */
-
-  t = build2 (RSHIFT_EXPR, uintptr_type, base_addr,
-	      build_int_cst (uintptr_type, ASAN_SHADOW_SHIFT));
-  t = build2 (PLUS_EXPR, uintptr_type, t,
-	      build_int_cst (uintptr_type, targetm.asan_shadow_offset ()));
-  t = build1 (INDIRECT_REF, shadow_type,
-              build1 (VIEW_CONVERT_EXPR, shadow_ptr_type, t));
-  t = force_gimple_operand (t, &stmts, false, NULL_TREE);
-  gimple_seq_add_seq (&seq, stmts);
-  shadow_value = create_tmp_reg (shadow_type, "__asan_shadow");
-  g = gimple_build_assign (shadow_value, t);
+  g = gimple_build_assign_with_ops (NOP_EXPR,
+				    make_ssa_name (uintptr_type, NULL),
+				    gimple_assign_lhs (g), NULL_TREE);
   gimple_set_location (g, location);
-  gimple_seq_add_stmt (&seq, g);
-  t = build2 (NE_EXPR, boolean_type_node, shadow_value,
-              build_int_cst (shadow_type, 0));
-  if (size_in_bytes < 8)
-    {
+  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+  base_addr = gimple_assign_lhs (g);
 
-      /* Slow path for 1-, 2- and 4- byte accesses.
-         Build ((base_addr & 7) + (size_in_bytes - 1)) >= shadow_value.  */
+  /* Build
+     (base_addr >> ASAN_SHADOW_SHIFT) + targetm.asan_shadow_offset ().  */
 
-      u = build2 (BIT_AND_EXPR, uintptr_type,
-                  base_addr,
-                  build_int_cst (uintptr_type, 7));
-      u = build1 (CONVERT_EXPR, shadow_type, u);
-      u = build2 (PLUS_EXPR, shadow_type, u,
-                  build_int_cst (shadow_type, size_in_bytes - 1));
-      u = build2 (GE_EXPR, uintptr_type, u, shadow_value);
-    }
-  else
-      u = build_int_cst (boolean_type_node, 1);
-  t = build2 (TRUTH_AND_EXPR, boolean_type_node, t, u);
-  t = force_gimple_operand (t, &stmts, false, NULL_TREE);
-  gimple_seq_add_seq (&seq, stmts);
-  cond = create_tmp_reg (boolean_type_node, "__asan_crash_cond");
-  g = gimple_build_assign  (cond, t);
+  t = build_int_cst (uintptr_type, ASAN_SHADOW_SHIFT);
+  g = gimple_build_assign_with_ops (RSHIFT_EXPR,
+				    make_ssa_name (uintptr_type, NULL),
+				    base_addr, t);
   gimple_set_location (g, location);
-  gimple_seq_add_stmt (&seq, g);
-  g = gimple_build_cond (NE_EXPR, cond, boolean_false_node, NULL_TREE,
-                         NULL_TREE);
+  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+
+  t = build_int_cst (uintptr_type, targetm.asan_shadow_offset ());
+  g = gimple_build_assign_with_ops (PLUS_EXPR,
+				    make_ssa_name (uintptr_type, NULL),
+				    gimple_assign_lhs (g), t);
   gimple_set_location (g, location);
-  gimple_seq_add_stmt (&seq, g);
+  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
 
-  /* Generate call to the run-time library (e.g. __asan_report_load8).  */
+  g = gimple_build_assign_with_ops (NOP_EXPR,
+				    make_ssa_name (shadow_ptr_type, NULL),
+				    gimple_assign_lhs (g), NULL_TREE);
+  gimple_set_location (g, location);
+  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
 
-  gsi = gsi_last_bb (cond_bb);
-  gsi_insert_seq_after (&gsi, seq, GSI_CONTINUE_LINKING);
-  seq = NULL; 
-  g = gimple_build_call (report_error_func (is_store, size_in_bytes),
-                         1, base_addr);
-  gimple_seq_add_stmt (&seq, g);
+  t = build2 (MEM_REF, shadow_type, gimple_assign_lhs (g),
+	      build_int_cst (shadow_ptr_type, 0));
+  g = gimple_build_assign_with_ops (MEM_REF,
+				    make_ssa_name (shadow_type, NULL),
+				    t, NULL_TREE);
+  gimple_set_location (g, location);
+  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+  shadow = gimple_assign_lhs (g);
+
+  if (size_in_bytes < 8)
+    {
+      /* Slow path for 1, 2 and 4 byte accesses.
+	 Test (shadow != 0)
+	      & (((base_addr & 7) + (size_in_bytes - 1)) >= shadow).  */
+      g = gimple_build_assign_with_ops (NE_EXPR,
+					make_ssa_name (boolean_type_node,
+						       NULL),
+					shadow,
+					build_int_cst (shadow_type, 0));
+      gimple_set_location (g, location);
+      gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+      t = gimple_assign_lhs (g);
+
+      g = gimple_build_assign_with_ops (BIT_AND_EXPR,
+					make_ssa_name (uintptr_type,
+						       NULL),
+					base_addr,
+					build_int_cst (uintptr_type, 7));
+      gimple_set_location (g, location);
+      gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+
+      g = gimple_build_assign_with_ops (NOP_EXPR,
+					make_ssa_name (shadow_type,
+						       NULL),
+					gimple_assign_lhs (g), NULL_TREE);
+      gimple_set_location (g, location);
+      gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+
+      if (size_in_bytes > 1)
+	{
+	  g = gimple_build_assign_with_ops (PLUS_EXPR,
+					    make_ssa_name (shadow_type,
+							   NULL),
+					    gimple_assign_lhs (g),
+					    build_int_cst (shadow_type,
+							   size_in_bytes - 1));
+	  gimple_set_location (g, location);
+	  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+	}
+
+      g = gimple_build_assign_with_ops (GE_EXPR,
+					make_ssa_name (boolean_type_node,
+						       NULL),
+					gimple_assign_lhs (g),
+					shadow);
+      gimple_set_location (g, location);
+      gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+
+      g = gimple_build_assign_with_ops (BIT_AND_EXPR,
+					make_ssa_name (boolean_type_node,
+						       NULL),
+					t, gimple_assign_lhs (g));
+      gimple_set_location (g, location);
+      gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+      t = gimple_assign_lhs (g);
+    }
+  else
+    t = shadow;
 
-  /* Insert the check code in the THEN block.  */
+  g = gimple_build_cond (NE_EXPR, t, build_int_cst (TREE_TYPE (t), 0),
+			 NULL_TREE, NULL_TREE);
+  gimple_set_location (g, location);
+  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
 
+  /* Generate call to the run-time library (e.g. __asan_report_load8).  */
   gsi = gsi_start_bb (then_bb);
-  gsi_insert_seq_after (&gsi, seq, GSI_CONTINUE_LINKING);
+  g = gimple_build_call (report_error_func (is_store, size_in_bytes),
+			 1, base_addr);
+  gimple_set_location (g, location);
+  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
 
   *iter = gsi_start_bb (join_bb);
 }
@@ -262,14 +302,12 @@ build_check_stmt (tree base,
 
 static void
 instrument_derefs (gimple_stmt_iterator *iter, tree t,
-                  location_t location, int is_store)
+                  location_t location, bool is_store)
 {
   tree type, base;
-  int size_in_bytes;
+  HOST_WIDE_INT size_in_bytes;
 
   type = TREE_TYPE (t);
-  if (type == error_mark_node)
-    return;
   switch (TREE_CODE (t))
     {
     case ARRAY_REF:
@@ -280,25 +318,25 @@ instrument_derefs (gimple_stmt_iterator *iter, tree t,
     default:
       return;
     }
-  size_in_bytes = tree_low_cst (TYPE_SIZE (type), 0) / BITS_PER_UNIT;
-  if (size_in_bytes != 1 && size_in_bytes != 2 &&
-      size_in_bytes != 4 && size_in_bytes != 8 && size_in_bytes != 16)
-      return;
-  {
-    /* For now just avoid instrumenting bit field acceses.
+
+  size_in_bytes = int_size_in_bytes (type);
+  if ((size_in_bytes & (size_in_bytes - 1)) != 0
+      || (unsigned HOST_WIDE_INT) size_in_bytes - 1 >= 16)
+    return;
+
+  /* For now just avoid instrumenting bit field accesses.
      Fixing it is doable, but expected to be messy.  */
 
-    HOST_WIDE_INT bitsize, bitpos;
-    tree offset;
-    enum machine_mode mode;
-    int volatilep = 0, unsignedp = 0;
-    get_inner_reference (t, &bitsize, &bitpos, &offset,
-                         &mode, &unsignedp, &volatilep, false);
-    if (bitpos != 0 || bitsize != size_in_bytes * BITS_PER_UNIT)
-        return;
-  }
-
-  base = build_addr (t, current_function_decl);
+  HOST_WIDE_INT bitsize, bitpos;
+  tree offset;
+  enum machine_mode mode;
+  int volatilep = 0, unsignedp = 0;
+  get_inner_reference (t, &bitsize, &bitpos, &offset,
+		       &mode, &unsignedp, &volatilep, false);
+  if (bitpos != 0 || bitsize != size_in_bytes * BITS_PER_UNIT)
+    return;
+
+  base = build_fold_addr_expr (t);
   build_check_stmt (base, iter, location, is_store, size_in_bytes);
 }
 
@@ -314,7 +352,6 @@ transform_statements (void)
   basic_block bb;
   gimple_stmt_iterator i;
   int saved_last_basic_block = last_basic_block;
-  enum gimple_rhs_class grhs_class;
 
   FOR_EACH_BB (bb)
     {
@@ -322,16 +359,12 @@ transform_statements (void)
       for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
         {
           gimple s = gsi_stmt (i);
-          if (gimple_code (s) != GIMPLE_ASSIGN)
-              continue;
+          if (!gimple_assign_single_p (s))
+	    continue;
           instrument_derefs (&i, gimple_assign_lhs (s),
-                             gimple_location (s), 1);
+                             gimple_location (s), true);
           instrument_derefs (&i, gimple_assign_rhs1 (s),
-                             gimple_location (s), 0);
-          grhs_class = get_gimple_rhs_class (gimple_assign_rhs_code (s));
-          if (grhs_class == GIMPLE_BINARY_RHS)
-            instrument_derefs (&i, gimple_assign_rhs2 (s),
-                               gimple_location (s), 0);
+                             gimple_location (s), false);
         }
     }
 }
@@ -351,15 +384,28 @@ asan_finish_file (void)
                              MAX_RESERVED_INIT_PRIORITY - 1);
 }
 
+/* Initialize shadow_ptr_types array.  */
+
+static void
+asan_init_shadow_ptr_types (void)
+{
+  alias_set_type set = new_alias_set ();
+  shadow_ptr_types[0] = build_distinct_type_copy (unsigned_char_type_node);
+  TYPE_ALIAS_SET (shadow_ptr_types[0]) = set;
+  shadow_ptr_types[0] = build_pointer_type (shadow_ptr_types[0]);
+  shadow_ptr_types[1] = build_distinct_type_copy (short_unsigned_type_node);
+  TYPE_ALIAS_SET (shadow_ptr_types[1]) = set;
+  shadow_ptr_types[1] = build_pointer_type (shadow_ptr_types[1]);
+}
+
 /* Instrument the current function.  */
 
 static unsigned int
 asan_instrument (void)
 {
-  struct gimplify_ctx gctx;
-  push_gimplify_context (&gctx);
+  if (shadow_ptr_types[0] == NULL_TREE)
+    asan_init_shadow_ptr_types ();
   transform_statements ();
-  pop_gimplify_context (NULL);
   return 0;
 }
 
@@ -386,6 +432,8 @@ struct gimple_opt_pass pass_asan =
   0,                                    /* properties_destroyed  */
   0,                                    /* todo_flags_start  */
   TODO_verify_flow | TODO_verify_stmts
-  | TODO_update_ssa    /* todo_flags_finish  */
+  | TODO_update_ssa			/* todo_flags_finish  */
  }
 };
+
+#include "gt-asan.h"
diff --git a/gcc/asan.h b/gcc/asan.h
index 699820b..0d9ab8b 100644
--- a/gcc/asan.h
+++ b/gcc/asan.h
@@ -24,7 +24,7 @@ along with GCC; see the file COPYING3.  If not see
 extern void asan_finish_file(void);
 
 /* Shadow memory is found at
-   (address >> ASAN_SHADOW_SHIFT) | targetm.asan_shadow_offset ().  */
+   (address >> ASAN_SHADOW_SHIFT) + targetm.asan_shadow_offset ().  */
 #define ASAN_SHADOW_SHIFT	3
 
 #endif /* TREE_ASAN */
-- 
		Dodji

^ permalink raw reply	[flat|nested] 80+ messages in thread

* [PATCH 04/10] Allow asan at -O0
  2012-11-02 22:53 ` [PATCH 00/13] Request to merge Address Sanitizer in Dodji Seketeli
                     ` (2 preceding siblings ...)
  2012-11-02 22:58   ` [PATCH 03/10] Emit GIMPLE directly instead of gimplifying GENERIC Dodji Seketeli
@ 2012-11-02 22:59   ` Dodji Seketeli
  2012-11-06 17:12     ` Diego Novillo
  2012-11-02 23:00   ` [PATCH 05/10] Implement protection of stack variables Dodji Seketeli
                     ` (7 subsequent siblings)
  11 siblings, 1 reply; 80+ messages in thread
From: Dodji Seketeli @ 2012-11-02 22:59 UTC (permalink / raw)
  To: gcc-patches; +Cc: dnovillo, jakub, wmi, davidxl, konstantin.s.serebryany

This patch defines a new asan pass gate that is activated at -O0, in
addition to the pass that was initially activated at the -O3 level.  The
patch also does some comment cleanups here and there.

	* asan.c (build_check_stmt): Rename join_bb variable to else_bb.
	(gate_asan_O0): New function.
	(pass_asan_O0): New variable.
	* passes.c (init_optimization_passes): Add pass_asan_O0.
	* tree-pass.h (pass_asan_O0): New declaration.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192415 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.asan |  8 ++++++++
 gcc/asan.c         | 45 ++++++++++++++++++++++++++++++++++++---------
 gcc/passes.c       |  1 +
 gcc/tree-pass.h    |  1 +
 4 files changed, 46 insertions(+), 9 deletions(-)

diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan
index 973ee6b..131afc7 100644
--- a/gcc/ChangeLog.asan
+++ b/gcc/ChangeLog.asan
@@ -1,3 +1,11 @@
+2012-10-12  Jakub Jelinek  <jakub@redhat.com>
+
+	* asan.c (build_check_stmt): Rename join_bb variable to else_bb.
+	(gate_asan_O0): New function.
+	(pass_asan_O0): New variable.
+	* passes.c (init_optimization_passes): Add pass_asan_O0.
+	* tree-pass.h (pass_asan_O0): New declaration.
+
 2012-10-11  Jakub Jelinek  <jakub@redhat.com>
 	    Dodji Seketeli <dodji@redhat.com>
 
diff --git a/gcc/asan.c b/gcc/asan.c
index baaec0f..e7f4943 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -123,7 +123,7 @@ build_check_stmt (tree base,
                   location_t location, bool is_store, int size_in_bytes)
 {
   gimple_stmt_iterator gsi;
-  basic_block cond_bb, then_bb, join_bb;
+  basic_block cond_bb, then_bb, else_bb;
   edge e;
   tree t, base_addr, shadow;
   gimple g;
@@ -144,23 +144,23 @@ build_check_stmt (tree base,
   else
     e = split_block_after_labels (cond_bb);
   cond_bb = e->src;
-  join_bb = e->dest;
+  else_bb = e->dest;
 
-  /* A recap at this point: join_bb is the basic block at whose head
+  /* A recap at this point: else_bb is the basic block at whose head
      is the gimple statement for which this check expression is being
      built.  cond_bb is the (possibly new, synthetic) basic block the
      end of which will contain the cache-lookup code, and a
      conditional that jumps to the cache-miss code or, much more
-     likely, over to join_bb.  */
+     likely, over to else_bb.  */
 
   /* Create the bb that contains the crash block.  */
   then_bb = create_empty_bb (cond_bb);
   e = make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE);
   e->probability = PROB_VERY_UNLIKELY;
-  make_single_succ_edge (then_bb, join_bb, EDGE_FALLTHRU);
+  make_single_succ_edge (then_bb, else_bb, EDGE_FALLTHRU);
 
-  /* Mark the pseudo-fallthrough edge from cond_bb to join_bb.  */
-  e = find_edge (cond_bb, join_bb);
+  /* Mark the pseudo-fallthrough edge from cond_bb to else_bb.  */
+  e = find_edge (cond_bb, else_bb);
   e->flags = EDGE_FALSE_VALUE;
   e->count = cond_bb->count;
   e->probability = PROB_ALWAYS - PROB_VERY_UNLIKELY;
@@ -170,7 +170,7 @@ build_check_stmt (tree base,
   if (dom_info_available_p (CDI_DOMINATORS))
     {
       set_immediate_dominator (CDI_DOMINATORS, then_bb, cond_bb);
-      set_immediate_dominator (CDI_DOMINATORS, join_bb, cond_bb);
+      set_immediate_dominator (CDI_DOMINATORS, else_bb, cond_bb);
     }
 
   base = unshare_expr (base);
@@ -293,7 +293,7 @@ build_check_stmt (tree base,
   gimple_set_location (g, location);
   gsi_insert_after (&gsi, g, GSI_NEW_STMT);
 
-  *iter = gsi_start_bb (join_bb);
+  *iter = gsi_start_bb (else_bb);
 }
 
 /* If T represents a memory access, add instrumentation code before ITER.
@@ -436,4 +436,31 @@ struct gimple_opt_pass pass_asan =
  }
 };
 
+static bool
+gate_asan_O0 (void)
+{
+  return flag_asan != 0 && !optimize;
+}
+
+struct gimple_opt_pass pass_asan_O0 =
+{
+ {
+  GIMPLE_PASS,
+  "asan0",				/* name  */
+  OPTGROUP_NONE,                        /* optinfo_flags */
+  gate_asan_O0,				/* gate  */
+  asan_instrument,			/* execute  */
+  NULL,					/* sub  */
+  NULL,					/* next  */
+  0,					/* static_pass_number  */
+  TV_NONE,				/* tv_id  */
+  PROP_ssa | PROP_cfg | PROP_gimple_leh,/* properties_required  */
+  0,					/* properties_provided  */
+  0,					/* properties_destroyed  */
+  0,					/* todo_flags_start  */
+  TODO_verify_flow | TODO_verify_stmts
+  | TODO_update_ssa			/* todo_flags_finish  */
+ }
+};
+
 #include "gt-asan.h"
diff --git a/gcc/passes.c b/gcc/passes.c
index 66a2f74..d4115b3 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -1562,6 +1562,7 @@ init_optimization_passes (void)
       NEXT_PASS (pass_tm_edges);
     }
   NEXT_PASS (pass_lower_complex_O0);
+  NEXT_PASS (pass_asan_O0);
   NEXT_PASS (pass_cleanup_eh);
   NEXT_PASS (pass_lower_resx);
   NEXT_PASS (pass_nrv);
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 73c5886..69baa0d 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -260,6 +260,7 @@ struct register_pass_info
 extern struct gimple_opt_pass pass_mudflap_1;
 extern struct gimple_opt_pass pass_mudflap_2;
 extern struct gimple_opt_pass pass_asan;
+extern struct gimple_opt_pass pass_asan_O0;
 extern struct gimple_opt_pass pass_lower_cf;
 extern struct gimple_opt_pass pass_refactor_eh;
 extern struct gimple_opt_pass pass_lower_eh;
-- 
		Dodji

^ permalink raw reply	[flat|nested] 80+ messages in thread

* [PATCH 05/10] Implement protection of stack variables
  2012-11-02 22:53 ` [PATCH 00/13] Request to merge Address Sanitizer in Dodji Seketeli
                     ` (3 preceding siblings ...)
  2012-11-02 22:59   ` [PATCH 04/10] Allow asan at -O0 Dodji Seketeli
@ 2012-11-02 23:00   ` Dodji Seketeli
  2012-11-06 17:22     ` Diego Novillo
  2012-11-02 23:01   ` [PATCH 06/10] Implement protection of global variables Dodji Seketeli
                     ` (6 subsequent siblings)
  11 siblings, 1 reply; 80+ messages in thread
From: Dodji Seketeli @ 2012-11-02 23:00 UTC (permalink / raw)
  To: gcc-patches; +Cc: dnovillo, jakub, wmi, davidxl, konstantin.s.serebryany

This patch implements the protection of stack variables.

To understand how this works, let's look at this example on x86_64
where the stack grows downward:

 int
 foo ()
 {
   char a[23] = {0};
   int b[2] = {0};

   a[5] = 1;
   b[1] = 2;

   return a[5] + b[1];
 }

For this function, the stack protected by asan will be organized as
follows, from the top of the stack to the bottom:

Slot 1/ [red zone of 32 bytes called 'RIGHT RedZone']

Slot 2/ [24 bytes for variable 'a']

Slot 3/ [8 bytes of red zone, that adds up to the space of 'a' to make
         the next slot be 32 bytes aligned; this one is called Partial
         Redzone; this 32 bytes alignment is an asan constraint]

Slot 4/ [red zone of 32 bytes called 'Middle RedZone']

Slot 5/ [8 bytes for variable 'b']

Slot 6/ [24 bytes of Partial Red Zone (similar to slot 3)]

Slot 7/ [32 bytes of Red Zone at the bottom of the stack, called 'LEFT
         RedZone']

[A cultural question I've kept asking myself is why the address
 sanitizer authors have called these red zones (LEFT, MIDDLE, RIGHT)
 instead of, e.g., (BOTTOM, MIDDLE, TOP).  Maybe they can step up and
 educate me so that I get less confused in the future.  :-)]

The 32 bytes of LEFT red zone at the bottom of the stack can be
decomposed as such:

    1/ The first 8 bytes contain a magical asan number that is always
    0x41B58AB3.

    2/ The following 8 bytes contain a pointer to a string (to be
    parsed at runtime by the runtime asan library), whose format is
    the following:

     "<function-name> <space> <num-of-variables-on-the-stack>
     (<32-bytes-aligned-offset-in-bytes-of-variable> <space>
     <length-of-var-in-bytes> ){n} "

	where '(...){n}' means the content inside the parentheses occurs 'n'
	times, with 'n' being the number of variables on the stack.

     3/ The following 16 bytes of the red zone have no particular
     format.
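
As a rough illustration of the description string in item 2/, here is
a standalone C sketch (it is not code from the patch).  It follows what
asan_emit_stack_protection in the diff below prints, which also
appends the length and the name of each variable after its size.  The
struct var_desc type and the format_frame_description function are
hypothetical names made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical description of one protected stack variable.  */
struct var_desc
{
  long offset;       /* Offset from the bottom of the LEFT red zone.  */
  long size;         /* Size of the variable slot in bytes.  */
  const char *name;  /* Source-level name of the variable.  */
};

/* Write "<func> <n> (<offset> <size> <namelen> <name> ){n}" into BUF.
   Return the number of characters written, not counting the NUL,
   or -1 if BUF is too small.  */
static int
format_frame_description (char *buf, size_t buflen, const char *func,
			  const struct var_desc *vars, int n)
{
  int used, i;

  assert (buf != NULL && func != NULL && n >= 0);
  used = snprintf (buf, buflen, "%s %d ", func, n);
  if (used < 0 || (size_t) used >= buflen)
    return -1;
  for (i = 0; i < n; i++)
    {
      int w = snprintf (buf + used, buflen - used, "%ld %ld %zu %s ",
			vars[i].offset, vars[i].size,
			strlen (vars[i].name), vars[i].name);
      if (w < 0 || (size_t) w >= buflen - used)
	return -1;
      used += w;
    }
  return used;
}
```

With the slot sizes of the foo example (variable 'b' 32 bytes above
the frame base, variable 'a' 96 bytes above it), this sketch would
produce "foo 2 32 8 1 b 96 24 1 a ".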

The shadow memory for that stack layout is going to look like this:

    - content of shadow memory 8 bytes for slot 7: 0xFFFFFFFFF1F1F1F1.
      The F1 byte pattern is a magic number called
      ASAN_STACK_MAGIC_LEFT and is a way for the runtime to know that
      the memory for that shadow byte is part of the LEFT red zone
      intended to sit at the bottom of the variables on the stack.

    - content of shadow memory 8 bytes for slots 6 and 5:
      0xFFFFFFFFF4F4F400.  The F4 byte pattern is a magic number
      called ASAN_STACK_MAGIC_PARTIAL.  It flags the fact that the
      memory region for this shadow byte is a PARTIAL red zone
      intended to pad a variable A, so that the slot following
      {A,padding} is 32 bytes aligned.

      Note that the least significant byte of this shadow memory
      content being 00 means that the 8 bytes of its corresponding
      memory (which is the memory of variable 'b') are addressable.

    - content of shadow memory 8 bytes for slot 4: 0xFFFFFFFFF2F2F2F2.
      The F2 byte pattern is a magic number called
      ASAN_STACK_MAGIC_MIDDLE.  It flags the fact that the memory
      region for this shadow byte is a MIDDLE red zone intended to
      sit between two 32-byte aligned slots of {variable,padding}.

    - content of shadow memory 8 bytes for slot 3 and 2:
      0xFFFFFFFFF4000000.  This represents the concatenation of
      variable 'a' and the partial red zone following it, like what we
      had for variable 'b'.  The least significant 3 bytes being 00
      means that the 24 bytes of the slot of variable 'a' (8 bytes
      per shadow byte) are addressable.

    - content of shadow memory 8 bytes for slot 1: 0xFFFFFFFFF3F3F3F3.
      The F3 byte pattern is a magic number called
      ASAN_STACK_MAGIC_RIGHT.  It flags the fact that the memory
      region for this shadow byte is a RIGHT red zone intended to sit
      at the top of the variables of the stack.
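
To make the shadow values in the list above more concrete, here is a
small standalone C sketch (again, not code from the patch).  It
computes the four shadow bytes that cover one 32-byte slot and packs
them into a 32-bit word, least significant byte first, in the spirit
of asan_shadow_cst on a little-endian target.  The SKETCH_* constants
and the function names are made up for this example, and the sketch
assumes one shadow byte per 8 bytes of memory.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAGIC_LEFT	0xf1
#define SKETCH_MAGIC_MIDDLE	0xf2
#define SKETCH_MAGIC_RIGHT	0xf3
#define SKETCH_MAGIC_PARTIAL	0xf4

/* Pack four consecutive shadow bytes into one 32-bit word, with
   BYTES[0] (lowest address) in the least significant position.  */
static uint32_t
pack_shadow_word (const unsigned char bytes[4])
{
  uint32_t val = 0;
  int i;

  for (i = 0; i < 4; i++)
    val |= (uint32_t) bytes[i] << (8 * i);
  return val;
}

/* Fill OUT with the shadow bytes of a 32-byte slot that holds a
   variable of SIZE bytes followed by a partial red zone.  */
static void
shadow_bytes_for_slot (unsigned size, unsigned char out[4])
{
  unsigned i;

  assert (size > 0 && size <= 32);
  for (i = 0; i < 4; i++)
    {
      unsigned start = i * 8;
      if (start + 8 <= size)
	out[i] = 0;			/* All 8 bytes addressable.  */
      else if (start < size)
	out[i] = size - start;		/* Only the first few bytes.  */
      else
	out[i] = SKETCH_MAGIC_PARTIAL;	/* Padding after the variable.  */
    }
}

/* Fill OUT with the shadow bytes of a full 32-byte red zone.  */
static void
shadow_bytes_for_red_zone (unsigned char magic, unsigned char out[4])
{
  memset (out, magic, 4);
}
```

For an 8-byte variable this gives 0xF4F4F400, and for a 24-byte slot
0xF4000000, which are the low 32 bits of the values listed above.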

So, the patch lays out stack variables as well as the different red
zones, emits some prologue code to populate the shadow memory so as to
poison (mark as non-accessible) the regions of the red zones and mark
the regions of stack variables as accessible, and emits some epilogue
code to un-poison (mark as accessible) the regions of red zones right
before the function exits.
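
The poison/un-poison life cycle can be modelled with a toy C program
like the one below.  This is only a sketch under simplifying
assumptions: the shadow memory is a small array that covers just the
160-byte frame of the foo example, so the targetm.asan_shadow_offset ()
term of the real mapping is left out, and only one-byte accesses are
checked.  All toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_SHADOW_SHIFT 3
#define TOY_FRAME_SIZE 160	/* Five 32-byte slots, as in foo.  */

/* One shadow byte per 8 bytes of the toy frame.  */
static unsigned char toy_shadow[TOY_FRAME_SIZE >> TOY_SHADOW_SHIFT];

/* Map an offset within the frame to its shadow byte index.  */
static size_t
toy_shadow_index (size_t frame_off)
{
  return frame_off >> TOY_SHADOW_SHIFT;
}

/* What the prologue does: poison red zones, leave variables at 0.  */
static void
toy_poison_frame (void)
{
  memset (toy_shadow, 0, sizeof toy_shadow);
  memset (&toy_shadow[toy_shadow_index (0)], 0xf1, 4);	 /* LEFT  */
  memset (&toy_shadow[toy_shadow_index (40)], 0xf4, 3);	 /* after 'b'  */
  memset (&toy_shadow[toy_shadow_index (64)], 0xf2, 4);	 /* MIDDLE  */
  memset (&toy_shadow[toy_shadow_index (120)], 0xf4, 1); /* after 'a'  */
  memset (&toy_shadow[toy_shadow_index (128)], 0xf3, 4); /* RIGHT  */
}

/* What the epilogue does: make the whole frame accessible again.  */
static void
toy_unpoison_frame (void)
{
  memset (toy_shadow, 0, sizeof toy_shadow);
}

/* Return true if a one-byte access at FRAME_OFF is allowed.  */
static bool
toy_byte_access_ok (size_t frame_off)
{
  unsigned char s;

  assert (frame_off < TOY_FRAME_SIZE);
  s = toy_shadow[toy_shadow_index (frame_off)];
  if (s == 0)
    return true;		/* Whole 8-byte granule addressable.  */
  if (s >= 0x80)
    return false;		/* Red zone magic value.  */
  return (frame_off & 7) < s;	/* Partially addressable granule.  */
}
```

In this model, after toy_poison_frame an access to b[0] (frame offset
32) is allowed while an access just past 'b' (offset 40) is not; after
toy_unpoison_frame every offset is accessible again.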

	* Makefile.in (asan.o): Depend on $(EXPR_H) $(OPTABS_H).
	(cfgexpand.o): Depend on asan.h.
	* asan.c: Include expr.h and optabs.h.
	(asan_shadow_set): New variable.
	(asan_shadow_cst, asan_emit_stack_protection): New functions.
	(asan_init_shadow_ptr_types): Initialize also asan_shadow_set.
	* cfgexpand.c: Include asan.h.  Define HOST_WIDE_INT heap vector.
	(partition_stack_vars): If i is large alignment and j small
	alignment or vice versa, break out of the loop instead of continue,
	and put the test earlier.  If flag_asan, break out of the loop
	if for small alignment size is different.
	(struct stack_vars_data): New type.
	(expand_stack_vars): Add DATA argument.  Change PRED type to
	function taking size_t argument instead of tree.  Adjust pred calls.
	Fill DATA in and add needed padding in between variables if -fasan.
	(defer_stack_allocation): Defer everything for flag_asan.
	(stack_protect_decl_phase_1, stack_protect_decl_phase_2): Take
	size_t index into stack_vars array instead of the decl directly.
	(asan_decl_phase_3): New function.
	(expand_used_vars): Return var destruction sequence.  Adjust
	expand_stack_vars calls, add another one for flag_asan.  Call
	asan_emit_stack_protection if expand_stack_vars added anything
	to the vectors.
	(expand_gimple_basic_block): Add disable_tail_calls argument.
	(gimple_expand_cfg): Pass true to it if expand_used_vars returned
	non-NULL.  Emit the sequence returned by expand_used_vars after
	return_label.
	* asan.h (asan_emit_stack_protection): New prototype.
	(asan_shadow_set): New decl.
	(ASAN_RED_ZONE_SIZE, ASAN_STACK_MAGIC_LEFT, ASAN_STACK_MAGIC_MIDDLE,
	ASAN_STACK_MAGIC_RIGHT, ASAN_STACK_FRAME_MAGIC): Define.
	(asan_protect_stack_decl): New inline.
	* toplev.c (process_options): Also disable -fasan on
	!FRAME_GROWS_DOWNWARDS targets.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192540 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.asan |  37 ++++++++++
 gcc/Makefile.in    |   4 +-
 gcc/asan.c         | 193 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 gcc/asan.h         |  31 ++++++++-
 gcc/cfgexpand.c    | 165 +++++++++++++++++++++++++++++++++++++++------
 gcc/toplev.c       |   4 +-
 6 files changed, 406 insertions(+), 28 deletions(-)

diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan
index 131afc7..14a0b98 100644
--- a/gcc/ChangeLog.asan
+++ b/gcc/ChangeLog.asan
@@ -1,3 +1,40 @@
+2012-10-17  Jakub Jelinek  <jakub@redhat.com>
+
+	* Makefile.in (asan.o): Depend on $(EXPR_H) $(OPTABS_H).
+	(cfgexpand.o): Depend on asan.h.
+	* asan.c: Include expr.h and optabs.h.
+	(asan_shadow_set): New variable.
+	(asan_shadow_cst, asan_emit_stack_protection): New functions.
+	(asan_init_shadow_ptr_types): Initialize also asan_shadow_set.
+	* cfgexpand.c: Include asan.h.  Define HOST_WIDE_INT heap vector.
+	(partition_stack_vars): If i is large alignment and j small
+	alignment or vice versa, break out of the loop instead of continue,
+	and put the test earlier.  If flag_asan, break out of the loop
+	if for small alignment size is different.
+	(struct stack_vars_data): New type.
+	(expand_stack_vars): Add DATA argument.  Change PRED type to
+	function taking size_t argument instead of tree.  Adjust pred calls.
+	Fill DATA in and add needed padding in between variables if -fasan.
+	(defer_stack_allocation): Defer everything for flag_asan.
+	(stack_protect_decl_phase_1, stack_protect_decl_phase_2): Take
+	size_t index into stack_vars array instead of the decl directly.
+	(asan_decl_phase_3): New function.
+	(expand_used_vars): Return var destruction sequence.  Adjust
+	expand_stack_vars calls, add another one for flag_asan.  Call
+	asan_emit_stack_protection if expand_stack_vars added anything
+	to the vectors.
+	(expand_gimple_basic_block): Add disable_tail_calls argument.
+	(gimple_expand_cfg): Pass true to it if expand_used_vars returned
+	non-NULL.  Emit the sequence returned by expand_used_vars after
+	return_label.
+	* asan.h (asan_emit_stack_protection): New prototype.
+	(asan_shadow_set): New decl.
+	(ASAN_RED_ZONE_SIZE, ASAN_STACK_MAGIC_LEFT, ASAN_STACK_MAGIC_MIDDLE,
+	ASAN_STACK_MAGIC_RIGHT, ASAN_STACK_FRAME_MAGIC): Define.
+	(asan_protect_stack_decl): New inline.
+	* toplev.c (process_options): Also disable -fasan on
+	!FRAME_GROWS_DOWNWARDS targets.
+
 2012-10-12  Jakub Jelinek  <jakub@redhat.com>
 
 	* asan.c (build_check_stmt): Rename join_bb variable to else_bb.
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 1536800..988574e 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2210,7 +2210,7 @@ stor-layout.o : stor-layout.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
 asan.o : asan.c asan.h $(CONFIG_H) $(SYSTEM_H) $(GIMPLE_H) \
    output.h coretypes.h $(GIMPLE_PRETTY_PRINT_H) \
    tree-iterator.h $(TREE_FLOW_H) $(TREE_PASS_H) \
-   $(TARGET_H)
+   $(TARGET_H) $(EXPR_H) $(OPTABS_H)
 tree-ssa-tail-merge.o: tree-ssa-tail-merge.c \
    $(SYSTEM_H) $(CONFIG_H) coretypes.h $(TM_H) $(BITMAP_H) \
    $(FLAGS_H) $(TM_P_H) $(BASIC_BLOCK_H) \
@@ -3080,7 +3080,7 @@ cfgexpand.o : cfgexpand.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \
    $(DIAGNOSTIC_H) toplev.h $(DIAGNOSTIC_CORE_H) $(BASIC_BLOCK_H) $(FLAGS_H) debug.h $(PARAMS_H) \
    value-prof.h $(TREE_INLINE_H) $(TARGET_H) $(SSAEXPAND_H) $(REGS_H) \
    $(GIMPLE_PRETTY_PRINT_H) $(BITMAP_H) sbitmap.h \
-   $(INSN_ATTR_H) $(CFGLOOP_H)
+   $(INSN_ATTR_H) $(CFGLOOP_H) asan.h
 cfgrtl.o : cfgrtl.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_ERROR_H) \
    $(FLAGS_H) insn-config.h $(BASIC_BLOCK_H) $(REGS_H) hard-reg-set.h \
    $(FUNCTION_H) $(EXCEPT_H) $(TM_P_H) $(INSN_ATTR_H) \
diff --git a/gcc/asan.c b/gcc/asan.c
index e7f4943..578bb02 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -29,6 +29,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "asan.h"
 #include "gimple-pretty-print.h"
 #include "target.h"
+#include "expr.h"
+#include "optabs.h"
 
 /*
  AddressSanitizer finds out-of-bounds and use-after-free bugs 
@@ -65,10 +67,195 @@ along with GCC; see the file COPYING3.  If not see
  to create redzones for stack and global object and poison them.
 */
 
+alias_set_type asan_shadow_set = -1;
+
 /* Pointer types to 1 resp. 2 byte integers in shadow memory.  A separate
    alias set is used for all shadow memory accesses.  */
 static GTY(()) tree shadow_ptr_types[2];
 
+/* Return a CONST_INT representing 4 subsequent shadow memory bytes.  */
+
+static rtx
+asan_shadow_cst (unsigned char shadow_bytes[4])
+{
+  int i;
+  unsigned HOST_WIDE_INT val = 0;
+  gcc_assert (WORDS_BIG_ENDIAN == BYTES_BIG_ENDIAN);
+  for (i = 0; i < 4; i++)
+    val |= (unsigned HOST_WIDE_INT) shadow_bytes[BYTES_BIG_ENDIAN ? 3 - i : i]
+	   << (BITS_PER_UNIT * i);
+  return GEN_INT (trunc_int_for_mode (val, SImode));
+}
+
+/* Insert code to protect stack vars.  The prologue sequence should be emitted
+   directly, epilogue sequence returned.  BASE is the register holding the
+   stack base, against which OFFSETS array offsets are relative to, OFFSETS
+   array contains pairs of offsets in reverse order, always the end offset
+   of some gap that needs protection followed by starting offset,
+   and DECLS is an array of representative decls for each var partition.
+   LENGTH is the length of the OFFSETS array, DECLS array is LENGTH / 2 - 1
+   elements long (OFFSETS include gap before the first variable as well
+   as gaps after each stack variable).  */
+
+rtx
+asan_emit_stack_protection (rtx base, HOST_WIDE_INT *offsets, tree *decls,
+			    int length)
+{
+  rtx shadow_base, shadow_mem, ret, mem;
+  unsigned char shadow_bytes[4];
+  HOST_WIDE_INT base_offset = offsets[length - 1], offset, prev_offset;
+  HOST_WIDE_INT last_offset, last_size;
+  int l;
+  unsigned char cur_shadow_byte = ASAN_STACK_MAGIC_LEFT;
+  static pretty_printer pp;
+  static bool pp_initialized;
+  const char *buf;
+  size_t len;
+  tree str_cst;
+
+  /* First of all, prepare the description string.  */
+  if (!pp_initialized)
+    {
+      pp_construct (&pp, /* prefix */NULL, /* line-width */0);
+      pp_initialized = true;
+    }
+  pp_clear_output_area (&pp);
+  if (DECL_NAME (current_function_decl))
+    pp_base_tree_identifier (&pp, DECL_NAME (current_function_decl));
+  else
+    pp_string (&pp, "<unknown>");
+  pp_space (&pp);
+  pp_decimal_int (&pp, length / 2 - 1);
+  pp_space (&pp);
+  for (l = length - 2; l; l -= 2)
+    {
+      tree decl = decls[l / 2 - 1];
+      pp_wide_integer (&pp, offsets[l] - base_offset);
+      pp_space (&pp);
+      pp_wide_integer (&pp, offsets[l - 1] - offsets[l]);
+      pp_space (&pp);
+      if (DECL_P (decl) && DECL_NAME (decl))
+	{
+	  pp_decimal_int (&pp, IDENTIFIER_LENGTH (DECL_NAME (decl)));
+	  pp_space (&pp);
+	  pp_base_tree_identifier (&pp, DECL_NAME (decl));
+	}
+      else
+	pp_string (&pp, "9 <unknown>");
+      pp_space (&pp);
+    }
+  buf = pp_base_formatted_text (&pp);
+  len = strlen (buf);
+  str_cst = build_string (len + 1, buf);
+  TREE_TYPE (str_cst)
+    = build_array_type (char_type_node, build_index_type (size_int (len)));
+  TREE_READONLY (str_cst) = 1;
+  TREE_STATIC (str_cst) = 1;
+  str_cst = build1 (ADDR_EXPR, build_pointer_type (char_type_node), str_cst);
+
+  /* Emit the prologue sequence.  */
+  base = expand_binop (Pmode, add_optab, base, GEN_INT (base_offset),
+		       NULL_RTX, 1, OPTAB_DIRECT);
+  mem = gen_rtx_MEM (ptr_mode, base);
+  emit_move_insn (mem, GEN_INT (ASAN_STACK_FRAME_MAGIC));
+  mem = adjust_address (mem, VOIDmode, GET_MODE_SIZE (ptr_mode));
+  emit_move_insn (mem, expand_normal (str_cst));
+  shadow_base = expand_binop (Pmode, lshr_optab, base,
+			      GEN_INT (ASAN_SHADOW_SHIFT),
+			      NULL_RTX, 1, OPTAB_DIRECT);
+  shadow_base = expand_binop (Pmode, add_optab, shadow_base,
+			      GEN_INT (targetm.asan_shadow_offset ()),
+			      NULL_RTX, 1, OPTAB_DIRECT);
+  gcc_assert (asan_shadow_set != -1
+	      && (ASAN_RED_ZONE_SIZE >> ASAN_SHADOW_SHIFT) == 4);
+  shadow_mem = gen_rtx_MEM (SImode, shadow_base);
+  set_mem_alias_set (shadow_mem, asan_shadow_set);
+  prev_offset = base_offset;
+  for (l = length; l; l -= 2)
+    {
+      if (l == 2)
+	cur_shadow_byte = ASAN_STACK_MAGIC_RIGHT;
+      offset = offsets[l - 1];
+      if ((offset - base_offset) & (ASAN_RED_ZONE_SIZE - 1))
+	{
+	  int i;
+	  HOST_WIDE_INT aoff
+	    = base_offset + ((offset - base_offset)
+			     & ~(ASAN_RED_ZONE_SIZE - HOST_WIDE_INT_1));
+	  shadow_mem = adjust_address (shadow_mem, VOIDmode,
+				       (aoff - prev_offset)
+				       >> ASAN_SHADOW_SHIFT);
+	  prev_offset = aoff;
+	  for (i = 0; i < 4; i++, aoff += (1 << ASAN_SHADOW_SHIFT))
+	    if (aoff < offset)
+	      {
+		if (aoff < offset - (1 << ASAN_SHADOW_SHIFT) + 1)
+		  shadow_bytes[i] = 0;
+		else
+		  shadow_bytes[i] = offset - aoff;
+	      }
+	    else
+	      shadow_bytes[i] = ASAN_STACK_MAGIC_PARTIAL;
+	  emit_move_insn (shadow_mem, asan_shadow_cst (shadow_bytes));
+	  offset = aoff;
+	}
+      while (offset <= offsets[l - 2] - ASAN_RED_ZONE_SIZE)
+	{
+	  shadow_mem = adjust_address (shadow_mem, VOIDmode,
+				       (offset - prev_offset)
+				       >> ASAN_SHADOW_SHIFT);
+	  prev_offset = offset;
+	  memset (shadow_bytes, cur_shadow_byte, 4);
+	  emit_move_insn (shadow_mem, asan_shadow_cst (shadow_bytes));
+	  offset += ASAN_RED_ZONE_SIZE;
+	}
+      cur_shadow_byte = ASAN_STACK_MAGIC_MIDDLE;
+    }
+  do_pending_stack_adjust ();
+
+  /* Construct epilogue sequence.  */
+  start_sequence ();
+
+  shadow_mem = gen_rtx_MEM (BLKmode, shadow_base);
+  set_mem_alias_set (shadow_mem, asan_shadow_set);
+  prev_offset = base_offset;
+  last_offset = base_offset;
+  last_size = 0;
+  for (l = length; l; l -= 2)
+    {
+      offset = base_offset + ((offsets[l - 1] - base_offset)
+			     & ~(ASAN_RED_ZONE_SIZE - HOST_WIDE_INT_1));
+      if (last_offset + last_size != offset)
+	{
+	  shadow_mem = adjust_address (shadow_mem, VOIDmode,
+				       (last_offset - prev_offset)
+				       >> ASAN_SHADOW_SHIFT);
+	  prev_offset = last_offset;
+	  clear_storage (shadow_mem, GEN_INT (last_size >> ASAN_SHADOW_SHIFT),
+			 BLOCK_OP_NORMAL);
+	  last_offset = offset;
+	  last_size = 0;
+	}
+      last_size += base_offset + ((offsets[l - 2] - base_offset)
+				  & ~(ASAN_RED_ZONE_SIZE - HOST_WIDE_INT_1))
+		   - offset;
+    }
+  if (last_size)
+    {
+      shadow_mem = adjust_address (shadow_mem, VOIDmode,
+				   (last_offset - prev_offset)
+				   >> ASAN_SHADOW_SHIFT);
+      clear_storage (shadow_mem, GEN_INT (last_size >> ASAN_SHADOW_SHIFT),
+		     BLOCK_OP_NORMAL);
+    }
+
+  do_pending_stack_adjust ();
+
+  ret = get_insns ();
+  end_sequence ();
+  return ret;
+}
+
 /* Construct a function tree for __asan_report_{load,store}{1,2,4,8,16}.
    IS_STORE is either 1 (for a store) or 0 (for a load).
    SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
@@ -389,12 +576,12 @@ asan_finish_file (void)
 static void
 asan_init_shadow_ptr_types (void)
 {
-  alias_set_type set = new_alias_set ();
+  asan_shadow_set = new_alias_set ();
   shadow_ptr_types[0] = build_distinct_type_copy (unsigned_char_type_node);
-  TYPE_ALIAS_SET (shadow_ptr_types[0]) = set;
+  TYPE_ALIAS_SET (shadow_ptr_types[0]) = asan_shadow_set;
   shadow_ptr_types[0] = build_pointer_type (shadow_ptr_types[0]);
   shadow_ptr_types[1] = build_distinct_type_copy (short_unsigned_type_node);
-  TYPE_ALIAS_SET (shadow_ptr_types[1]) = set;
+  TYPE_ALIAS_SET (shadow_ptr_types[1]) = asan_shadow_set;
   shadow_ptr_types[1] = build_pointer_type (shadow_ptr_types[1]);
 }
 
diff --git a/gcc/asan.h b/gcc/asan.h
index 0d9ab8b..6f0edbf 100644
--- a/gcc/asan.h
+++ b/gcc/asan.h
@@ -21,10 +21,39 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef TREE_ASAN
 #define TREE_ASAN
 
-extern void asan_finish_file(void);
+extern void asan_finish_file (void);
+extern rtx asan_emit_stack_protection (rtx, HOST_WIDE_INT *, tree *, int);
+
+/* Alias set for accessing the shadow memory.  */
+extern alias_set_type asan_shadow_set;
 
 /* Shadow memory is found at
    (address >> ASAN_SHADOW_SHIFT) + targetm.asan_shadow_offset ().  */
 #define ASAN_SHADOW_SHIFT	3
 
+/* Red zone size, stack and global variables are padded by ASAN_RED_ZONE_SIZE
+   up to 2 * ASAN_RED_ZONE_SIZE - 1 bytes.  */
+#define ASAN_RED_ZONE_SIZE	32
+
+/* Shadow memory values for stack protection.  Left is below protected vars,
+   the first pointer in stack corresponding to that offset contains
+   ASAN_STACK_FRAME_MAGIC word, the second pointer to a string describing
+   the frame.  Middle is for padding in between variables, right is
+   above the last protected variable and partial immediately after variables
+   up to ASAN_RED_ZONE_SIZE alignment.  */
+#define ASAN_STACK_MAGIC_LEFT		0xf1
+#define ASAN_STACK_MAGIC_MIDDLE		0xf2
+#define ASAN_STACK_MAGIC_RIGHT		0xf3
+#define ASAN_STACK_MAGIC_PARTIAL	0xf4
+
+#define ASAN_STACK_FRAME_MAGIC	0x41b58ab3
+
+/* Return true if DECL should be guarded on the stack.  */
+
+static inline bool
+asan_protect_stack_decl (tree decl)
+{
+  return DECL_P (decl) && !DECL_ARTIFICIAL (decl);
+}
+
 #endif /* TREE_ASAN */
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index e501b4b..16fd0fb 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -47,6 +47,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfgloop.h"
 #include "regs.h" /* For reg_renumber.  */
 #include "insn-attr.h" /* For INSN_SCHEDULING.  */
+#include "asan.h"
 
 /* This variable holds information helping the rewriting of SSA trees
    into RTL.  */
@@ -736,6 +737,7 @@ partition_stack_vars (void)
     {
       size_t i = stack_vars_sorted[si];
       unsigned int ialign = stack_vars[i].alignb;
+      HOST_WIDE_INT isize = stack_vars[i].size;
 
       /* Ignore objects that aren't partition representatives. If we
          see a var that is not a partition representative, it must
@@ -747,19 +749,28 @@ partition_stack_vars (void)
 	{
 	  size_t j = stack_vars_sorted[sj];
 	  unsigned int jalign = stack_vars[j].alignb;
+	  HOST_WIDE_INT jsize = stack_vars[j].size;
 
 	  /* Ignore objects that aren't partition representatives.  */
 	  if (stack_vars[j].representative != j)
 	    continue;
 
-	  /* Ignore conflicting objects.  */
-	  if (stack_var_conflict_p (i, j))
-	    continue;
-
 	  /* Do not mix objects of "small" (supported) alignment
 	     and "large" (unsupported) alignment.  */
 	  if ((ialign * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT)
 	      != (jalign * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT))
+	    break;
+
+	  /* For Address Sanitizer do not mix objects with different
+	     sizes, as the shorter vars wouldn't be adequately protected.
+	     Don't do that for "large" (unsupported) alignment objects,
+	     those aren't protected anyway.  */
+	  if (flag_asan && isize != jsize
+	      && ialign * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT)
+	    break;
+
+	  /* Ignore conflicting objects.  */
+	  if (stack_var_conflict_p (i, j))
 	    continue;
 
 	  /* UNION the objects, placing J at OFFSET.  */
@@ -837,12 +848,26 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
   set_rtl (decl, x);
 }
 
+DEF_VEC_I(HOST_WIDE_INT);
+DEF_VEC_ALLOC_I(HOST_WIDE_INT,heap);
+
+struct stack_vars_data
+{
+  /* Vector of offset pairs, always end of some padding followed
+     by start of the padding that needs Address Sanitizer protection.
+     The vector is in reversed, highest offset pairs come first.  */
+  VEC(HOST_WIDE_INT, heap) *asan_vec;
+
+  /* Vector of partition representative decls in between the paddings.  */
+  VEC(tree, heap) *asan_decl_vec;
+};
+
 /* A subroutine of expand_used_vars.  Give each partition representative
    a unique location within the stack frame.  Update each partition member
    with that location.  */
 
 static void
-expand_stack_vars (bool (*pred) (tree))
+expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
 {
   size_t si, i, j, n = stack_vars_num;
   HOST_WIDE_INT large_size = 0, large_alloc = 0;
@@ -913,13 +938,45 @@ expand_stack_vars (bool (*pred) (tree))
 
       /* Check the predicate to see whether this variable should be
 	 allocated in this pass.  */
-      if (pred && !pred (decl))
+      if (pred && !pred (i))
 	continue;
 
       alignb = stack_vars[i].alignb;
       if (alignb * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT)
 	{
-	  offset = alloc_stack_frame_space (stack_vars[i].size, alignb);
+	  if (flag_asan && pred)
+	    {
+	      HOST_WIDE_INT prev_offset = frame_offset;
+	      tree repr_decl = NULL_TREE;
+
+	      offset
+		= alloc_stack_frame_space (stack_vars[i].size
+					   + ASAN_RED_ZONE_SIZE,
+					   MAX (alignb, ASAN_RED_ZONE_SIZE));
+	      VEC_safe_push (HOST_WIDE_INT, heap, data->asan_vec,
+			     prev_offset);
+	      VEC_safe_push (HOST_WIDE_INT, heap, data->asan_vec,
+			     offset + stack_vars[i].size);
+	      /* Find best representative of the partition.
+		 Prefer those with DECL_NAME, even better
+		 satisfying asan_protect_stack_decl predicate.  */
+	      for (j = i; j != EOC; j = stack_vars[j].next)
+		if (asan_protect_stack_decl (stack_vars[j].decl)
+		    && DECL_NAME (stack_vars[j].decl))
+		  {
+		    repr_decl = stack_vars[j].decl;
+		    break;
+		  }
+		else if (repr_decl == NULL_TREE
+			 && DECL_P (stack_vars[j].decl)
+			 && DECL_NAME (stack_vars[j].decl))
+		  repr_decl = stack_vars[j].decl;
+	      if (repr_decl == NULL_TREE)
+		repr_decl = stack_vars[i].decl;
+	      VEC_safe_push (tree, heap, data->asan_decl_vec, repr_decl);
+	    }
+	  else
+	    offset = alloc_stack_frame_space (stack_vars[i].size, alignb);
 	  base = virtual_stack_vars_rtx;
 	  base_align = crtl->max_used_stack_slot_alignment;
 	}
@@ -1057,8 +1114,9 @@ static bool
 defer_stack_allocation (tree var, bool toplevel)
 {
   /* If stack protection is enabled, *all* stack variables must be deferred,
-     so that we can re-order the strings to the top of the frame.  */
-  if (flag_stack_protect)
+     so that we can re-order the strings to the top of the frame.
+     Similarly for Address Sanitizer.  */
+  if (flag_stack_protect || flag_asan)
     return true;
 
   /* We handle "large" alignment via dynamic allocation.  We want to handle
@@ -1329,15 +1387,31 @@ stack_protect_decl_phase (tree decl)
    as callbacks for expand_stack_vars.  */
 
 static bool
-stack_protect_decl_phase_1 (tree decl)
+stack_protect_decl_phase_1 (size_t i)
+{
+  return stack_protect_decl_phase (stack_vars[i].decl) == 1;
+}
+
+static bool
+stack_protect_decl_phase_2 (size_t i)
 {
-  return stack_protect_decl_phase (decl) == 1;
+  return stack_protect_decl_phase (stack_vars[i].decl) == 2;
 }
 
+/* And helper function that checks for asan phase (with stack protector
+   it is phase 3).  This is used as callback for expand_stack_vars.
+   Returns true if any of the vars in the partition need to be protected.  */
+
 static bool
-stack_protect_decl_phase_2 (tree decl)
+asan_decl_phase_3 (size_t i)
 {
-  return stack_protect_decl_phase (decl) == 2;
+  while (i != EOC)
+    {
+      if (asan_protect_stack_decl (stack_vars[i].decl))
+	return true;
+      i = stack_vars[i].next;
+    }
+  return false;
 }
 
 /* Ensure that variables in different stack protection phases conflict
@@ -1448,11 +1522,12 @@ estimated_stack_frame_size (struct cgraph_node *node)
 
 /* Expand all variables used in the function.  */
 
-static void
+static rtx
 expand_used_vars (void)
 {
   tree var, outer_block = DECL_INITIAL (current_function_decl);
   VEC(tree,heap) *maybe_local_decls = NULL;
+  rtx var_end_seq = NULL_RTX;
   struct pointer_map_t *ssa_name_decls;
   unsigned i;
   unsigned len;
@@ -1603,6 +1678,11 @@ expand_used_vars (void)
   /* Assign rtl to each variable based on these partitions.  */
   if (stack_vars_num > 0)
     {
+      struct stack_vars_data data;
+
+      data.asan_vec = NULL;
+      data.asan_decl_vec = NULL;
+
       /* Reorder decls to be protected by iterating over the variables
 	 array multiple times, and allocating out of each phase in turn.  */
       /* ??? We could probably integrate this into the qsort we did
@@ -1611,14 +1691,41 @@ expand_used_vars (void)
       if (has_protected_decls)
 	{
 	  /* Phase 1 contains only character arrays.  */
-	  expand_stack_vars (stack_protect_decl_phase_1);
+	  expand_stack_vars (stack_protect_decl_phase_1, &data);
 
 	  /* Phase 2 contains other kinds of arrays.  */
 	  if (flag_stack_protect == 2)
-	    expand_stack_vars (stack_protect_decl_phase_2);
+	    expand_stack_vars (stack_protect_decl_phase_2, &data);
 	}
 
-      expand_stack_vars (NULL);
+      if (flag_asan)
+	/* Phase 3, any partitions that need asan protection
+	   in addition to phase 1 and 2.  */
+	expand_stack_vars (asan_decl_phase_3, &data);
+
+      if (!VEC_empty (HOST_WIDE_INT, data.asan_vec))
+	{
+	  HOST_WIDE_INT prev_offset = frame_offset;
+	  HOST_WIDE_INT offset
+	    = alloc_stack_frame_space (ASAN_RED_ZONE_SIZE,
+				       ASAN_RED_ZONE_SIZE);
+	  VEC_safe_push (HOST_WIDE_INT, heap, data.asan_vec, prev_offset);
+	  VEC_safe_push (HOST_WIDE_INT, heap, data.asan_vec, offset);
+
+	  var_end_seq
+	    = asan_emit_stack_protection (virtual_stack_vars_rtx,
+					  VEC_address (HOST_WIDE_INT,
+						       data.asan_vec),
+					  VEC_address (tree,
+						       data.asan_decl_vec),
+					  VEC_length (HOST_WIDE_INT,
+						      data.asan_vec));
+	}
+
+      expand_stack_vars (NULL, &data);
+
+      VEC_free (HOST_WIDE_INT, heap, data.asan_vec);
+      VEC_free (tree, heap, data.asan_decl_vec);
     }
 
   fini_vars_expansion ();
@@ -1645,6 +1752,8 @@ expand_used_vars (void)
 	frame_offset += align - 1;
       frame_offset &= -align;
     }
+
+  return var_end_seq;
 }
 
 
@@ -3662,7 +3771,7 @@ expand_debug_locations (void)
 /* Expand basic block BB from GIMPLE trees to RTL.  */
 
 static basic_block
-expand_gimple_basic_block (basic_block bb)
+expand_gimple_basic_block (basic_block bb, bool disable_tail_calls)
 {
   gimple_stmt_iterator gsi;
   gimple_seq stmts;
@@ -3950,6 +4059,11 @@ expand_gimple_basic_block (basic_block bb)
 	}
       else
 	{
+	  if (is_gimple_call (stmt)
+	      && gimple_call_tail_p (stmt)
+	      && disable_tail_calls)
+	    gimple_call_set_tail (stmt, false);
+
 	  if (is_gimple_call (stmt) && gimple_call_tail_p (stmt))
 	    {
 	      bool can_fallthru;
@@ -4309,7 +4423,7 @@ gimple_expand_cfg (void)
   sbitmap blocks;
   edge_iterator ei;
   edge e;
-  rtx var_seq;
+  rtx var_seq, var_ret_seq;
   unsigned i;
 
   timevar_push (TV_OUT_OF_SSA);
@@ -4369,7 +4483,7 @@ gimple_expand_cfg (void)
   timevar_push (TV_VAR_EXPAND);
   start_sequence ();
 
-  expand_used_vars ();
+  var_ret_seq = expand_used_vars ();
 
   var_seq = get_insns ();
   end_sequence ();
@@ -4495,7 +4609,7 @@ gimple_expand_cfg (void)
 
   lab_rtx_for_bb = pointer_map_create ();
   FOR_BB_BETWEEN (bb, init_block->next_bb, EXIT_BLOCK_PTR, next_bb)
-    bb = expand_gimple_basic_block (bb);
+    bb = expand_gimple_basic_block (bb, var_ret_seq != NULL_RTX);
 
   if (MAY_HAVE_DEBUG_INSNS)
     expand_debug_locations ();
@@ -4523,6 +4637,15 @@ gimple_expand_cfg (void)
   construct_exit_block ();
   insn_locations_finalize ();
 
+  if (var_ret_seq)
+    {
+      rtx after = return_label;
+      rtx next = NEXT_INSN (after);
+      if (next && NOTE_INSN_BASIC_BLOCK_P (next))
+	after = next;
+      emit_insn_after (var_ret_seq, after);
+    }
+
   /* Zap the tree EH table.  */
   set_eh_throw_stmt_table (cfun, NULL);
 
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 68849f5..0fa8ce3 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1542,7 +1542,9 @@ process_options (void)
     }
 
   /* Address Sanitizer needs porting to each target architecture.  */
-  if (flag_asan && targetm.asan_shadow_offset == NULL)
+  if (flag_asan
+      && (targetm.asan_shadow_offset == NULL
+	  || !FRAME_GROWS_DOWNWARD))
     {
       warning (0, "-fasan not supported for this target");
       flag_asan = 0;
-- 
		Dodji

* [PATCH 06/10] Implement protection of global variables
  2012-11-02 22:53 ` [PATCH 00/13] Request to merge Address Sanitizer in Dodji Seketeli
                     ` (4 preceding siblings ...)
  2012-11-02 23:00   ` [PATCH 05/10] Implement protection of stack variables Dodji Seketeli
@ 2012-11-02 23:01   ` Dodji Seketeli
  2012-11-06 17:27     ` Diego Novillo
  2012-11-02 23:02   ` [PATCH 07/10] Make build_check_stmt accept an SSA_NAME for its base Dodji Seketeli
                     ` (5 subsequent siblings)
  11 siblings, 1 reply; 80+ messages in thread
From: Dodji Seketeli @ 2012-11-02 23:01 UTC (permalink / raw)
  To: gcc-patches; +Cc: dnovillo, jakub, wmi, davidxl, konstantin.s.serebryany

This patch implements the protection of global variables.

The basic idea is to insert a red zone between two global variables
and install a constructor function that calls the asan runtime to
populate the relevant shadow memory regions at load time.

So the patch lays out the global variables so that a red zone follows
each of them.  The red zones are sized so that each variable starts
on a 32-byte boundary.
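
As a rough illustration of the sizing rule above (a standalone sketch,
not part of the patch): the helper below mirrors the arithmetic of
asan_red_zone_size from asan.h, assuming ASAN_RED_ZONE_SIZE is 32.  The
names RED_ZONE, red_zone_size and padded_size are made up for this
example.

```c
#include <assert.h>

/* Assumed value of ASAN_RED_ZONE_SIZE; must be a power of two.  */
#define RED_ZONE 32u

/* Padding to append after a protected variable of SIZE bytes.
   Same arithmetic as asan_red_zone_size in the patch: the result is
   always between RED_ZONE and 2 * RED_ZONE - 1 bytes.  */
unsigned int
red_zone_size (unsigned int size)
{
  unsigned int c = size & (RED_ZONE - 1);
  return c ? 2 * RED_ZONE - c : RED_ZONE;
}

/* Total footprint of the variable plus its red zone.  The sum is a
   multiple of RED_ZONE, so a variable placed right after it starts
   on a RED_ZONE-byte boundary again.  */
unsigned int
padded_size (unsigned int size)
{
  unsigned int total = size + red_zone_size (size);
  assert (total % RED_ZONE == 0);
  return total;
}
```

For a 40-byte variable this gives a 56-byte red zone, so the variable
plus its padding occupies 96 bytes.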

Then it installs a constructor function that calls the runtime asan
library function __asan_register_globals once, passing it an array
that holds, for each protected global variable, an instance of this
type:

    struct __asan_global
    {
      /* Address of the beginning of the global variable.  */
      const void *__beg;

      /* Initial size of the global variable.  */
      uptr __size;

      /* Size of the global variable + size of the red zone.  This
         size is 32 bytes aligned.  */
      uptr __size_with_redzone;

      /*  Name of the global variable.  */
      const void *__name;

      /* This is always set to 0 for now.  */
      uptr __has_dynamic_init;
    };

The patch also installs a destructor function that calls the
runtime asan library function __asan_unregister_globals.
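
A hand-written analogue of what the generated constructor and
destructor do might look like the sketch below.  This is only an
illustration under stated assumptions: stub_register_globals and
stub_unregister_globals are hypothetical stand-ins for the runtime
entry points (the real ones are not called here), the descriptor type
uses unreserved field names, the red zone is assumed to be 32 bytes,
and the padding is spelled out by hand rather than emitted by varasm.c.
It relies on the GCC constructor, destructor and aligned attributes.

```c
#include <assert.h>
#include <stdint.h>

#define RED_ZONE 32	/* assumed ASAN_RED_ZONE_SIZE */

typedef uintptr_t uptr;

/* Same layout as struct __asan_global, with unreserved names.  */
struct asan_global_desc
{
  const void *beg;		/* start of the variable */
  uptr size;			/* size of the variable itself */
  uptr size_with_redzone;	/* size plus trailing red zone */
  const void *name;		/* "name (file)" description */
  uptr has_dynamic_init;	/* always 0 in this patch */
};

/* Hypothetical stand-ins for the runtime; they only count.  */
uptr registered_count;

void
stub_register_globals (const struct asan_global_desc *globals, uptr n)
{
  for (uptr i = 0; i < n; i++)
    assert (globals[i].size_with_redzone % RED_ZONE == 0);
  registered_count += n;
}

void
stub_unregister_globals (const struct asan_global_desc *globals, uptr n)
{
  (void) globals;
  registered_count -= n;
}

/* A 40-byte variable followed by its 56-byte red zone.  */
struct padded_buf_t
{
  char data[40];
  char redzone[56];
} padded_buf __attribute__ ((aligned (RED_ZONE)));

/* One descriptor per protected variable.  */
const struct asan_global_desc descs[1] = {
  { padded_buf.data, sizeof padded_buf.data, sizeof padded_buf,
    "buf (example.c)", 0 }
};

/* Runs at load time, like the constructor the patch installs.  */
__attribute__ ((constructor)) void
module_ctor (void)
{
  stub_register_globals (descs, 1);
}

/* Runs at unload time, like the destructor the patch installs.  */
__attribute__ ((destructor)) void
module_dtor (void)
{
  stub_unregister_globals (descs, 1);
}
```

The point of the array-plus-count interface is that one call per
translation unit is enough, however many globals it protects.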

	* varasm.c: Include asan.h.
	(assemble_noswitch_variable): Grow size by asan_red_zone_size
	if decl is asan protected.
	(place_block_symbol): Likewise.
	(assemble_variable): If decl is asan protected, increase
	DECL_ALIGN if needed, and for decls emitted using
	assemble_variable_contents append padding zeros after it.
	* Makefile.in (varasm.o): Depend on asan.h.
	* asan.c: Include output.h.
	(asan_pp, asan_pp_initialized, asan_ctor_statements): New variables.
	(asan_pp_initialize, asan_pp_string): New functions.
	(asan_emit_stack_protection): Use asan_pp{,_initialized}
	instead of local pp{,_initialized} vars, use asan_pp_initialize
	and asan_pp_string helpers.
	(asan_needs_local_alias, asan_protect_global,
	asan_global_struct, asan_add_global): New functions.
	(asan_finish_file): Protect global vars that can be protected. Use
	asan_ctor_statements instead of ctor_statements
	* asan.h (asan_protect_global): New prototype.
	(asan_red_zone_size): New inline function.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192541 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.asan |  24 +++++
 gcc/Makefile.in    |   2 +-
 gcc/asan.c         | 306 ++++++++++++++++++++++++++++++++++++++++++++++-------
 gcc/asan.h         |  11 ++
 gcc/varasm.c       |  22 ++++
 5 files changed, 328 insertions(+), 37 deletions(-)

diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan
index 14a0b98..a2e18ce 100644
--- a/gcc/ChangeLog.asan
+++ b/gcc/ChangeLog.asan
@@ -1,4 +1,28 @@
 2012-10-17  Jakub Jelinek  <jakub@redhat.com>
+	    Wei Mi <wmi@google.com>
+
+	* varasm.c: Include asan.h.
+	(assemble_noswitch_variable): Grow size by asan_red_zone_size
+	if decl is asan protected.
+	(place_block_symbol): Likewise.
+	(assemble_variable): If decl is asan protected, increase
+	DECL_ALIGN if needed, and for decls emitted using
+	assemble_variable_contents append padding zeros after it.
+	* Makefile.in (varasm.o): Depend on asan.h.
+	* asan.c: Include output.h.
+	(asan_pp, asan_pp_initialized, asan_ctor_statements): New variables.
+	(asan_pp_initialize, asan_pp_string): New functions.
+	(asan_emit_stack_protection): Use asan_pp{,_initialized}
+	instead of local pp{,_initialized} vars, use asan_pp_initialize
+	and asan_pp_string helpers.
+	(asan_needs_local_alias, asan_protect_global,
+	asan_global_struct, asan_add_global): New functions.
+	(asan_finish_file): Protect global vars that can be protected. Use
+	asan_ctor_statements instead of ctor_statements
+	* asan.h (asan_protect_global): New prototype.
+	(asan_red_zone_size): New inline function.
+
+2012-10-17  Jakub Jelinek  <jakub@redhat.com>
 
 	* Makefile.in (asan.o): Depend on $(EXPR_H) $(OPTABS_H).
 	(cfgexpand.o): Depend on asan.h.
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 988574e..3251a55 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2718,7 +2718,7 @@ varasm.o : varasm.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \
    output.h $(DIAGNOSTIC_CORE_H) xcoffout.h debug.h $(GGC_H) $(TM_P_H) \
    $(HASHTAB_H) $(TARGET_H) langhooks.h gt-varasm.h $(BASIC_BLOCK_H) \
    $(CGRAPH_H) $(TARGET_DEF_H) tree-mudflap.h \
-   pointer-set.h $(COMMON_TARGET_H)
+   pointer-set.h $(COMMON_TARGET_H) asan.h
 function.o : function.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_ERROR_H) \
    $(TREE_H) $(GIMPLE_H) $(FLAGS_H) $(FUNCTION_H) $(EXPR_H) \
    $(OPTABS_H) $(LIBFUNCS_H) $(REGS_H) hard-reg-set.h insn-config.h $(RECOG_H) \
diff --git a/gcc/asan.c b/gcc/asan.c
index 578bb02..6880270 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -31,6 +31,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "target.h"
 #include "expr.h"
 #include "optabs.h"
+#include "output.h"
 
 /*
  AddressSanitizer finds out-of-bounds and use-after-free bugs 
@@ -73,6 +74,34 @@ alias_set_type asan_shadow_set = -1;
    alias set is used for all shadow memory accesses.  */
 static GTY(()) tree shadow_ptr_types[2];
 
+/* Asan pretty-printer, used for building of the description STRING_CSTs.  */
+static pretty_printer asan_pp;
+static bool asan_pp_initialized;
+
+/* Initialize asan_pp.  */
+
+static void
+asan_pp_initialize (void)
+{
+  pp_construct (&asan_pp, /* prefix */NULL, /* line-width */0);
+  asan_pp_initialized = true;
+}
+
+/* Create ADDR_EXPR of STRING_CST with asan_pp text.  */
+
+static tree
+asan_pp_string (void)
+{
+  const char *buf = pp_base_formatted_text (&asan_pp);
+  size_t len = strlen (buf);
+  tree ret = build_string (len + 1, buf);
+  TREE_TYPE (ret)
+    = build_array_type (char_type_node, build_index_type (size_int (len)));
+  TREE_READONLY (ret) = 1;
+  TREE_STATIC (ret) = 1;
+  return build1 (ADDR_EXPR, build_pointer_type (char_type_node), ret);
+}
+
 /* Return a CONST_INT representing 4 subsequent shadow memory bytes.  */
 
 static rtx
@@ -107,51 +136,38 @@ asan_emit_stack_protection (rtx base, HOST_WIDE_INT *offsets, tree *decls,
   HOST_WIDE_INT last_offset, last_size;
   int l;
   unsigned char cur_shadow_byte = ASAN_STACK_MAGIC_LEFT;
-  static pretty_printer pp;
-  static bool pp_initialized;
-  const char *buf;
-  size_t len;
   tree str_cst;
 
   /* First of all, prepare the description string.  */
-  if (!pp_initialized)
-    {
-      pp_construct (&pp, /* prefix */NULL, /* line-width */0);
-      pp_initialized = true;
-    }
-  pp_clear_output_area (&pp);
+  if (!asan_pp_initialized)
+    asan_pp_initialize ();
+
+  pp_clear_output_area (&asan_pp);
   if (DECL_NAME (current_function_decl))
-    pp_base_tree_identifier (&pp, DECL_NAME (current_function_decl));
+    pp_base_tree_identifier (&asan_pp, DECL_NAME (current_function_decl));
   else
-    pp_string (&pp, "<unknown>");
-  pp_space (&pp);
-  pp_decimal_int (&pp, length / 2 - 1);
-  pp_space (&pp);
+    pp_string (&asan_pp, "<unknown>");
+  pp_space (&asan_pp);
+  pp_decimal_int (&asan_pp, length / 2 - 1);
+  pp_space (&asan_pp);
   for (l = length - 2; l; l -= 2)
     {
       tree decl = decls[l / 2 - 1];
-      pp_wide_integer (&pp, offsets[l] - base_offset);
-      pp_space (&pp);
-      pp_wide_integer (&pp, offsets[l - 1] - offsets[l]);
-      pp_space (&pp);
+      pp_wide_integer (&asan_pp, offsets[l] - base_offset);
+      pp_space (&asan_pp);
+      pp_wide_integer (&asan_pp, offsets[l - 1] - offsets[l]);
+      pp_space (&asan_pp);
       if (DECL_P (decl) && DECL_NAME (decl))
 	{
-	  pp_decimal_int (&pp, IDENTIFIER_LENGTH (DECL_NAME (decl)));
-	  pp_space (&pp);
-	  pp_base_tree_identifier (&pp, DECL_NAME (decl));
+	  pp_decimal_int (&asan_pp, IDENTIFIER_LENGTH (DECL_NAME (decl)));
+	  pp_space (&asan_pp);
+	  pp_base_tree_identifier (&asan_pp, DECL_NAME (decl));
 	}
       else
-	pp_string (&pp, "9 <unknown>");
-      pp_space (&pp);
+	pp_string (&asan_pp, "9 <unknown>");
+      pp_space (&asan_pp);
     }
-  buf = pp_base_formatted_text (&pp);
-  len = strlen (buf);
-  str_cst = build_string (len + 1, buf);
-  TREE_TYPE (str_cst)
-    = build_array_type (char_type_node, build_index_type (size_int (len)));
-  TREE_READONLY (str_cst) = 1;
-  TREE_STATIC (str_cst) = 1;
-  str_cst = build1 (ADDR_EXPR, build_pointer_type (char_type_node), str_cst);
+  str_cst = asan_pp_string ();
 
   /* Emit the prologue sequence.  */
   base = expand_binop (Pmode, add_optab, base, GEN_INT (base_offset),
@@ -256,6 +272,75 @@ asan_emit_stack_protection (rtx base, HOST_WIDE_INT *offsets, tree *decls,
   return ret;
 }
 
+/* Return true if DECL, a global var, might be overridden and needs
+   therefore a local alias.  */
+
+static bool
+asan_needs_local_alias (tree decl)
+{
+  return DECL_WEAK (decl) || !targetm.binds_local_p (decl);
+}
+
+/* Return true if DECL is a VAR_DECL that should be protected
+   by Address Sanitizer, by appending a red zone with protected
+   shadow memory after it and aligning it to at least
+   ASAN_RED_ZONE_SIZE bytes.  */
+
+bool
+asan_protect_global (tree decl)
+{
+  rtx rtl, symbol;
+  section *sect;
+
+  if (TREE_CODE (decl) != VAR_DECL
+      /* TLS vars aren't statically protectable.  */
+      || DECL_THREAD_LOCAL_P (decl)
+      /* Externs will be protected elsewhere.  */
+      || DECL_EXTERNAL (decl)
+      || !TREE_ASM_WRITTEN (decl)
+      || !DECL_RTL_SET_P (decl)
+      /* Comdat vars pose an ABI problem, we can't know if
+	 the var that is selected by the linker will have
+	 padding or not.  */
+      || DECL_ONE_ONLY (decl)
+      /* Similarly for common vars.  People can use -fno-common.  */
+      || DECL_COMMON (decl)
+      /* Don't protect if using user section, often vars placed
+	 into user section from multiple TUs are then assumed
+	 to be an array of such vars, putting padding in there
+	 breaks this assumption.  */
+      || (DECL_SECTION_NAME (decl) != NULL_TREE
+	  && !DECL_HAS_IMPLICIT_SECTION_NAME_P (decl))
+      || DECL_SIZE (decl) == 0
+      || ASAN_RED_ZONE_SIZE * BITS_PER_UNIT > MAX_OFILE_ALIGNMENT
+      || !valid_constant_size_p (DECL_SIZE_UNIT (decl))
+      || DECL_ALIGN_UNIT (decl) > 2 * ASAN_RED_ZONE_SIZE)
+    return false;
+
+  rtl = DECL_RTL (decl);
+  if (!MEM_P (rtl) || GET_CODE (XEXP (rtl, 0)) != SYMBOL_REF)
+    return false;
+  symbol = XEXP (rtl, 0);
+
+  if (CONSTANT_POOL_ADDRESS_P (symbol)
+      || TREE_CONSTANT_POOL_ADDRESS_P (symbol))
+    return false;
+
+  sect = get_variable_section (decl, false);
+  if (sect->common.flags & SECTION_COMMON)
+    return false;
+
+  if (lookup_attribute ("weakref", DECL_ATTRIBUTES (decl)))
+    return false;
+
+#ifndef ASM_OUTPUT_DEF
+  if (asan_needs_local_alias (decl))
+    return false;
+#endif
+
+  return true;    
+}
+
 /* Construct a function tree for __asan_report_{load,store}{1,2,4,8,16}.
    IS_STORE is either 1 (for a store) or 0 (for a load).
    SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
@@ -556,6 +641,105 @@ transform_statements (void)
     }
 }
 
+/* Build
+   struct __asan_global
+   {
+     const void *__beg;
+     uptr __size;
+     uptr __size_with_redzone;
+     const void *__name;
+     uptr __has_dynamic_init;
+   } type.  */
+
+static tree
+asan_global_struct (void)
+{
+  static const char *field_names[5]
+    = { "__beg", "__size", "__size_with_redzone",
+	"__name", "__has_dynamic_init" };
+  tree fields[5], ret;
+  int i;
+
+  ret = make_node (RECORD_TYPE);
+  for (i = 0; i < 5; i++)
+    {
+      fields[i]
+	= build_decl (UNKNOWN_LOCATION, FIELD_DECL,
+		      get_identifier (field_names[i]),
+		      (i == 0 || i == 3) ? const_ptr_type_node
+		      : build_nonstandard_integer_type (POINTER_SIZE, 1));
+      DECL_CONTEXT (fields[i]) = ret;
+      if (i)
+	DECL_CHAIN (fields[i - 1]) = fields[i];
+    }
+  TYPE_FIELDS (ret) = fields[0];
+  TYPE_NAME (ret) = get_identifier ("__asan_global");
+  layout_type (ret);
+  return ret;
+}
+
+/* Append description of a single global DECL into vector V.
+   TYPE is __asan_global struct type as returned by asan_global_struct.  */
+
+static void
+asan_add_global (tree decl, tree type, VEC(constructor_elt, gc) *v)
+{
+  tree init, uptr = TREE_TYPE (DECL_CHAIN (TYPE_FIELDS (type)));
+  unsigned HOST_WIDE_INT size;
+  tree str_cst, refdecl = decl;
+  VEC(constructor_elt, gc) *vinner = NULL;
+
+  if (!asan_pp_initialized)
+    asan_pp_initialize ();
+
+  pp_clear_output_area (&asan_pp);
+  if (DECL_NAME (decl))
+    pp_base_tree_identifier (&asan_pp, DECL_NAME (decl));
+  else
+    pp_string (&asan_pp, "<unknown>");
+  pp_space (&asan_pp);
+  pp_left_paren (&asan_pp);
+  pp_string (&asan_pp, main_input_filename);
+  pp_right_paren (&asan_pp);
+  str_cst = asan_pp_string ();
+
+  if (asan_needs_local_alias (decl))
+    {
+      char buf[20];
+      ASM_GENERATE_INTERNAL_LABEL (buf, "LASAN",
+				   VEC_length (constructor_elt, v) + 1);
+      refdecl = build_decl (DECL_SOURCE_LOCATION (decl),
+			    VAR_DECL, get_identifier (buf), TREE_TYPE (decl));
+      TREE_ADDRESSABLE (refdecl) = TREE_ADDRESSABLE (decl);
+      TREE_READONLY (refdecl) = TREE_READONLY (decl);
+      TREE_THIS_VOLATILE (refdecl) = TREE_THIS_VOLATILE (decl);
+      DECL_GIMPLE_REG_P (refdecl) = DECL_GIMPLE_REG_P (decl);
+      DECL_ARTIFICIAL (refdecl) = DECL_ARTIFICIAL (decl);
+      DECL_IGNORED_P (refdecl) = DECL_IGNORED_P (decl);
+      TREE_STATIC (refdecl) = 1;
+      TREE_PUBLIC (refdecl) = 0;
+      TREE_USED (refdecl) = 1;
+      assemble_alias (refdecl, DECL_ASSEMBLER_NAME (decl));
+    }
+
+  CONSTRUCTOR_APPEND_ELT (vinner, NULL_TREE,
+			  fold_convert (const_ptr_type_node,
+					build_fold_addr_expr (refdecl)));
+  size = tree_low_cst (DECL_SIZE_UNIT (decl), 1);
+  CONSTRUCTOR_APPEND_ELT (vinner, NULL_TREE, build_int_cst (uptr, size));
+  size += asan_red_zone_size (size);
+  CONSTRUCTOR_APPEND_ELT (vinner, NULL_TREE, build_int_cst (uptr, size));
+  CONSTRUCTOR_APPEND_ELT (vinner, NULL_TREE,
+			  fold_convert (const_ptr_type_node, str_cst));
+  CONSTRUCTOR_APPEND_ELT (vinner, NULL_TREE, build_int_cst (uptr, 0));
+  init = build_constructor (type, vinner);
+  CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, init);
+}
+
+/* Needs to be GTY(()), because cgraph_build_static_cdtor may
+   invoke ggc_collect.  */
+static GTY(()) tree asan_ctor_statements;
+
 /* Module-level instrumentation.
    - Insert __asan_init() into the list of CTORs.
    - TODO: insert redzones around globals.
@@ -564,11 +748,61 @@ transform_statements (void)
 void
 asan_finish_file (void)
 {
-  tree ctor_statements = NULL_TREE;
+  struct varpool_node *vnode;
+  unsigned HOST_WIDE_INT gcount = 0;
+
   append_to_statement_list (build_call_expr (asan_init_func (), 0),
-                            &ctor_statements);
-  cgraph_build_static_cdtor ('I', ctor_statements,
-                             MAX_RESERVED_INIT_PRIORITY - 1);
+			    &asan_ctor_statements);
+  FOR_EACH_DEFINED_VARIABLE (vnode)
+    if (asan_protect_global (vnode->symbol.decl))
+      ++gcount;
+  if (gcount)
+    {
+      tree type = asan_global_struct (), var, ctor, decl;
+      tree uptr = build_nonstandard_integer_type (POINTER_SIZE, 1);
+      tree dtor_statements = NULL_TREE;
+      VEC(constructor_elt, gc) *v;
+      char buf[20];
+
+      type = build_array_type_nelts (type, gcount);
+      ASM_GENERATE_INTERNAL_LABEL (buf, "LASAN", 0);
+      var = build_decl (UNKNOWN_LOCATION, VAR_DECL, get_identifier (buf),
+			type);
+      TREE_STATIC (var) = 1;
+      TREE_PUBLIC (var) = 0;
+      DECL_ARTIFICIAL (var) = 1;
+      DECL_IGNORED_P (var) = 1;
+      v = VEC_alloc (constructor_elt, gc, gcount);
+      FOR_EACH_DEFINED_VARIABLE (vnode)
+	if (asan_protect_global (vnode->symbol.decl))
+	  asan_add_global (vnode->symbol.decl, TREE_TYPE (type), v);
+      ctor = build_constructor (type, v);
+      TREE_CONSTANT (ctor) = 1;
+      TREE_STATIC (ctor) = 1;
+      DECL_INITIAL (var) = ctor;
+      varpool_assemble_decl (varpool_node_for_decl (var));
+
+      type = build_function_type_list (void_type_node,
+				       build_pointer_type (TREE_TYPE (type)),
+				       uptr, NULL_TREE);
+      decl = build_fn_decl ("__asan_register_globals", type);
+      TREE_NOTHROW (decl) = 1;
+      append_to_statement_list (build_call_expr (decl, 2,
+						 build_fold_addr_expr (var),
+						 build_int_cst (uptr, gcount)),
+				&asan_ctor_statements);
+
+      decl = build_fn_decl ("__asan_unregister_globals", type);
+      TREE_NOTHROW (decl) = 1;
+      append_to_statement_list (build_call_expr (decl, 2,
+						 build_fold_addr_expr (var),
+						 build_int_cst (uptr, gcount)),
+				&dtor_statements);
+      cgraph_build_static_cdtor ('D', dtor_statements,
+				 MAX_RESERVED_INIT_PRIORITY - 1);
+    }
+  cgraph_build_static_cdtor ('I', asan_ctor_statements,
+			     MAX_RESERVED_INIT_PRIORITY - 1);
 }
 
 /* Initialize shadow_ptr_types array.  */
diff --git a/gcc/asan.h b/gcc/asan.h
index 6f0edbf..d9368a8 100644
--- a/gcc/asan.h
+++ b/gcc/asan.h
@@ -23,6 +23,7 @@ along with GCC; see the file COPYING3.  If not see
 
 extern void asan_finish_file (void);
 extern rtx asan_emit_stack_protection (rtx, HOST_WIDE_INT *, tree *, int);
+extern bool asan_protect_global (tree);
 
 /* Alias set for accessing the shadow memory.  */
 extern alias_set_type asan_shadow_set;
@@ -56,4 +57,14 @@ asan_protect_stack_decl (tree decl)
   return DECL_P (decl) && !DECL_ARTIFICIAL (decl);
 }
 
+/* Return the size of padding needed to insert after a protected
+   decl of SIZE.  */
+
+static inline unsigned int
+asan_red_zone_size (unsigned int size)
+{
+  unsigned int c = size & (ASAN_RED_ZONE_SIZE - 1);
+  return c ? 2 * ASAN_RED_ZONE_SIZE - c : ASAN_RED_ZONE_SIZE;
+}
+
 #endif /* TREE_ASAN */
diff --git a/gcc/varasm.c b/gcc/varasm.c
index b300348..641ce0c 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -51,6 +51,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-mudflap.h"
 #include "cgraph.h"
 #include "pointer-set.h"
+#include "asan.h"
 
 #ifdef XCOFF_DEBUGGING_INFO
 #include "xcoffout.h"		/* Needed for external data
@@ -1831,6 +1832,9 @@ assemble_noswitch_variable (tree decl, const char *name, section *sect)
   size = tree_low_cst (DECL_SIZE_UNIT (decl), 1);
   rounded = size;
 
+  if (flag_asan && asan_protect_global (decl))
+    size += asan_red_zone_size (size);
+
   /* Don't allocate zero bytes of common,
      since that means "undefined external" in the linker.  */
   if (size == 0)
@@ -1897,6 +1901,7 @@ assemble_variable (tree decl, int top_level ATTRIBUTE_UNUSED,
   const char *name;
   rtx decl_rtl, symbol;
   section *sect;
+  bool asan_protected = false;
 
   /* This function is supposed to handle VARIABLES.  Ensure we have one.  */
   gcc_assert (TREE_CODE (decl) == VAR_DECL);
@@ -1984,6 +1989,15 @@ assemble_variable (tree decl, int top_level ATTRIBUTE_UNUSED,
   /* Compute the alignment of this data.  */
 
   align_variable (decl, dont_output_data);
+
+  if (flag_asan
+      && asan_protect_global (decl))
+    {
+      asan_protected = true;
+      DECL_ALIGN (decl) = MAX (DECL_ALIGN (decl), 
+                               ASAN_RED_ZONE_SIZE * BITS_PER_UNIT);
+    }
+
   set_mem_align (decl_rtl, DECL_ALIGN (decl));
 
   if (TREE_PUBLIC (decl))
@@ -2022,6 +2036,12 @@ assemble_variable (tree decl, int top_level ATTRIBUTE_UNUSED,
       if (DECL_ALIGN (decl) > BITS_PER_UNIT)
 	ASM_OUTPUT_ALIGN (asm_out_file, floor_log2 (DECL_ALIGN_UNIT (decl)));
       assemble_variable_contents (decl, name, dont_output_data);
+      if (asan_protected)
+	{
+	  unsigned HOST_WIDE_INT size
+	    = tree_low_cst (DECL_SIZE_UNIT (decl), 1);
+	  assemble_zeros (asan_red_zone_size (size));
+	}
     }
 }
 
@@ -6926,6 +6946,8 @@ place_block_symbol (rtx symbol)
       decl = SYMBOL_REF_DECL (symbol);
       alignment = DECL_ALIGN (decl);
       size = tree_low_cst (DECL_SIZE_UNIT (decl), 1);
+      if (flag_asan && asan_protect_global (decl))
+	size += asan_red_zone_size (size);
     }
 
   /* Calculate the object's offset from the start of the block.  */
-- 
		Dodji

* [PATCH 07/10] Make build_check_stmt accept an SSA_NAME for its base
  2012-11-02 22:53 ` [PATCH 00/13] Request to merge Address Sanitizer in Dodji Seketeli
                     ` (5 preceding siblings ...)
  2012-11-02 23:01   ` [PATCH 06/10] Implement protection of global variables Dodji Seketeli
@ 2012-11-02 23:02   ` Dodji Seketeli
  2012-11-06 17:28     ` Diego Novillo
  2012-11-02 23:03   ` [PATCH 08/10] Factorize condition insertion code out of build_check_stmt Dodji Seketeli
                     ` (4 subsequent siblings)
  11 siblings, 1 reply; 80+ messages in thread
From: Dodji Seketeli @ 2012-11-02 23:02 UTC (permalink / raw)
  To: gcc-patches; +Cc: dnovillo, jakub, wmi, davidxl, konstantin.s.serebryany

This patch makes build_check_stmt accept an SSA name as its memory
access parameter.  This is useful for a subsequent patch that will
re-use it.

Tested by running cc1 -fasan on the program below with and without the
patch and inspecting the gimple output to see that there is no change.

void
foo ()
{
  char foo[1] = {0};

  foo[0] = 1;
}

gcc/
	* asan.c (build_check_stmt): Accept the memory access to be
	represented by an SSA_NAME.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192843 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.asan |  5 +++++
 gcc/asan.c         | 36 +++++++++++++++++++++++-------------
 2 files changed, 28 insertions(+), 13 deletions(-)

diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan
index a2e18ce..395ba4f 100644
--- a/gcc/ChangeLog.asan
+++ b/gcc/ChangeLog.asan
@@ -1,3 +1,8 @@
+2012-10-26  Dodji Seketeli  <dodji@redhat.com>
+
+	* asan.c (build_check_stmt): Accept the memory access to be
+	represented by an SSA_NAME.
+
 2012-10-17  Jakub Jelinek  <jakub@redhat.com>
 	    Wei Mi <wmi@google.com>
 
diff --git a/gcc/asan.c b/gcc/asan.c
index 6880270..7c99173 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -383,16 +383,18 @@ asan_init_func (void)
 #define PROB_VERY_UNLIKELY	(REG_BR_PROB_BASE / 2000 - 1)
 #define PROB_ALWAYS		(REG_BR_PROB_BASE)
 
-/* Instrument the memory access instruction BASE.
-   Insert new statements before ITER.
-   LOCATION is source code location.
-   IS_STORE is either 1 (for a store) or 0 (for a load).
+/* Instrument the memory access instruction BASE.  Insert new
+   statements before ITER.
+
+   Note that the memory access represented by BASE can be either an
+   SSA_NAME, or a non-SSA expression.  LOCATION is the source code
+   location.  IS_STORE is TRUE for a store, FALSE for a load.
    SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
 
 static void
-build_check_stmt (tree base,
-                  gimple_stmt_iterator *iter,
-                  location_t location, bool is_store, int size_in_bytes)
+build_check_stmt (tree base, gimple_stmt_iterator *iter,
+                  location_t location, bool is_store,
+		  int size_in_bytes)
 {
   gimple_stmt_iterator gsi;
   basic_block cond_bb, then_bb, else_bb;
@@ -403,6 +405,7 @@ build_check_stmt (tree base,
   tree shadow_type = TREE_TYPE (shadow_ptr_type);
   tree uintptr_type
     = build_nonstandard_integer_type (TYPE_PRECISION (TREE_TYPE (base)), 1);
+  tree base_ssa = base;
 
   /* We first need to split the current basic block, and start altering
      the CFG.  This allows us to insert the statements we're about to
@@ -448,15 +451,22 @@ build_check_stmt (tree base,
   base = unshare_expr (base);
 
   gsi = gsi_last_bb (cond_bb);
-  g = gimple_build_assign_with_ops (TREE_CODE (base),
-				    make_ssa_name (TREE_TYPE (base), NULL),
-				    base, NULL_TREE);
-  gimple_set_location (g, location);
-  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+
+  /* BASE can already be an SSA_NAME; in that case, do not create a
+     new SSA_NAME for it.  */
+  if (TREE_CODE (base) != SSA_NAME)
+    {
+      g = gimple_build_assign_with_ops (TREE_CODE (base),
+					make_ssa_name (TREE_TYPE (base), NULL),
+					base, NULL_TREE);
+      gimple_set_location (g, location);
+      gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+      base_ssa = gimple_assign_lhs (g);
+    }
 
   g = gimple_build_assign_with_ops (NOP_EXPR,
 				    make_ssa_name (uintptr_type, NULL),
-				    gimple_assign_lhs (g), NULL_TREE);
+				    base_ssa, NULL_TREE);
   gimple_set_location (g, location);
   gsi_insert_after (&gsi, g, GSI_NEW_STMT);
   base_addr = gimple_assign_lhs (g);
-- 
		Dodji

* [PATCH 08/10] Factorize condition insertion code out of build_check_stmt
  2012-11-02 22:53 ` [PATCH 00/13] Request to merge Address Sanitizer in Dodji Seketeli
                     ` (6 preceding siblings ...)
  2012-11-02 23:02   ` [PATCH 07/10] Make build_check_stmt accept an SSA_NAME for its base Dodji Seketeli
@ 2012-11-02 23:03   ` Dodji Seketeli
  2012-11-05 15:50     ` Jakub Jelinek
  2012-11-06 17:30     ` Diego Novillo
  2012-11-02 23:05   ` [PATCH 09/10] Instrument built-in memory access function calls Dodji Seketeli
                     ` (3 subsequent siblings)
  11 siblings, 2 replies; 80+ messages in thread
From: Dodji Seketeli @ 2012-11-02 23:03 UTC (permalink / raw)
  To: gcc-patches; +Cc: dnovillo, jakub, wmi, davidxl, konstantin.s.serebryany

This patch splits a new create_cond_insert_point_before_iter function
out of build_check_stmt, to be used by a later patch.

Tested by running cc1 -fasan on the test program below with and
without the patch and by inspecting the gimple output to see that
there is no change.

void
foo ()
{
  char foo[1] = {0};

  foo[0] = 1;
}

gcc/

	* asan.c (create_cond_insert_point_before_iter): Factorize out of ...
	(build_check_stmt): ... here.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192844 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.asan |   5 +++
 gcc/asan.c         | 120 +++++++++++++++++++++++++++++++++--------------------
 2 files changed, 81 insertions(+), 44 deletions(-)

diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan
index 395ba4f..903dc52 100644
--- a/gcc/ChangeLog.asan
+++ b/gcc/ChangeLog.asan
@@ -1,5 +1,10 @@
 2012-10-26  Dodji Seketeli  <dodji@redhat.com>
 
+	* asan.c (create_cond_insert_point_before_iter): Factorize out of ...
+	(build_check_stmt): ... here.
+
+2012-10-26  Dodji Seketeli  <dodji@redhat.com>
+
 	* asan.c (build_check_stmt): Accept the memory access to be
 	represented by an SSA_NAME.
 
diff --git a/gcc/asan.c b/gcc/asan.c
index 7c99173..cc107f8 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -383,6 +383,75 @@ asan_init_func (void)
 #define PROB_VERY_UNLIKELY	(REG_BR_PROB_BASE / 2000 - 1)
 #define PROB_ALWAYS		(REG_BR_PROB_BASE)
 
+/* Split the current basic block and create a condition statement
+   insertion point right before the statement pointed to by ITER.
+   Return an iterator to the point at which the caller might safely
+   insert the condition statement.
+
+   THEN_BLOCK must be set to the address of an uninitialized instance
+   of basic_block.  The function will then set *THEN_BLOCK to the
+   'then block' of the condition statement to be inserted by the
+   caller.
+
+   Similarly, the function will set *FALLTHROUGH_BLOCK to the 'else
+   block' of the condition statement to be inserted by the caller.
+
+   Note that *FALLTHROUGH_BLOCK is a new block that contains the
+   statements starting from *ITER, and *THEN_BLOCK is a new empty
+   block.
+
+   *ITER is adjusted to still point to the same statement it was
+   *pointing to initially.  */
+
+static gimple_stmt_iterator
+create_cond_insert_point_before_iter (gimple_stmt_iterator *iter,
+				      bool then_more_likely_p,
+				      basic_block *then_block,
+				      basic_block *fallthrough_block)
+{
+  gimple_stmt_iterator gsi = *iter;
+
+  if (!gsi_end_p (gsi))
+    gsi_prev (&gsi);
+
+  basic_block cur_bb = gsi_bb (*iter);
+
+  edge e = split_block (cur_bb, gsi_stmt (gsi));
+
+  /* Get a hold on the 'condition block', the 'then block' and the
+     'else block'.  */
+  basic_block cond_bb = e->src;
+  basic_block fallthru_bb = e->dest;
+  basic_block then_bb = create_empty_bb (cond_bb);
+
+  /* Set up the newly created 'then block'.  */
+  e = make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE);
+  int fallthrough_probability =
+    then_more_likely_p
+    ? PROB_VERY_UNLIKELY
+    : PROB_ALWAYS - PROB_VERY_UNLIKELY;
+  e->probability = PROB_ALWAYS - fallthrough_probability;
+  make_single_succ_edge (then_bb, fallthru_bb, EDGE_FALLTHRU);
+
+  /* Set up the fallthrough basic block.  */
+  e = find_edge (cond_bb, fallthru_bb);
+  e->flags = EDGE_FALSE_VALUE;
+  e->count = cond_bb->count;
+  e->probability = fallthrough_probability;
+
+  /* Update dominance info for the newly created then_bb; note that
+     fallthru_bb's dominance info has already been updated by
+     split_block.  */
+  if (dom_info_available_p (CDI_DOMINATORS))
+    set_immediate_dominator (CDI_DOMINATORS, then_bb, cond_bb);
+
+  *then_block = then_bb;
+  *fallthrough_block = fallthru_bb;
+  *iter = gsi_start_bb (fallthru_bb);
+
+  return gsi_last_bb (cond_bb);
+}
+
 /* Instrument the memory access instruction BASE.  Insert new
    statements before ITER.
 
@@ -397,8 +466,7 @@ build_check_stmt (tree base, gimple_stmt_iterator *iter,
 		  int size_in_bytes)
 {
   gimple_stmt_iterator gsi;
-  basic_block cond_bb, then_bb, else_bb;
-  edge e;
+  basic_block then_bb, else_bb;
   tree t, base_addr, shadow;
   gimple g;
   tree shadow_ptr_type = shadow_ptr_types[size_in_bytes == 16 ? 1 : 0];
@@ -407,51 +475,15 @@ build_check_stmt (tree base, gimple_stmt_iterator *iter,
     = build_nonstandard_integer_type (TYPE_PRECISION (TREE_TYPE (base)), 1);
   tree base_ssa = base;
 
-  /* We first need to split the current basic block, and start altering
-     the CFG.  This allows us to insert the statements we're about to
-     construct into the right basic blocks.  */
-
-  cond_bb = gimple_bb (gsi_stmt (*iter));
-  gsi = *iter;
-  gsi_prev (&gsi);
-  if (!gsi_end_p (gsi))
-    e = split_block (cond_bb, gsi_stmt (gsi));
-  else
-    e = split_block_after_labels (cond_bb);
-  cond_bb = e->src;
-  else_bb = e->dest;
-
-  /* A recap at this point: else_bb is the basic block at whose head
-     is the gimple statement for which this check expression is being
-     built.  cond_bb is the (possibly new, synthetic) basic block the
-     end of which will contain the cache-lookup code, and a
-     conditional that jumps to the cache-miss code or, much more
-     likely, over to else_bb.  */
-
-  /* Create the bb that contains the crash block.  */
-  then_bb = create_empty_bb (cond_bb);
-  e = make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE);
-  e->probability = PROB_VERY_UNLIKELY;
-  make_single_succ_edge (then_bb, else_bb, EDGE_FALLTHRU);
-
-  /* Mark the pseudo-fallthrough edge from cond_bb to else_bb.  */
-  e = find_edge (cond_bb, else_bb);
-  e->flags = EDGE_FALSE_VALUE;
-  e->count = cond_bb->count;
-  e->probability = PROB_ALWAYS - PROB_VERY_UNLIKELY;
-
-  /* Update dominance info.  Note that bb_join's data was
-     updated by split_block.  */
-  if (dom_info_available_p (CDI_DOMINATORS))
-    {
-      set_immediate_dominator (CDI_DOMINATORS, then_bb, cond_bb);
-      set_immediate_dominator (CDI_DOMINATORS, else_bb, cond_bb);
-    }
+  /* Get an iterator on the point where we can add the condition
+     statement for the instrumentation.  */
+  gsi = create_cond_insert_point_before_iter (iter,
+					      /*then_more_likely_p=*/false,
+					      &then_bb,
+					      &else_bb);
 
   base = unshare_expr (base);
 
-  gsi = gsi_last_bb (cond_bb);
-
   /* BASE can already be an SSA_NAME; in that case, do not create a
      new SSA_NAME for it.  */
   if (TREE_CODE (base) != SSA_NAME)
-- 
		Dodji

^ permalink raw reply	[flat|nested] 80+ messages in thread

* [PATCH 09/10] Instrument built-in memory access function calls
  2012-11-02 22:53 ` [PATCH 00/13] Request to merge Address Sanitizer in Dodji Seketeli
                     ` (7 preceding siblings ...)
  2012-11-02 23:03   ` [PATCH 08/10] Factorize condition insertion code out of build_check_stmt Dodji Seketeli
@ 2012-11-02 23:05   ` Dodji Seketeli
  2012-11-06 17:37     ` Diego Novillo
  2012-11-03  8:22   ` [PATCH 10/10] Import the asan runtime library into GCC tree Dodji Seketeli
                     ` (2 subsequent siblings)
  11 siblings, 1 reply; 80+ messages in thread
From: Dodji Seketeli @ 2012-11-02 23:05 UTC (permalink / raw)
  To: gcc-patches; +Cc: dnovillo, jakub, wmi, davidxl, konstantin.s.serebryany

This patch instruments many memory access patterns through builtins.

Basically, for a call like:

     __builtin_memset (from, 0, n_bytes);

the patch instruments only the accesses to the first and the last
byte of the memory region [from, from + n_bytes).  This is the
strategy used by the LLVM implementation of asan.
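
To make the endpoint strategy concrete, here is a minimal sketch in
plain C.  It is not code from the patch: the toy_* names and the
boolean array standing in for shadow memory are assumptions made for
illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ARENA_SIZE 32

/* Hypothetical stand-in for shadow memory: true means that the byte at
   that offset must not be accessed.  */
static bool toy_poisoned[ARENA_SIZE];

/* Check one byte of the toy arena.  */
static bool
toy_byte_is_bad (size_t off)
{
  assert (off < ARENA_SIZE);
  return toy_poisoned[off];
}

/* Check a region [off, off + len) by looking only at its first and
   last byte, and only when the length is not zero.  */
static bool
toy_region_is_bad (size_t off, size_t len)
{
  if (len == 0)
    return false;
  return toy_byte_is_bad (off) || toy_byte_is_bad (off + len - 1);
}
```

Note that in this sketch a poisoned byte lying strictly inside the
region goes unnoticed when both end bytes are fine; that is the
trade-off of checking only the two ends.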

This instrumentation is done for all the memory access builtin
functions that expose a well specified memory region -- one that
explicitly states the number of bytes accessed in the region.

A special treatment is used for __builtin_strlen.  The patch
instruments the access to the first byte of its argument, as well as
the access to the byte (of the argument) at the offset returned by
strlen.
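
As a rough illustration of the strlen case, the sketch below models
the two checks in plain C.  The function name and the VALID parameter
(the number of addressable bytes at the start of BUF) are hypothetical
and do not come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model: only the first VALID bytes of BUF count as
   addressable.  Return true if either of the two checks placed around
   the strlen call would fire.  */
static bool
toy_strlen_is_bad (const char *buf, size_t valid)
{
  assert (buf != NULL);

  /* Check of the first byte, done before the call.  */
  if (valid == 0)
    return true;

  /* The call itself.  */
  size_t n = strlen (buf);

  /* Check of buf[n], the terminating NUL, done after the call.  */
  return n >= valid;
}
```

The second check can only be emitted after the call because the
offset of the last byte is the value that strlen returns.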

For the __sync_* and __atomic* calls the patch instruments the access
to the bytes pointed to by the argument.

While doing this, I have added a new parameter to build_check_stmt to
decide whether to insert the instrumentation code before or after the
statement iterator.  This allows us to do away with the
gsi_{next,prev} dance we were doing in the callers of this function.

Tested by running cc1 -fasan on variations of simple programs like:

    int
    foo ()
    {
      char foo[10] = {0};

      foo[0] = 't';
      foo[1] = 'e';
      foo[2] = 's';
      foo[3] = 't';
      int l = __builtin_strlen (foo);
      int n = sizeof (foo);
      __builtin_memset (&foo[4], 0, n - 4);
      __sync_fetch_and_add (&foo[11], 1);

      return l;
    }

and by staring at the gimple output which for this function is:

    ;; Function foo (foo, funcdef_no=0, decl_uid=1714, cgraph_uid=0)

    foo ()
    {
      int n;
      int l;
      char foo[10];
      int D.1725;
      char * D.1724;
      int D.1723;
      long unsigned int D.1722;
      int D.1721;
      long unsigned int D.1720;
      long unsigned int _1;
      int _4;
      long unsigned int _5;
      int _6;
      char * _7;
      int _8;
      char * _9;
      unsigned long _10;
      unsigned long _11;
      unsigned long _12;
      signed char * _13;
      signed char _14;
      _Bool _15;
      unsigned long _16;
      signed char _17;
      _Bool _18;
      _Bool _19;
      char * _20;
      unsigned long _21;
      unsigned long _22;
      unsigned long _23;
      signed char * _24;
      signed char _25;
      _Bool _26;
      unsigned long _27;
      signed char _28;
      _Bool _29;
      _Bool _30;
      char * _31;
      unsigned long _32;
      unsigned long _33;
      unsigned long _34;
      signed char * _35;
      signed char _36;
      _Bool _37;
      unsigned long _38;
      signed char _39;
      _Bool _40;
      _Bool _41;
      char * _42;
      unsigned long _43;
      unsigned long _44;
      unsigned long _45;
      signed char * _46;
      signed char _47;
      _Bool _48;
      unsigned long _49;
      signed char _50;
      _Bool _51;
      _Bool _52;
      char * _53;
      unsigned long _54;
      unsigned long _55;
      unsigned long _56;
      signed char * _57;
      signed char _58;
      _Bool _59;
      unsigned long _60;
      signed char _61;
      _Bool _62;
      _Bool _63;
      char[10] * _64;
      unsigned long _65;
      unsigned long _66;
      unsigned long _67;
      signed char * _68;
      signed char _69;
      _Bool _70;
      unsigned long _71;
      signed char _72;
      _Bool _73;
      _Bool _74;
      unsigned long _75;
      unsigned long _76;
      unsigned long _77;
      signed char * _78;
      signed char _79;
      _Bool _80;
      unsigned long _81;
      signed char _82;
      _Bool _83;
      _Bool _84;
      long unsigned int _85;
      long unsigned int _86;
      char * _87;
      char * _88;
      unsigned long _89;
      unsigned long _90;
      unsigned long _91;
      signed char * _92;
      signed char _93;
      _Bool _94;
      unsigned long _95;
      signed char _96;
      _Bool _97;
      _Bool _98;
      char * _99;
      unsigned long _100;
      unsigned long _101;
      unsigned long _102;
      signed char * _103;
      signed char _104;
      _Bool _105;
      unsigned long _106;
      signed char _107;
      _Bool _108;
      _Bool _109;

      <bb 2>:
      foo = {};
      _9 = &foo[0];
      _10 = (unsigned long) _9;
      _11 = _10 >> 3;
      _12 = _11 + 17592186044416;
      _13 = (signed char *) _12;
      _14 = *_13;
      _15 = _14 != 0;
      _16 = _10 & 7;
      _17 = (signed char) _16;
      _18 = _17 >= _14;
      _19 = _15 & _18;
      if (_19 != 0)
	goto <bb 5>;
      else
	goto <bb 4>;

      <bb 5>:
      __asan_report_store1 (_10);

      <bb 4>:
      foo[0] = 116;
      _20 = &foo[1];
      _21 = (unsigned long) _20;
      _22 = _21 >> 3;
      _23 = _22 + 17592186044416;
      _24 = (signed char *) _23;
      _25 = *_24;
      _26 = _25 != 0;
      _27 = _21 & 7;
      _28 = (signed char) _27;
      _29 = _28 >= _25;
      _30 = _26 & _29;
      if (_30 != 0)
	goto <bb 7>;
      else
	goto <bb 6>;

      <bb 7>:
      __asan_report_store1 (_21);

      <bb 6>:
      foo[1] = 101;
      _31 = &foo[2];
      _32 = (unsigned long) _31;
      _33 = _32 >> 3;
      _34 = _33 + 17592186044416;
      _35 = (signed char *) _34;
      _36 = *_35;
      _37 = _36 != 0;
      _38 = _32 & 7;
      _39 = (signed char) _38;
      _40 = _39 >= _36;
      _41 = _37 & _40;
      if (_41 != 0)
	goto <bb 9>;
      else
	goto <bb 8>;

      <bb 9>:
      __asan_report_store1 (_32);

      <bb 8>:
      foo[2] = 115;
      _42 = &foo[3];
      _43 = (unsigned long) _42;
      _44 = _43 >> 3;
      _45 = _44 + 17592186044416;
      _46 = (signed char *) _45;
      _47 = *_46;
      _48 = _47 != 0;
      _49 = _43 & 7;
      _50 = (signed char) _49;
      _51 = _50 >= _47;
      _52 = _48 & _51;
      if (_52 != 0)
	goto <bb 11>;
      else
	goto <bb 10>;

      <bb 11>:
      __asan_report_store1 (_43);

      <bb 10>:
      foo[3] = 116;
      _53 = (char *) &foo;
      _54 = (unsigned long) _53;
      _55 = _54 >> 3;
      _56 = _55 + 17592186044416;
      _57 = (signed char *) _56;
      _58 = *_57;
      _59 = _58 != 0;
      _60 = _54 & 7;
      _61 = (signed char) _60;
      _62 = _61 >= _58;
      _63 = _59 & _62;
      if (_63 != 0)
	goto <bb 13>;
      else
	goto <bb 12>;

      <bb 13>:
      __asan_report_load1 (_54);

      <bb 12>:
      _1 = __builtin_strlen (&foo);
      _64 = _53 + _1;
      _65 = (unsigned long) _64;
      _66 = _65 >> 3;
      _67 = _66 + 17592186044416;
      _68 = (signed char *) _67;
      _69 = *_68;
      _70 = _69 != 0;
      _71 = _65 & 7;
      _72 = (signed char) _71;
      _73 = _72 >= _69;
      _74 = _70 & _73;
      if (_74 != 0)
	goto <bb 15>;
      else
	goto <bb 14>;

      <bb 15>:
      __asan_report_load1 (_65);

      <bb 14>:
      l_2 = (int) _1;
      n_3 = 10;
      _4 = n_3 + -4;
      _5 = (long unsigned int) _4;
      _6 = l_2 + 1;
      _7 = &foo[_6];
      if (_5 != 0)
	goto <bb 17>;
      else
	goto <bb 16>;

      <bb 17>:
      _75 = (unsigned long) _7;
      _76 = _75 >> 3;
      _77 = _76 + 17592186044416;
      _78 = (signed char *) _77;
      _79 = *_78;
      _80 = _79 != 0;
      _81 = _75 & 7;
      _82 = (signed char) _81;
      _83 = _82 >= _79;
      _84 = _80 & _83;
      _85 = _5;
      _86 = _85 - 1;
      _87 = _7;
      _88 = _87 + _86;
      _89 = (unsigned long) _88;
      _90 = _89 >> 3;
      _91 = _90 + 17592186044416;
      _92 = (signed char *) _91;
      _93 = *_92;
      _94 = _93 != 0;
      _95 = _89 & 7;
      _96 = (signed char) _95;
      _97 = _96 >= _93;
      _98 = _94 & _97;
      if (_98 != 0)
	goto <bb 21>;
      else
	goto <bb 20>;

      <bb 21>:
      __asan_report_store1 (_89);

      <bb 20>:
      if (_84 != 0)
	goto <bb 19>;
      else
	goto <bb 18>;

      <bb 19>:
      __asan_report_store1 (_75);

      <bb 18>:

      <bb 16>:
      __builtin_memset (_7, 0, _5);
      _99 = &foo[11];
      _100 = (unsigned long) _99;
      _101 = _100 >> 3;
      _102 = _101 + 17592186044416;
      _103 = (signed char *) _102;
      _104 = *_103;
      _105 = _104 != 0;
      _106 = _100 & 7;
      _107 = (signed char) _106;
      _108 = _107 >= _104;
      _109 = _105 & _108;
      if (_109 != 0)
	goto <bb 23>;
      else
	goto <bb 22>;

      <bb 23>:
      __asan_report_store1 (_100);

      <bb 22>:
      __sync_fetch_and_add_1 (&foo[11], 1);
      _8 = l_2;
      foo ={v} {CLOBBER};

    <L1>:
      return _8;

    }

    ;; Function _GLOBAL__sub_I_00099_0_foo (_GLOBAL__sub_I_00099_0_foo, funcdef_no=1, decl_uid=1752, cgraph_uid=4)

    _GLOBAL__sub_I_00099_0_foo ()
    {
      <bb 2>:
      __asan_init ();
      return;

    }
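
For readers following the dump, the check that it repeats for every
one-byte access can be written as plain C.  This is only a sketch: the
names are hypothetical, the shift and the constant are read off the
dump above, and the shadow byte is passed in as a parameter instead of
being loaded from real shadow memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SHADOW_SHIFT 3
/* 17592186044416 in the dump, i.e. 1 << 44.  */
#define SHADOW_OFFSET UINT64_C (17592186044416)

/* _11 = _10 >> 3;  _12 = _11 + 17592186044416;  */
static uint64_t
shadow_address (uint64_t addr)
{
  return (addr >> SHADOW_SHIFT) + SHADOW_OFFSET;
}

/* Decide whether a one-byte access at ADDR is reported, given the
   value SHADOW_VALUE of its shadow byte (_14 in the dump).  */
static bool
one_byte_access_is_reported (uint64_t addr, int8_t shadow_value)
{
  /* In this sketch a shadow byte is 0, 1 to 7, or negative.  */
  assert (shadow_value <= 7);

  bool shadow_nonzero = shadow_value != 0;	/* _15 = _14 != 0;  */
  int8_t last_byte = (int8_t) (addr & 7);	/* _16 and _17  */
  bool past_valid = last_byte >= shadow_value;	/* _18 = _17 >= _14;  */
  return shadow_nonzero & past_valid;		/* _19 = _15 & _18;  */
}
```

A zero shadow byte never reports, a negative one always does, and a
value k between 1 and 7 reports when the accessed byte is at or past
offset k within its 8-byte group.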

gcc/
	* asan.c (insert_if_then_before_iter, instrument_mem_region_access)
	(instrument_strlen_call, maybe_instrument_builtin_call)
	(maybe_instrument_call): New static functions.
	(create_cond_insert_point): Renamed
	create_cond_insert_point_before_iter into this.  Add a new
	parameter to decide whether to insert the condition before or
	after the statement iterator.
	(build_check_stmt): Adjust for the new create_cond_insert_point.
	Add a new parameter to decide whether to add the instrumentation
	code before or after the statement iterator.
	(instrument_assignment): Factorize from ...
	(transform_statements): ... here.  Use maybe_instrument_call to
	instrument builtin function calls as well.
	(instrument_derefs): Adjust for the new parameter of
	build_check_stmt.  Fix detection of bit-field access.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192845 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.asan |  18 ++
 gcc/asan.c         | 612 ++++++++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 601 insertions(+), 29 deletions(-)

diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan
index 903dc52..0a97b42 100644
--- a/gcc/ChangeLog.asan
+++ b/gcc/ChangeLog.asan
@@ -1,5 +1,23 @@
 2012-10-26  Dodji Seketeli  <dodji@redhat.com>
 
+	* asan.c (insert_if_then_before_iter, instrument_mem_region_access)
+	(instrument_strlen_call, maybe_instrument_builtin_call)
+	(maybe_instrument_call): New static functions.
+	(create_cond_insert_point): Renamed
+	create_cond_insert_point_before_iter into this.  Add a new
+	parameter to decide whether to insert the condition before or
+	after the statement iterator.
+	(build_check_stmt): Adjust for the new create_cond_insert_point.
+	Add a new parameter to decide whether to add the instrumentation
+	code before or after the statement iterator.
+	(instrument_assignment): Factorize from ...
+	(transform_statements): ... here.  Use maybe_instrument_call to
+	instrument builtin function calls as well.
+	(instrument_derefs): Adjust for the new parameter of
+	build_check_stmt.  Fix detection of bit-field access.
+
+2012-10-26  Dodji Seketeli  <dodji@redhat.com>
+
 	* asan.c (create_cond_insert_point_before_iter): Factorize out of ...
 	(build_check_stmt): ... here.
 
diff --git a/gcc/asan.c b/gcc/asan.c
index cc107f8..7a95cc9 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -384,9 +384,9 @@ asan_init_func (void)
 #define PROB_ALWAYS		(REG_BR_PROB_BASE)
 
 /* Split the current basic block and create a condition statement
-   insertion point right before the statement pointed to by ITER.
-   Return an iterator to the point at which the caller might safely
-   insert the condition statement.
+   insertion point right before or after the statement pointed to by
+   ITER.  Return an iterator to the point at which the caller might
+   safely insert the condition statement.
 
    THEN_BLOCK must be set to the address of an uninitialized instance
    of basic_block.  The function will then set *THEN_BLOCK to the
@@ -400,18 +400,21 @@ asan_init_func (void)
    statements starting from *ITER, and *THEN_BLOCK is a new empty
    block.
 
-   *ITER is adjusted to still point to the same statement it was
-   *pointing to initially.  */
+   *ITER is adjusted to always point to the first statement
+    of the basic block *FALLTHROUGH_BLOCK.  That statement is the
+    same as what ITER was pointing to prior to calling this function,
+    if BEFORE_P is true; otherwise, it is its following statement.  */
 
 static gimple_stmt_iterator
-create_cond_insert_point_before_iter (gimple_stmt_iterator *iter,
-				      bool then_more_likely_p,
-				      basic_block *then_block,
-				      basic_block *fallthrough_block)
+create_cond_insert_point (gimple_stmt_iterator *iter,
+			  bool before_p,
+			  bool then_more_likely_p,
+			  basic_block *then_block,
+			  basic_block *fallthrough_block)
 {
   gimple_stmt_iterator gsi = *iter;
 
-  if (!gsi_end_p (gsi))
+  if (!gsi_end_p (gsi) && before_p)
     gsi_prev (&gsi);
 
   basic_block cur_bb = gsi_bb (*iter);
@@ -452,18 +455,58 @@ create_cond_insert_point_before_iter (gimple_stmt_iterator *iter,
   return gsi_last_bb (cond_bb);
 }
 
+/* Insert an if condition followed by a 'then block' right before the
+   statement pointed to by ITER.  The fallthrough block -- which is the
+   else block of the condition as well as the destination of the
+   outgoing edge of the 'then block' -- starts with the statement
+   pointed to by ITER.
+
+   COND is the condition of the if.  
+
+   If THEN_MORE_LIKELY_P is true, the probability of the edge to the
+   'then block' is higher than the probability of the edge to the
+   fallthrough block.
+
+   Upon completion of the function, *THEN_BB is set to the newly
+   inserted 'then block' and similarly, *FALLTHROUGH_BB is set to the
+   fallthrough block.
+
+   *ITER is adjusted to still point to the same statement it was
+   pointing to initially.  */
+
+static void
+insert_if_then_before_iter (gimple cond,
+			    gimple_stmt_iterator *iter,
+			    bool then_more_likely_p,
+			    basic_block *then_bb,
+			    basic_block *fallthrough_bb)
+{
+  gimple_stmt_iterator cond_insert_point =
+    create_cond_insert_point (iter,
+			      /*before_p=*/true,
+			      then_more_likely_p,
+			      then_bb,
+			      fallthrough_bb);
+  gsi_insert_after (&cond_insert_point, cond, GSI_NEW_STMT);
+}
+
 /* Instrument the memory access instruction BASE.  Insert new
-   statements before ITER.
+   statements before or after ITER.
 
    Note that the memory access represented by BASE can be either an
    SSA_NAME, or a non-SSA expression.  LOCATION is the source code
    location.  IS_STORE is TRUE for a store, FALSE for a load.
-   SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
+   BEFORE_P is TRUE for inserting the instrumentation code before
+   ITER, FALSE for inserting it after ITER.  SIZE_IN_BYTES is one of
+   1, 2, 4, 8, 16.
+
+   If BEFORE_P is TRUE, *ITER is arranged to still point to the
+   statement it was pointing to prior to calling this function,
+   otherwise, it points to the statement logically following it.  */
 
 static void
-build_check_stmt (tree base, gimple_stmt_iterator *iter,
-                  location_t location, bool is_store,
-		  int size_in_bytes)
+build_check_stmt (location_t location, tree base, gimple_stmt_iterator *iter,
+		  bool before_p, bool is_store, int size_in_bytes)
 {
   gimple_stmt_iterator gsi;
   basic_block then_bb, else_bb;
@@ -477,10 +520,10 @@ build_check_stmt (tree base, gimple_stmt_iterator *iter,
 
   /* Get an iterator on the point where we can add the condition
      statement for the instrumentation.  */
-  gsi = create_cond_insert_point_before_iter (iter,
-					      /*then_more_likely_p=*/false,
-					      &then_bb,
-					      &else_bb);
+  gsi = create_cond_insert_point (iter, before_p,
+				  /*then_more_likely_p=*/false,
+				  &then_bb,
+				  &else_bb);
 
   base = unshare_expr (base);
 
@@ -612,7 +655,7 @@ build_check_stmt (tree base, gimple_stmt_iterator *iter,
 
 /* If T represents a memory access, add instrumentation code before ITER.
    LOCATION is source code location.
-   IS_STORE is either 1 (for a store) or 0 (for a load).  */
+   IS_STORE is either TRUE (for a store) or FALSE (for a load).  */
 
 static void
 instrument_derefs (gimple_stmt_iterator *iter, tree t,
@@ -647,11 +690,523 @@ instrument_derefs (gimple_stmt_iterator *iter, tree t,
   int volatilep = 0, unsignedp = 0;
   get_inner_reference (t, &bitsize, &bitpos, &offset,
 		       &mode, &unsignedp, &volatilep, false);
-  if (bitpos != 0 || bitsize != size_in_bytes * BITS_PER_UNIT)
+  if (bitpos % (size_in_bytes * BITS_PER_UNIT)
+      || bitsize != size_in_bytes * BITS_PER_UNIT)
     return;
 
   base = build_fold_addr_expr (t);
-  build_check_stmt (base, iter, location, is_store, size_in_bytes);
+  build_check_stmt (location, base, iter, /*before_p=*/true,
+		    is_store, size_in_bytes);
+}
+
+/* Instrument an access to a contiguous memory region that starts at
+   the address pointed to by BASE, over a length of LEN (expressed in
+   units of sizeof (*BASE) bytes).  ITER points to the instruction before
+   which the instrumentation instructions must be inserted.  LOCATION
+   is the source location that the instrumentation instructions must
+   have.  If IS_STORE is true, then the memory access is a store;
+   otherwise, it's a load.  */
+
+static void
+instrument_mem_region_access (tree base, tree len,
+			      gimple_stmt_iterator *iter,
+			      location_t location, bool is_store)
+{
+  if (integer_zerop (len))
+    return;
+
+  gimple_stmt_iterator gsi = *iter;
+
+  basic_block fallthrough_bb = NULL, then_bb = NULL;
+  if (!is_gimple_constant (len))
+    {
+      /* So, the length of the memory area to asan-protect is
+	 non-constant.  Let's guard the generated instrumentation code
+	 like:
+
+	 if (len != 0)
+	   {
+	     //asan instrumentation code goes here.
+           }
+	   // fallthrough instructions, starting with *ITER.  */
+
+      gimple g = gimple_build_cond (NE_EXPR,
+				    len,
+				    build_int_cst (TREE_TYPE (len), 0),
+				    NULL_TREE, NULL_TREE);
+      gimple_set_location (g, location);
+      insert_if_then_before_iter (g, iter, /*then_more_likely_p=*/true,
+				  &then_bb, &fallthrough_bb);
+      /* Note that fallthrough_bb starts with the statement that was
+	 pointed to by ITER.  */
+
+      /* The 'then block' of the 'if (len != 0)' condition is where
+	 we'll generate the asan instrumentation code now.  */
+      gsi = gsi_start_bb (then_bb);
+    }
+
+  /* Instrument the beginning of the memory region to be accessed,
+     and arrange for the rest of the instrumentation code to be
+     inserted in the then block *after* the current gsi.  */
+  build_check_stmt (location, base, &gsi, /*before_p=*/true, is_store, 1);
+
+  if (then_bb)
+    /* We are in the case where the length of the region is not
+       constant; so instrumentation code is being generated in the
+       'then block' of the 'if (len != 0)' condition.  Let's arrange
+       for the subsequent instrumentation statements to go in the
+       'then block'.  */
+    gsi = gsi_last_bb (then_bb);
+  else
+    *iter = gsi;
+
+  /* We want to instrument the access at the end of the memory region,
+     which is at (base + len - 1).  */
+
+  /* offset = len - 1;  */
+  len = unshare_expr (len);
+  gimple offset =
+    gimple_build_assign_with_ops (TREE_CODE (len),
+				  make_ssa_name (TREE_TYPE (len), NULL),
+				  len, NULL);
+  gimple_set_location (offset, location);
+  gsi_insert_before (&gsi, offset, GSI_NEW_STMT);
+
+  offset =
+    gimple_build_assign_with_ops (MINUS_EXPR,
+				  make_ssa_name (size_type_node, NULL),
+				  gimple_assign_lhs (offset),
+				  build_int_cst (size_type_node, 1));
+  gimple_set_location (offset, location);
+  gsi_insert_after (&gsi, offset, GSI_NEW_STMT);
+
+  /* _1 = base;  */
+  base = unshare_expr (base);
+  gimple region_end =
+    gimple_build_assign_with_ops (TREE_CODE (base),
+				  make_ssa_name (TREE_TYPE (base), NULL),
+				  base, NULL);
+  gimple_set_location (region_end, location);
+  gsi_insert_after (&gsi, region_end, GSI_NEW_STMT);
+
+  /* _2 = _1 + offset;  */
+  region_end =
+    gimple_build_assign_with_ops (POINTER_PLUS_EXPR,
+				  make_ssa_name (TREE_TYPE (base), NULL),
+				  gimple_assign_lhs (region_end), 
+				  gimple_assign_lhs (offset));
+  gimple_set_location (region_end, location);
+  gsi_insert_after (&gsi, region_end, GSI_NEW_STMT);
+
+  /* instrument access at _2;  */
+  build_check_stmt (location, gimple_assign_lhs (region_end),
+		    &gsi, /*before_p=*/false, is_store, 1);
+}
+
+/* Instrument the strlen builtin call pointed to by ITER.
+
+   This function instruments the access to the first byte of the
+   argument, right before the call.  After the call it instruments the
+   access to the last byte of the argument; it uses the result of the
+   call to deduce the offset of that last byte.  */
+
+static void
+instrument_strlen_call (gimple_stmt_iterator *iter)
+{
+  gimple call = gsi_stmt (*iter);
+  gcc_assert (is_gimple_call (call));
+
+  tree callee = gimple_call_fndecl (call);
+  gcc_assert (is_builtin_fn (callee)
+	      && DECL_BUILT_IN_CLASS (callee) == BUILT_IN_NORMAL
+	      && DECL_FUNCTION_CODE (callee) == BUILT_IN_STRLEN);
+
+  tree len = gimple_call_lhs (call);
+  if (len == NULL)
+    /* Some passes might clear the return value of the strlen call;
+       bail out in that case.  */
+    return;
+  gcc_assert (INTEGRAL_TYPE_P (TREE_TYPE (len)));
+
+  location_t loc = gimple_location (call);
+  tree str_arg = gimple_call_arg (call, 0);
+
+  /* Instrument the access to the first byte of str_arg.  i.e:
+
+     _1 = str_arg; instrument (_1); */
+  gimple str_arg_ssa =
+    gimple_build_assign_with_ops (NOP_EXPR,
+				  make_ssa_name (build_pointer_type
+						 (char_type_node), NULL),
+				  str_arg, NULL);
+  gimple_set_location (str_arg_ssa, loc);
+  gimple_stmt_iterator gsi = *iter;
+  gsi_insert_before (&gsi, str_arg_ssa, GSI_NEW_STMT);
+  build_check_stmt (loc, gimple_assign_lhs (str_arg_ssa), &gsi,
+		    /*before_p=*/false, /*is_store=*/false, 1);
+
+  /* If we initially had an instruction like:
+
+	 int n = strlen (str)
+
+     we now want to instrument the access to str[n], after the
+     instruction above.  */
+
+  /* So let's build the access to str[n] that is, access through the
+     pointer_plus expr: (_1 + len).  */
+  gimple stmt =
+    gimple_build_assign_with_ops (POINTER_PLUS_EXPR,
+				  make_ssa_name (TREE_TYPE (str_arg),
+						 NULL),
+				  gimple_assign_lhs (str_arg_ssa),
+				  len);
+  gimple_set_location (stmt, loc);
+  gsi_insert_after (&gsi, stmt, GSI_NEW_STMT);
+
+  build_check_stmt (loc, gimple_assign_lhs (stmt), &gsi,
+		    /*before_p=*/false, /*is_store=*/false, 1);
+
+  /* Ensure that iter points to the statement logically following the
+     one it was initially pointing to.  */
+  *iter = gsi;
+}
+
+/* If the statement pointed to by the iterator ITER is a call to a
+   builtin memory access function, instrument it and return true.
+   Otherwise, return false.  */
+
+static bool
+maybe_instrument_builtin_call (gimple_stmt_iterator *iter)
+{
+  gimple call = gsi_stmt (*iter);
+  location_t loc = gimple_location (call);
+
+  if (!is_gimple_call (call))
+    return false;
+
+  tree callee = gimple_call_fndecl (call);
+
+  if (!is_builtin_fn (callee)
+      || DECL_BUILT_IN_CLASS (callee) != BUILT_IN_NORMAL)
+    return false;
+
+  tree source0 = NULL_TREE, source1 = NULL_TREE,
+    dest = NULL_TREE, len = NULL_TREE;
+  bool is_store = true;
+
+  switch (DECL_FUNCTION_CODE (callee))
+    {
+      /* (s, s, n) style memops.  */
+    case BUILT_IN_BCMP:
+    case BUILT_IN_MEMCMP:
+      len = gimple_call_arg (call, 2);
+      source0 = gimple_call_arg (call, 0);
+      source1 = gimple_call_arg (call, 1);
+      break;
+
+      /* (src, dest, n) style memops.  */
+    case BUILT_IN_BCOPY:
+      len = gimple_call_arg (call, 2);
+      source0 = gimple_call_arg (call, 0);
+      dest = gimple_call_arg (call, 1);
+      break;
+
+      /* (dest, src, n) style memops.  */
+    case BUILT_IN_MEMCPY:
+    case BUILT_IN_MEMCPY_CHK:
+    case BUILT_IN_MEMMOVE:
+    case BUILT_IN_MEMMOVE_CHK:
+    case BUILT_IN_MEMPCPY:
+    case BUILT_IN_MEMPCPY_CHK:
+      dest = gimple_call_arg (call, 0);
+      source0 = gimple_call_arg (call, 1);
+      len = gimple_call_arg (call, 2);
+      break;
+
+      /* (dest, n) style memops.  */
+    case BUILT_IN_BZERO:
+      dest = gimple_call_arg (call, 0);
+      len = gimple_call_arg (call, 1);
+      break;
+
+      /* (dest, x, n) style memops.  */
+    case BUILT_IN_MEMSET:
+    case BUILT_IN_MEMSET_CHK:
+      dest = gimple_call_arg (call, 0);
+      len = gimple_call_arg (call, 2);
+      break;
+
+    case BUILT_IN_STRLEN:
+      instrument_strlen_call (iter);
+      return true;
+
+    /* And now the __atomic* and __sync builtins.
+       These are handled differently from the classical memory
+       access builtins above.  */
+
+    case BUILT_IN_ATOMIC_LOAD:
+    case BUILT_IN_ATOMIC_LOAD_1:
+    case BUILT_IN_ATOMIC_LOAD_2:
+    case BUILT_IN_ATOMIC_LOAD_4:
+    case BUILT_IN_ATOMIC_LOAD_8:
+    case BUILT_IN_ATOMIC_LOAD_16:
+      is_store = false;
+      /* fall through.  */
+
+    case BUILT_IN_SYNC_FETCH_AND_ADD_1:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_2:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_4:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_8:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_SUB_1:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_2:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_4:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_8:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_OR_1:
+    case BUILT_IN_SYNC_FETCH_AND_OR_2:
+    case BUILT_IN_SYNC_FETCH_AND_OR_4:
+    case BUILT_IN_SYNC_FETCH_AND_OR_8:
+    case BUILT_IN_SYNC_FETCH_AND_OR_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_AND_1:
+    case BUILT_IN_SYNC_FETCH_AND_AND_2:
+    case BUILT_IN_SYNC_FETCH_AND_AND_4:
+    case BUILT_IN_SYNC_FETCH_AND_AND_8:
+    case BUILT_IN_SYNC_FETCH_AND_AND_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_XOR_1:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_2:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_4:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_8:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_NAND_1:
+    case BUILT_IN_SYNC_FETCH_AND_NAND_2:
+    case BUILT_IN_SYNC_FETCH_AND_NAND_4:
+    case BUILT_IN_SYNC_FETCH_AND_NAND_8:
+
+    case BUILT_IN_SYNC_ADD_AND_FETCH_1:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_2:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_4:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_8:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_SUB_AND_FETCH_1:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_2:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_4:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_8:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_OR_AND_FETCH_1:
+    case BUILT_IN_SYNC_OR_AND_FETCH_2:
+    case BUILT_IN_SYNC_OR_AND_FETCH_4:
+    case BUILT_IN_SYNC_OR_AND_FETCH_8:
+    case BUILT_IN_SYNC_OR_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_AND_AND_FETCH_1:
+    case BUILT_IN_SYNC_AND_AND_FETCH_2:
+    case BUILT_IN_SYNC_AND_AND_FETCH_4:
+    case BUILT_IN_SYNC_AND_AND_FETCH_8:
+    case BUILT_IN_SYNC_AND_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_XOR_AND_FETCH_1:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_2:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_4:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_8:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_NAND_AND_FETCH_1:
+    case BUILT_IN_SYNC_NAND_AND_FETCH_2:
+    case BUILT_IN_SYNC_NAND_AND_FETCH_4:
+    case BUILT_IN_SYNC_NAND_AND_FETCH_8:
+
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_1:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_2:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_4:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_8:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_16:
+
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_1:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_2:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_4:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_8:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_16:
+
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_1:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_2:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_4:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_8:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_16:
+
+    case BUILT_IN_SYNC_LOCK_RELEASE_1:
+    case BUILT_IN_SYNC_LOCK_RELEASE_2:
+    case BUILT_IN_SYNC_LOCK_RELEASE_4:
+    case BUILT_IN_SYNC_LOCK_RELEASE_8:
+    case BUILT_IN_SYNC_LOCK_RELEASE_16:
+
+    case BUILT_IN_ATOMIC_TEST_AND_SET:
+    case BUILT_IN_ATOMIC_CLEAR:
+    case BUILT_IN_ATOMIC_EXCHANGE:
+    case BUILT_IN_ATOMIC_EXCHANGE_1:
+    case BUILT_IN_ATOMIC_EXCHANGE_2:
+    case BUILT_IN_ATOMIC_EXCHANGE_4:
+    case BUILT_IN_ATOMIC_EXCHANGE_8:
+    case BUILT_IN_ATOMIC_EXCHANGE_16:
+
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_1:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_2:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_4:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_8:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_16:
+
+    case BUILT_IN_ATOMIC_STORE:
+    case BUILT_IN_ATOMIC_STORE_1:
+    case BUILT_IN_ATOMIC_STORE_2:
+    case BUILT_IN_ATOMIC_STORE_4:
+    case BUILT_IN_ATOMIC_STORE_8:
+    case BUILT_IN_ATOMIC_STORE_16:
+
+    case BUILT_IN_ATOMIC_ADD_FETCH_1:
+    case BUILT_IN_ATOMIC_ADD_FETCH_2:
+    case BUILT_IN_ATOMIC_ADD_FETCH_4:
+    case BUILT_IN_ATOMIC_ADD_FETCH_8:
+    case BUILT_IN_ATOMIC_ADD_FETCH_16:
+
+    case BUILT_IN_ATOMIC_SUB_FETCH_1:
+    case BUILT_IN_ATOMIC_SUB_FETCH_2:
+    case BUILT_IN_ATOMIC_SUB_FETCH_4:
+    case BUILT_IN_ATOMIC_SUB_FETCH_8:
+    case BUILT_IN_ATOMIC_SUB_FETCH_16:
+
+    case BUILT_IN_ATOMIC_AND_FETCH_1:
+    case BUILT_IN_ATOMIC_AND_FETCH_2:
+    case BUILT_IN_ATOMIC_AND_FETCH_4:
+    case BUILT_IN_ATOMIC_AND_FETCH_8:
+    case BUILT_IN_ATOMIC_AND_FETCH_16:
+
+    case BUILT_IN_ATOMIC_NAND_FETCH_1:
+    case BUILT_IN_ATOMIC_NAND_FETCH_2:
+    case BUILT_IN_ATOMIC_NAND_FETCH_4:
+    case BUILT_IN_ATOMIC_NAND_FETCH_8:
+    case BUILT_IN_ATOMIC_NAND_FETCH_16:
+
+    case BUILT_IN_ATOMIC_XOR_FETCH_1:
+    case BUILT_IN_ATOMIC_XOR_FETCH_2:
+    case BUILT_IN_ATOMIC_XOR_FETCH_4:
+    case BUILT_IN_ATOMIC_XOR_FETCH_8:
+    case BUILT_IN_ATOMIC_XOR_FETCH_16:
+
+    case BUILT_IN_ATOMIC_OR_FETCH_1:
+    case BUILT_IN_ATOMIC_OR_FETCH_2:
+    case BUILT_IN_ATOMIC_OR_FETCH_4:
+    case BUILT_IN_ATOMIC_OR_FETCH_8:
+    case BUILT_IN_ATOMIC_OR_FETCH_16:
+
+    case BUILT_IN_ATOMIC_FETCH_ADD_1:
+    case BUILT_IN_ATOMIC_FETCH_ADD_2:
+    case BUILT_IN_ATOMIC_FETCH_ADD_4:
+    case BUILT_IN_ATOMIC_FETCH_ADD_8:
+    case BUILT_IN_ATOMIC_FETCH_ADD_16:
+
+    case BUILT_IN_ATOMIC_FETCH_SUB_1:
+    case BUILT_IN_ATOMIC_FETCH_SUB_2:
+    case BUILT_IN_ATOMIC_FETCH_SUB_4:
+    case BUILT_IN_ATOMIC_FETCH_SUB_8:
+    case BUILT_IN_ATOMIC_FETCH_SUB_16:
+
+    case BUILT_IN_ATOMIC_FETCH_AND_1:
+    case BUILT_IN_ATOMIC_FETCH_AND_2:
+    case BUILT_IN_ATOMIC_FETCH_AND_4:
+    case BUILT_IN_ATOMIC_FETCH_AND_8:
+    case BUILT_IN_ATOMIC_FETCH_AND_16:
+
+    case BUILT_IN_ATOMIC_FETCH_NAND_1:
+    case BUILT_IN_ATOMIC_FETCH_NAND_2:
+    case BUILT_IN_ATOMIC_FETCH_NAND_4:
+    case BUILT_IN_ATOMIC_FETCH_NAND_8:
+    case BUILT_IN_ATOMIC_FETCH_NAND_16:
+
+    case BUILT_IN_ATOMIC_FETCH_XOR_1:
+    case BUILT_IN_ATOMIC_FETCH_XOR_2:
+    case BUILT_IN_ATOMIC_FETCH_XOR_4:
+    case BUILT_IN_ATOMIC_FETCH_XOR_8:
+    case BUILT_IN_ATOMIC_FETCH_XOR_16:
+
+    case BUILT_IN_ATOMIC_FETCH_OR_1:
+    case BUILT_IN_ATOMIC_FETCH_OR_2:
+    case BUILT_IN_ATOMIC_FETCH_OR_4:
+    case BUILT_IN_ATOMIC_FETCH_OR_8:
+    case BUILT_IN_ATOMIC_FETCH_OR_16:
+      {
+	dest = gimple_call_arg (call, 0);
+	/* So DEST represents the address of a memory location.
+	   instrument_derefs wants the memory location, so let's
+	   dereference the address DEST before handing it to
+	   instrument_derefs.  */
+	if (TREE_CODE (dest) == ADDR_EXPR)
+	  dest = TREE_OPERAND (dest, 0);
+	else if (TREE_CODE (dest) == SSA_NAME)
+	  dest = build2 (MEM_REF, TREE_TYPE (TREE_TYPE (dest)),
+			 dest, build_int_cst (TREE_TYPE (dest), 0));
+	else
+	  gcc_unreachable ();
+
+	instrument_derefs (iter, dest, loc, is_store);
+	return true;
+      }
+
+    default:
+      /* The other memory access builtins are not instrumented in this
+	 function because they either don't have any length parameter,
+	 or their length parameter is just a limit.  */
+      break;
+    }
+
+  if (len != NULL_TREE)
+    {
+      if (source0 != NULL_TREE)
+	instrument_mem_region_access (source0, len, iter,
+				      loc, /*is_store=*/false);
+      if (source1 != NULL_TREE)
+	instrument_mem_region_access (source1, len, iter,
+				      loc, /*is_store=*/false);
+      else if (dest != NULL_TREE)
+	instrument_mem_region_access (dest, len, iter,
+				      loc, /*is_store=*/true);
+      return true;
+    }
+  return false;
+}
+
+/*  Instrument the assignment statement ITER if it is subject to
+    instrumentation.  */
+
+static void
+instrument_assignment (gimple_stmt_iterator *iter)
+{
+  gimple s = gsi_stmt (*iter);
+
+  gcc_assert (gimple_assign_single_p (s));
+
+  instrument_derefs (iter, gimple_assign_lhs (s),
+		     gimple_location (s), true);
+  instrument_derefs (iter, gimple_assign_rhs1 (s),
+		     gimple_location (s), false);
+}
+
+/* Instrument the function call pointed to by the iterator ITER, if it
+   is subject to instrumentation.  At the moment, the only function
+   calls that are instrumented are some built-in functions that access
+   memory.  Look at maybe_instrument_builtin_call to learn more.  */
+
+static void
+maybe_instrument_call (gimple_stmt_iterator *iter)
+{
+  maybe_instrument_builtin_call (iter);
 }
 
 /* asan: this looks too complex. Can this be done simpler? */
@@ -672,13 +1227,12 @@ transform_statements (void)
       if (bb->index >= saved_last_basic_block) continue;
       for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
         {
-          gimple s = gsi_stmt (i);
-          if (!gimple_assign_single_p (s))
-	    continue;
-          instrument_derefs (&i, gimple_assign_lhs (s),
-                             gimple_location (s), true);
-          instrument_derefs (&i, gimple_assign_rhs1 (s),
-                             gimple_location (s), false);
+	  gimple s = gsi_stmt (i);
+
+	  if (gimple_assign_single_p (s))
+	    instrument_assignment (&i);
+	  else if (is_gimple_call (s))
+	    maybe_instrument_call (&i);
         }
     }
 }
-- 
		Dodji

* [PATCH 10/10] Import the asan runtime library into GCC tree
  2012-11-02 22:53 ` [PATCH 00/13] Request to merge Address Sanitizer in Dodji Seketeli
                     ` (8 preceding siblings ...)
  2012-11-02 23:05   ` [PATCH 09/10] Instrument built-in memory access function calls Dodji Seketeli
@ 2012-11-03  8:22   ` Dodji Seketeli
       [not found]   ` <87fw4r7g8w.fsf_-_@redhat.com>
  2012-11-12 16:07   ` [PATCH 00/13] Request to merge Address Sanitizer in Dodji Seketeli
  11 siblings, 0 replies; 80+ messages in thread
From: Dodji Seketeli @ 2012-11-03  8:22 UTC (permalink / raw)
  To: gcc-patches; +Cc: dnovillo, jakub, wmi, davidxl, konstantin.s.serebryany

[After multiple failed attempts at compressing this huge patch enough to
 let it pass through the drastic mailing engine's 400KB size limit, I am
 sending a link to the patch (at the end of this message) to let people
 download it instead.  Sorry for spamming the people in the CC list.]

This patch imports the runtime library in the GCC tree, ensures that
-lasan is passed to the linker when -faddress-sanitizer is used and
sets up the build system accordingly.

	* configure.ac: Add libsanitizer to target_libraries.
	* Makefile.def: Ditto.
	* configure: Regenerate.
	* Makefile.in: Regenerate.
	* libsanitizer: New directory for asan runtime.  Contains an empty
	tsan directory.

gcc:
	* gcc.c (LINK_COMMAND_SPEC): Add -lasan to link command
	if -faddress-sanitizer is on.

libsanitizer:

	Initial checkin: migrate asan runtime from llvm.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192977 138bc75d-0d04-0410-961f-82ee72b054a4
---
 ChangeLog.asan                                     |    16 +
 Makefile.def                                       |     2 +
 Makefile.in                                        |   487 +-
 configure                                          |     1 +
 configure.ac                                       |     1 +
 gcc/ChangeLog.asan                                 |     5 +
 gcc/gcc.c                                          |     1 +
 libsanitizer/ChangeLog.asan                        |     3 +
 libsanitizer/LICENSE.TXT                           |    97 +
 libsanitizer/Makefile.am                           |    46 +
 libsanitizer/Makefile.in                           |   773 +
 libsanitizer/README.gcc                            |     4 +
 libsanitizer/aclocal.m4                            |  9599 ++++++++++
 libsanitizer/asan/Makefile.am                      |    76 +
 libsanitizer/asan/Makefile.in                      |   631 +
 libsanitizer/asan/asan_allocator.cc                |  1045 ++
 libsanitizer/asan/asan_allocator.h                 |   177 +
 libsanitizer/asan/asan_flags.h                     |   103 +
 libsanitizer/asan/asan_globals.cc                  |   206 +
 libsanitizer/asan/asan_intercepted_functions.h     |   217 +
 libsanitizer/asan/asan_interceptors.cc             |   704 +
 libsanitizer/asan/asan_interceptors.h              |    39 +
 libsanitizer/asan/asan_internal.h                  |   169 +
 libsanitizer/asan/asan_linux.cc                    |   150 +
 libsanitizer/asan/asan_lock.h                      |    40 +
 libsanitizer/asan/asan_mac.cc                      |   526 +
 libsanitizer/asan/asan_mac.h                       |    54 +
 libsanitizer/asan/asan_malloc_linux.cc             |   142 +
 libsanitizer/asan/asan_malloc_mac.cc               |   427 +
 libsanitizer/asan/asan_malloc_win.cc               |   140 +
 libsanitizer/asan/asan_mapping.h                   |   120 +
 libsanitizer/asan/asan_new_delete.cc               |    54 +
 libsanitizer/asan/asan_poisoning.cc                |   151 +
 libsanitizer/asan/asan_posix.cc                    |   118 +
 libsanitizer/asan/asan_report.cc                   |   492 +
 libsanitizer/asan/asan_report.h                    |    51 +
 libsanitizer/asan/asan_rtl.cc                      |   404 +
 libsanitizer/asan/asan_stack.cc                    |    35 +
 libsanitizer/asan/asan_stack.h                     |    52 +
 libsanitizer/asan/asan_stats.cc                    |    86 +
 libsanitizer/asan/asan_stats.h                     |    65 +
 libsanitizer/asan/asan_thread.cc                   |   153 +
 libsanitizer/asan/asan_thread.h                    |   103 +
 libsanitizer/asan/asan_thread_registry.cc          |   188 +
 libsanitizer/asan/asan_thread_registry.h           |    83 +
 libsanitizer/asan/asan_win.cc                      |   190 +
 libsanitizer/asan/libtool-version                  |     6 +
 libsanitizer/config.guess                          |  1530 ++
 libsanitizer/config.sub                            |  1773 ++
 libsanitizer/configure                             | 17589 +++++++++++++++++++
 libsanitizer/configure.ac                          |    42 +
 libsanitizer/depcomp                               |   630 +
 libsanitizer/include/sanitizer/asan_interface.h    |   197 +
 .../include/sanitizer/common_interface_defs.h      |    66 +
 libsanitizer/install-sh                            |   527 +
 libsanitizer/interception/Makefile.am              |    59 +
 libsanitizer/interception/Makefile.in              |   535 +
 libsanitizer/interception/interception.h           |   195 +
 libsanitizer/interception/interception_linux.cc    |    28 +
 libsanitizer/interception/interception_linux.h     |    35 +
 libsanitizer/interception/interception_mac.cc      |    29 +
 libsanitizer/interception/interception_mac.h       |    47 +
 libsanitizer/interception/interception_win.cc      |   149 +
 libsanitizer/interception/interception_win.h       |    43 +
 libsanitizer/libtool-version                       |     6 +
 libsanitizer/ltmain.sh                             |  9661 ++++++++++
 libsanitizer/missing                               |   376 +
 libsanitizer/sanitizer_common/Makefile.am          |    71 +
 libsanitizer/sanitizer_common/Makefile.in          |   564 +
 .../sanitizer_common/sanitizer_allocator.cc        |    83 +
 .../sanitizer_common/sanitizer_allocator64.h       |   573 +
 libsanitizer/sanitizer_common/sanitizer_atomic.h   |    63 +
 .../sanitizer_common/sanitizer_atomic_clang.h      |   120 +
 .../sanitizer_common/sanitizer_atomic_msvc.h       |   134 +
 libsanitizer/sanitizer_common/sanitizer_common.cc  |   151 +
 libsanitizer/sanitizer_common/sanitizer_common.h   |   181 +
 libsanitizer/sanitizer_common/sanitizer_flags.cc   |    95 +
 libsanitizer/sanitizer_common/sanitizer_flags.h    |    25 +
 .../sanitizer_common/sanitizer_internal_defs.h     |   186 +
 libsanitizer/sanitizer_common/sanitizer_libc.cc    |   189 +
 libsanitizer/sanitizer_common/sanitizer_libc.h     |    69 +
 libsanitizer/sanitizer_common/sanitizer_linux.cc   |   296 +
 libsanitizer/sanitizer_common/sanitizer_list.h     |   118 +
 libsanitizer/sanitizer_common/sanitizer_mac.cc     |   249 +
 libsanitizer/sanitizer_common/sanitizer_mutex.h    |   106 +
 .../sanitizer_common/sanitizer_placement_new.h     |    31 +
 libsanitizer/sanitizer_common/sanitizer_posix.cc   |   187 +
 libsanitizer/sanitizer_common/sanitizer_printf.cc  |   196 +
 libsanitizer/sanitizer_common/sanitizer_procmaps.h |    95 +
 .../sanitizer_common/sanitizer_stackdepot.cc       |   194 +
 .../sanitizer_common/sanitizer_stackdepot.h        |    27 +
 .../sanitizer_common/sanitizer_stacktrace.cc       |   245 +
 .../sanitizer_common/sanitizer_stacktrace.h        |    73 +
 .../sanitizer_common/sanitizer_symbolizer.cc       |   311 +
 .../sanitizer_common/sanitizer_symbolizer.h        |    97 +
 .../sanitizer_common/sanitizer_symbolizer_linux.cc |   162 +
 .../sanitizer_common/sanitizer_symbolizer_mac.cc   |    31 +
 .../sanitizer_common/sanitizer_symbolizer_win.cc   |    33 +
 libsanitizer/sanitizer_common/sanitizer_win.cc     |   205 +
 99 files changed, 56908 insertions(+), 1 deletion(-)
 create mode 100644 ChangeLog.asan
 create mode 100644 libsanitizer/ChangeLog.asan
 create mode 100644 libsanitizer/LICENSE.TXT
 create mode 100644 libsanitizer/Makefile.am
 create mode 100644 libsanitizer/Makefile.in
 create mode 100644 libsanitizer/README.gcc
 create mode 100644 libsanitizer/aclocal.m4
 create mode 100644 libsanitizer/asan/Makefile.am
 create mode 100644 libsanitizer/asan/Makefile.in
 create mode 100644 libsanitizer/asan/asan_allocator.cc
 create mode 100644 libsanitizer/asan/asan_allocator.h
 create mode 100644 libsanitizer/asan/asan_flags.h
 create mode 100644 libsanitizer/asan/asan_globals.cc
 create mode 100644 libsanitizer/asan/asan_intercepted_functions.h
 create mode 100644 libsanitizer/asan/asan_interceptors.cc
 create mode 100644 libsanitizer/asan/asan_interceptors.h
 create mode 100644 libsanitizer/asan/asan_internal.h
 create mode 100644 libsanitizer/asan/asan_linux.cc
 create mode 100644 libsanitizer/asan/asan_lock.h
 create mode 100644 libsanitizer/asan/asan_mac.cc
 create mode 100644 libsanitizer/asan/asan_mac.h
 create mode 100644 libsanitizer/asan/asan_malloc_linux.cc
 create mode 100644 libsanitizer/asan/asan_malloc_mac.cc
 create mode 100644 libsanitizer/asan/asan_malloc_win.cc
 create mode 100644 libsanitizer/asan/asan_mapping.h
 create mode 100644 libsanitizer/asan/asan_new_delete.cc
 create mode 100644 libsanitizer/asan/asan_poisoning.cc
 create mode 100644 libsanitizer/asan/asan_posix.cc
 create mode 100644 libsanitizer/asan/asan_report.cc
 create mode 100644 libsanitizer/asan/asan_report.h
 create mode 100644 libsanitizer/asan/asan_rtl.cc
 create mode 100644 libsanitizer/asan/asan_stack.cc
 create mode 100644 libsanitizer/asan/asan_stack.h
 create mode 100644 libsanitizer/asan/asan_stats.cc
 create mode 100644 libsanitizer/asan/asan_stats.h
 create mode 100644 libsanitizer/asan/asan_thread.cc
 create mode 100644 libsanitizer/asan/asan_thread.h
 create mode 100644 libsanitizer/asan/asan_thread_registry.cc
 create mode 100644 libsanitizer/asan/asan_thread_registry.h
 create mode 100644 libsanitizer/asan/asan_win.cc
 create mode 100644 libsanitizer/asan/libtool-version
 create mode 100644 libsanitizer/config.guess
 create mode 100644 libsanitizer/config.sub
 create mode 100755 libsanitizer/configure
 create mode 100644 libsanitizer/configure.ac
 create mode 100644 libsanitizer/depcomp
 create mode 100644 libsanitizer/include/sanitizer/asan_interface.h
 create mode 100644 libsanitizer/include/sanitizer/common_interface_defs.h
 create mode 100644 libsanitizer/install-sh
 create mode 100644 libsanitizer/interception/Makefile.am
 create mode 100644 libsanitizer/interception/Makefile.in
 create mode 100644 libsanitizer/interception/interception.h
 create mode 100644 libsanitizer/interception/interception_linux.cc
 create mode 100644 libsanitizer/interception/interception_linux.h
 create mode 100644 libsanitizer/interception/interception_mac.cc
 create mode 100644 libsanitizer/interception/interception_mac.h
 create mode 100644 libsanitizer/interception/interception_win.cc
 create mode 100644 libsanitizer/interception/interception_win.h
 create mode 100644 libsanitizer/libtool-version
 create mode 100644 libsanitizer/ltmain.sh
 create mode 100644 libsanitizer/missing
 create mode 100644 libsanitizer/sanitizer_common/Makefile.am
 create mode 100644 libsanitizer/sanitizer_common/Makefile.in
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_allocator.cc
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_allocator64.h
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_atomic.h
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_atomic_clang.h
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_atomic_msvc.h
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_common.cc
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_common.h
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_flags.cc
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_flags.h
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_internal_defs.h
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_libc.cc
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_libc.h
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_linux.cc
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_list.h
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_mac.cc
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_mutex.h
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_placement_new.h
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_posix.cc
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_printf.cc
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_procmaps.h
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_stackdepot.cc
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_stackdepot.h
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_stacktrace.cc
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_stacktrace.h
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_symbolizer.cc
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_symbolizer.h
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_symbolizer_linux.cc
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_symbolizer_mac.cc
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_symbolizer_win.cc
 create mode 100644 libsanitizer/sanitizer_common/sanitizer_win.cc

http://people.redhat.com/~dseketel/gcc/patches/0010-Import-the-asan-runtime-library-into-GCC-tree.patch

-- 
		Dodji

* Re: [PATCH 08/10] Factorize condition insertion code out of build_check_stmt
  2012-11-02 23:03   ` [PATCH 08/10] Factorize condition insertion code out of build_check_stmt Dodji Seketeli
@ 2012-11-05 15:50     ` Jakub Jelinek
  2012-11-05 20:25       ` Dodji Seketeli
  2012-11-06 17:30     ` Diego Novillo
  1 sibling, 1 reply; 80+ messages in thread
From: Jakub Jelinek @ 2012-11-05 15:50 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: gcc-patches, dnovillo, wmi, davidxl, konstantin.s.serebryany

On Sat, Nov 03, 2012 at 12:03:45AM +0100, Dodji Seketeli wrote:
> +  int fallthrough_probability =
> +    then_more_likely_p
> +    ? PROB_VERY_UNLIKELY
> +    : PROB_ALWAYS - PROB_VERY_UNLIKELY;

Just a formatting nit, I think = needs to go on the next line, so

  int fallthrough_probability
    = then_more_likely_p
      ? PROB_VERY_UNLIKELY
      : PROB_ALWAYS - PROB_VERY_UNLIKELY;

No need to repost for that.

	Jakub

* Re: [PATCH 08/10] Factorize condition insertion code out of build_check_stmt
  2012-11-05 15:50     ` Jakub Jelinek
@ 2012-11-05 20:25       ` Dodji Seketeli
  0 siblings, 0 replies; 80+ messages in thread
From: Dodji Seketeli @ 2012-11-05 20:25 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: gcc-patches, dnovillo, wmi, davidxl, konstantin.s.serebryany

Jakub Jelinek <jakub@redhat.com> writes:

> On Sat, Nov 03, 2012 at 12:03:45AM +0100, Dodji Seketeli wrote:
>> +  int fallthrough_probability =
>> +    then_more_likely_p
>> +    ? PROB_VERY_UNLIKELY
>> +    : PROB_ALWAYS - PROB_VERY_UNLIKELY;
>
> Just a formatting nit, I think = needs to go on the next line, so
>
>   int fallthrough_probability
>     = then_more_likely_p
>       ? PROB_VERY_UNLIKELY
>       : PROB_ALWAYS - PROB_VERY_UNLIKELY;

OK.  Patch updated in my local tree.

> No need to repost for that.

OK, thanks.

-- 
		Dodji

* Re: [PATCH 01/10] Initial import of asan from the Google branch into trunk
  2012-11-02 22:56   ` [PATCH 01/10] Initial import of asan from the Google branch into trunk Dodji Seketeli
@ 2012-11-06 17:04     ` Diego Novillo
  2012-11-09 13:14     ` Tobias Burnus
  1 sibling, 0 replies; 80+ messages in thread
From: Diego Novillo @ 2012-11-06 17:04 UTC (permalink / raw)
  To: Dodji Seketeli; +Cc: gcc-patches, jakub, wmi, davidxl, konstantin.s.serebryany

On 2012-11-02 15:56 , Dodji Seketeli wrote:
> This patch imports the initial state of asan as it was in the
> Google branch.
>
> It provides basic infrastructure for asan to instrument memory
> accesses on the heap, at -O3.  Note that it supports neither stack nor
> global variable protection.
>
> The rest of the patches of the set is intended to further improve this
> base.
>
> 	* Makefile.in: Add asan.c and its dependencies.
> 	* common.opt: Add -fasan option.
> 	* invoke.texi: Document the new flag.
> 	* passes.c: Add the asan pass.
> 	* toplev.c (compile_file): Call asan_finish_file.
> 	* asan.c: New file.
> 	* asan.h: New file.
> 	* tree-pass.h: Declare pass_asan.

OK.


Diego.

* Re: [PATCH 02/10] Initial asan cleanups
  2012-11-02 22:57   ` [PATCH 02/10] Initial asan cleanups Dodji Seketeli
@ 2012-11-06 17:04     ` Diego Novillo
  2012-11-12 11:12       ` Dodji Seketeli
  0 siblings, 1 reply; 80+ messages in thread
From: Diego Novillo @ 2012-11-06 17:04 UTC (permalink / raw)
  To: Dodji Seketeli; +Cc: gcc-patches, jakub, wmi, davidxl, konstantin.s.serebryany

On 2012-11-02 15:57 , Dodji Seketeli wrote:

>   /* AddressSanitizer, a fast memory error detector.
> -   Copyright (C) 2011 Free Software Foundation, Inc.
> +   Copyright (C) 2011, 2012 Free Software Foundation, Inc.

I *think* we should only mention 2012, but I don't know if code in 
branches counts for the copyright years.


> +  /* Address Sanitizer needs porting to each target architecture.  */
> +  if (flag_asan && targetm.asan_shadow_offset == NULL)
> +    {
> +      warning (0, "-fasan not supported for this target");

Hm, ASAN's flag is now -fsanitizer=[asan,tsan,memory] or some such.  We 
will need to make that change.  But it can wait until after the initial 
port is in trunk.

This patch is OK.


Diego.

* Re: [PATCH 03/10] Emit GIMPLE directly instead of gimplifying GENERIC.
  2012-11-02 22:58   ` [PATCH 03/10] Emit GIMPLE directly instead of gimplifying GENERIC Dodji Seketeli
@ 2012-11-06 17:08     ` Diego Novillo
  0 siblings, 0 replies; 80+ messages in thread
From: Diego Novillo @ 2012-11-06 17:08 UTC (permalink / raw)
  To: Dodji Seketeli; +Cc: gcc-patches, jakub, wmi, davidxl, konstantin.s.serebryany

On 2012-11-02 15:57 , Dodji Seketeli wrote:

> 	* Makefile.in (GTFILES): Add $(srcdir)/asan.c.
> 	(asan.o): Update the dependencies of asan.o.
> 	* asan.c (tm.h, tree.h, tm_p.h, basic-block.h, flags.h
> 	function.h, tree-inline.h, tree-dump.h, diagnostic.h, demangle.h,
> 	langhooks.h, ggc.h, cgraph.h, gimple.h): Remove these unused but
> 	included headers.
> 	(shadow_ptr_types): New variable.
> 	(report_error_func): Change is_store argument to bool, don't append
> 	newline to function name.
> 	(PROB_VERY_UNLIKELY, PROB_ALWAYS): Define.
> 	(build_check_stmt): Change is_store argument to bool.  Emit GIMPLE
> 	directly instead of creating trees and gimplifying them.  Mark
> 	the error reporting function as very unlikely.
> 	(instrument_derefs): Change is_store argument to bool.  Use
> 	int_size_in_bytes to compute size_in_bytes, simplify size check.
> 	Use build_fold_addr_expr instead of build_addr.
> 	(transform_statements): Adjust instrument_derefs caller.
> 	Use gimple_assign_single_p as stmt test.  Don't look at MEM refs
> 	in rhs2.
> 	(asan_init_shadow_ptr_types): New function.
> 	(asan_instrument): Don't push/pop gimplify context.
> 	Call asan_init_shadow_ptr_types if not yet initialized.
> 	* asan.h (ASAN_SHADOW_SHIFT): Adjust comment.

OK.


Diego.

* Re: [PATCH 04/10] Allow asan at -O0
  2012-11-02 22:59   ` [PATCH 04/10] Allow asan at -O0 Dodji Seketeli
@ 2012-11-06 17:12     ` Diego Novillo
  0 siblings, 0 replies; 80+ messages in thread
From: Diego Novillo @ 2012-11-06 17:12 UTC (permalink / raw)
  To: Dodji Seketeli; +Cc: gcc-patches, jakub, wmi, davidxl, konstantin.s.serebryany

On 2012-11-02 15:58 , Dodji Seketeli wrote:
> This patch defines a new asan pass gate that is activated at -O0, in
> addition to the pass that was initially activated at -O3 level.  The
> patch also does some comment cleanups here and there.
>
> 	* asan.c (build_check_stmt): Rename join_bb variable to else_bb.
> 	(gate_asan_O0): New function.
> 	(pass_asan_O0): New variable.
> 	* passes.c (init_optimization_passes): Add pass_asan_O0.
> 	* tree-pass.h (pass_asan_O0): New declaration.

OK.


Diego.

* Re: [PATCH 05/10] Implement protection of stack variables
  2012-11-02 23:00   ` [PATCH 05/10] Implement protection of stack variables Dodji Seketeli
@ 2012-11-06 17:22     ` Diego Novillo
  2012-11-12 11:31       ` Dodji Seketeli
  0 siblings, 1 reply; 80+ messages in thread
From: Diego Novillo @ 2012-11-06 17:22 UTC (permalink / raw)
  To: Dodji Seketeli; +Cc: gcc-patches, jakub, wmi, davidxl, konstantin.s.serebryany

On 2012-11-02 16:00 , Dodji Seketeli wrote:
> This patch implements the protection of stack variables.
>
> To understand how this works, let's look at this example on x86_64
> where the stack grows downward:
>
>   int
>   foo ()
>   {
>     char a[23] = {0};
>     int b[2] = {0};
>
>     a[5] = 1;
>     b[1] = 2;
>
>     return a[5] + b[1];
>   }
>
> For this function, the stack protected by asan will be organized as
> follows, from the top of the stack to the bottom:
>
> Slot 1/ [red zone of 32 bytes called 'RIGHT RedZone']
>
> Slot 2/ [24 bytes for variable 'a']
>
> Slot 3/ [8 bytes of red zone, that adds up to the space of 'a' to make
>           the next slot be 32 bytes aligned; this one is called Partial
>           Redzone; this 32 bytes alignment is an asan constraint]
>
> Slot 4/ [red zone of 32 bytes called 'Middle RedZone']
>
> Slot 5/ [8 bytes for variable 'b']
>
> Slot 6/ [24 bytes of Partial Red Zone (similar to slot 3)]
>
> Slot 7/ [32 bytes of Red Zone at the bottom of the stack, called 'LEFT
>           RedZone']
>
> [A cultural question I've kept asking myself is why the address
>   sanitizer authors have called these red zones (LEFT, MIDDLE, RIGHT)
>   instead of e.g. (BOTTOM, MIDDLE, TOP).  Maybe they can step up and
>   educate me so that I get less confused in the future.  :-)]

I believe they lay out the stack from right to left (top is to the
right).  Feels like reading a middle earth map.  Kostya, is my
recollection correct?
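
The slot arithmetic quoted above can be sketched in a few lines of C.
This is only an illustration, not code from the patch: round_up,
layout_frame, struct slot and the two constants are made-up names, and
rounding each variable up to 8 bytes before padding it to 32 is an
assumption that happens to reproduce the 24 bytes quoted for the
23-byte 'a'.

```c
#include <assert.h>

#define RED_ZONE_SIZE 32u	/* LEFT, MIDDLE and RIGHT red zones.  */
#define SHADOW_GRANULE 8u	/* Assumed bytes covered by one shadow byte.  */

/* Hypothetical record describing where one variable ends up.  */
struct slot
{
  unsigned offset;		/* Offset from the bottom of the frame.  */
  unsigned var_bytes;		/* Bytes reserved for the variable.  */
  unsigned partial_bytes;	/* Partial red zone that follows it.  */
};

/* Round N up to a multiple of ALIGN, which must be a power of two.  */
unsigned
round_up (unsigned n, unsigned align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (n + align - 1) & ~(align - 1);
}

/* Fill OUT for NVARS variables of the given SIZES, listed from the
   bottom of the frame upwards, and return the total frame size.  */
unsigned
layout_frame (const unsigned *sizes, unsigned nvars, struct slot *out)
{
  unsigned offset = RED_ZONE_SIZE;	/* Skip the LEFT red zone.  */

  for (unsigned i = 0; i < nvars; i++)
    {
      unsigned var = round_up (sizes[i], SHADOW_GRANULE);
      unsigned padded = round_up (var, RED_ZONE_SIZE);

      out[i].offset = offset;
      out[i].var_bytes = var;
      out[i].partial_bytes = padded - var;
      /* The 32 bytes after each padded variable are a MIDDLE red zone,
	 or the RIGHT red zone after the last one.  */
      offset += padded + RED_ZONE_SIZE;
    }
  return offset;
}
```

Called with the sizes of 'b' (8) and 'a' (23), in that order, this
gives offsets 32 and 96, partial red zones of 24 and 8 bytes, and a
160-byte frame, which is the sum of slots 1 to 7 as quoted.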

> The 32 bytes of LEFT red zone at the bottom of the stack can be
> decomposed as such:
>
>      1/ The first 8 bytes contain a magical asan number that is always
>      0x41B58AB3.
>
>      2/ The following 8 bytes contain a pointer to a string (to be
>      parsed at runtime by the runtime asan library), whose format is
>      the following:
>
>       "<function-name> <space> <num-of-variables-on-the-stack>
>       (<32-bytes-aligned-offset-in-bytes-of-variable> <space>
>       <length-of-var-in-bytes> ){n} "
>
> 	where '(...){n}' means the content inside the parenthesis occurs 'n'
> 	times, with 'n' being the number of variables on the stack.
>
>       3/ The following 16 bytes of the red zone have no particular
>       format.
>
> The shadow memory for that stack layout is going to look like this:
>
>      - content of shadow memory 8 bytes for slot 7: 0xFFFFFFFFF1F1F1F1.
>        The F1 byte pattern is a magic number called
>        ASAN_STACK_MAGIC_LEFT and is a way for the runtime to know that
>        the memory for that shadow byte is part of the LEFT red zone
>        intended to sit at the bottom of the variables on the stack.
>
>      - content of shadow memory 8 bytes for slots 6 and 5:
>        0xFFFFFFFFF4F4F400.  The F4 byte pattern is a magic number
>        called ASAN_STACK_MAGIC_PARTIAL.  It flags the fact that the
>        memory region for this shadow byte is a PARTIAL red zone
>        intended to pad a variable A, so that the slot following
>        {A,padding} is 32 bytes aligned.
>
>        Note that the fact that the least significant byte of this
>        shadow memory content is 00 means that 8 bytes of its
>        corresponding memory (which corresponds to the memory of
>        variable 'b') is addressable.
>
>      - content of shadow memory 8 bytes for slot 4: 0xFFFFFFFFF2F2F2F2.
>        The F2 byte pattern is a magic number called
>        ASAN_STACK_MAGIC_MIDDLE.  It flags the fact that the memory
>        region for this shadow byte is a MIDDLE red zone intended to
>        sit between two 32-byte-aligned slots of {variable,padding}.
>
>      - content of shadow memory 8 bytes for slot 3 and 2:
>        0xFFFFFFFFF4000000.  This represents the concatenation of
>        variable 'a' and the partial red zone following it, like what we
>        had for variable 'b'.  The least significant 3 bytes being 00
>        means that the 24 bytes of the slot for variable 'a' (3 shadow
>        bytes, each covering 8 bytes) are addressable.
>
>      - content of shadow memory 8 bytes for slot 1: 0xFFFFFFFFF3F3F3F3.
>        The F3 byte pattern is a magic number called
>        ASAN_STACK_MAGIC_RIGHT.  It flags the fact that the memory
>        region for this shadow byte is a RIGHT red zone intended to sit
>        at the top of the variables of the stack.
>

This is a great summary.  Please put it at the top of asan.c or in some 
other prominent place.
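
The shadow contents listed above can be reproduced with a small
standalone sketch.  It is only an illustration, not code from the
patch: the helper names are hypothetical, the magic values are the ones
quoted in the description, and it assumes each variable slot size is a
multiple of 8 bytes (so 'a' is treated as the 24-byte slot from the
description).  It builds the shadow bytes from the lowest address
(slot 7) to the highest (slot 1) and reads them back as little-endian
32-bit words, i.e. the low 32 bits of the values quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum
{
  SHADOW_GRANULE = 8,	/* Application bytes covered by one shadow byte.  */
  RED_ZONE_SIZE = 32,	/* Slot alignment and size of a full red zone.  */
  MAGIC_LEFT = 0xF1,
  MAGIC_MIDDLE = 0xF2,
  MAGIC_RIGHT = 0xF3,
  MAGIC_PARTIAL = 0xF4
};

/* Write the 4 shadow bytes of a full 32-byte red zone.  */
static size_t
shadow_red_zone (uint8_t *out, uint8_t magic)
{
  size_t n = RED_ZONE_SIZE / SHADOW_GRANULE;
  for (size_t i = 0; i < n; i++)
    out[i] = magic;
  return n;
}

/* Write the 4 shadow bytes of a variable padded to 32 bytes: 00 for
   each addressable 8-byte granule, F4 for the partial red zone.
   Assumption of this sketch: SLOT_SIZE is a multiple of 8.  */
static size_t
shadow_var_slot (uint8_t *out, size_t slot_size)
{
  size_t n = RED_ZONE_SIZE / SHADOW_GRANULE;
  assert (slot_size % SHADOW_GRANULE == 0 && slot_size <= RED_ZONE_SIZE);
  for (size_t i = 0; i < n; i++)
    out[i] = i < slot_size / SHADOW_GRANULE ? 0x00 : MAGIC_PARTIAL;
  return n;
}

/* Shadow bytes for the frame of foo (), lowest address first.
   Returns the number of shadow bytes written (20).  */
static size_t
example_frame_shadow (uint8_t out[20])
{
  size_t n = 0;
  n += shadow_red_zone (out + n, MAGIC_LEFT);	/* Slot 7.  */
  n += shadow_var_slot (out + n, 8);		/* Slots 6 and 5: 'b'.  */
  n += shadow_red_zone (out + n, MAGIC_MIDDLE);	/* Slot 4.  */
  n += shadow_var_slot (out + n, 24);		/* Slots 3 and 2: 'a'.  */
  n += shadow_red_zone (out + n, MAGIC_RIGHT);	/* Slot 1.  */
  return n;
}

/* Read 4 shadow bytes as a little-endian 32-bit word.  */
static uint32_t
shadow_word (const uint8_t *p)
{
  return (uint32_t) p[0] | (uint32_t) p[1] << 8
	 | (uint32_t) p[2] << 16 | (uint32_t) p[3] << 24;
}
```

Calling example_frame_shadow on a 20-byte buffer and passing each group
of 4 bytes to shadow_word gives the five words in the order slot 7,
slots 6/5, slot 4, slots 3/2, slot 1.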


> -	  offset = alloc_stack_frame_space (stack_vars[i].size, alignb);
> +	  if (flag_asan && pred)
> +	    {
> +	      HOST_WIDE_INT prev_offset = frame_offset;
> +	      tree repr_decl = NULL_TREE;
> +
> +	      offset
> +		= alloc_stack_frame_space (stack_vars[i].size
> +					   + ASAN_RED_ZONE_SIZE,
> +					   MAX (alignb, ASAN_RED_ZONE_SIZE));
> +	      VEC_safe_push (HOST_WIDE_INT, heap, data->asan_vec,
> +			     prev_offset);
> +	      VEC_safe_push (HOST_WIDE_INT, heap, data->asan_vec,
> +			     offset + stack_vars[i].size);

Oh, gee, thanks.  More VEC() code for me to convert ;)


The patch is OK.


Diego.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 06/10] Implement protection of global variables
  2012-11-02 23:01   ` [PATCH 06/10] Implement protection of global variables Dodji Seketeli
@ 2012-11-06 17:27     ` Diego Novillo
  2012-11-12 11:32       ` Dodji Seketeli
  0 siblings, 1 reply; 80+ messages in thread
From: Diego Novillo @ 2012-11-06 17:27 UTC (permalink / raw)
  To: Dodji Seketeli; +Cc: gcc-patches, jakub, wmi, davidxl, konstantin.s.serebryany

On 2012-11-02 16:01 , Dodji Seketeli wrote:

> 	* varasm.c: Include asan.h.
> 	(assemble_noswitch_variable): Grow size by asan_red_zone_size
> 	if decl is asan protected.
> 	(place_block_symbol): Likewise.
> 	(assemble_variable): If decl is asan protected, increase
> 	DECL_ALIGN if needed, and for decls emitted using
> 	assemble_variable_contents append padding zeros after it.
> 	* Makefile.in (varasm.o): Depend on asan.h.
> 	* asan.c: Include output.h.
> 	(asan_pp, asan_pp_initialized, asan_ctor_statements): New variables.
> 	(asan_pp_initialize, asan_pp_string): New functions.
> 	(asan_emit_stack_protection): Use asan_pp{,_initialized}
> 	instead of local pp{,_initialized} vars, use asan_pp_initialize
> 	and asan_pp_string helpers.
> 	(asan_needs_local_alias, asan_protect_global,
> 	asan_global_struct, asan_add_global): New functions.
> 	(asan_finish_file): Protect global vars that can be protected. Use
> 	asan_ctor_statements instead of ctor_statements
> 	* asan.h (asan_protect_global): New prototype.
> 	(asan_red_zone_size): New inline function.

OK.

Please, also put the high-level description in asan.c's documentation.


Diego.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 07/10] Make build_check_stmt accept an SSA_NAME for its base
  2012-11-02 23:02   ` [PATCH 07/10] Make build_check_stmt accept an SSA_NAME for its base Dodji Seketeli
@ 2012-11-06 17:28     ` Diego Novillo
  0 siblings, 0 replies; 80+ messages in thread
From: Diego Novillo @ 2012-11-06 17:28 UTC (permalink / raw)
  To: Dodji Seketeli; +Cc: gcc-patches, jakub, wmi, davidxl, konstantin.s.serebryany

On 2012-11-02 16:02 , Dodji Seketeli wrote:

> 	* asan.c (build_check_stmt): Accept the memory access to be
> 	represented by an SSA_NAME.

OK.


Diego.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 08/10] Factorize condition insertion code out of build_check_stmt
  2012-11-02 23:03   ` [PATCH 08/10] Factorize condition insertion code out of build_check_stmt Dodji Seketeli
  2012-11-05 15:50     ` Jakub Jelinek
@ 2012-11-06 17:30     ` Diego Novillo
  1 sibling, 0 replies; 80+ messages in thread
From: Diego Novillo @ 2012-11-06 17:30 UTC (permalink / raw)
  To: Dodji Seketeli; +Cc: gcc-patches, jakub, wmi, davidxl, konstantin.s.serebryany

On 2012-11-02 16:03 , Dodji Seketeli wrote:

> 	* asan.c (create_cond_insert_point_before_iter): Factorize out of ...
> 	(build_check_stmt): ... here.

OK.


Diego.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 09/10] Instrument built-in memory access function calls
  2012-11-02 23:05   ` [PATCH 09/10] Instrument built-in memory access function calls Dodji Seketeli
@ 2012-11-06 17:37     ` Diego Novillo
  2012-11-12 11:40       ` Dodji Seketeli
  0 siblings, 1 reply; 80+ messages in thread
From: Diego Novillo @ 2012-11-06 17:37 UTC (permalink / raw)
  To: Dodji Seketeli; +Cc: gcc-patches, jakub, wmi, davidxl, konstantin.s.serebryany

On 2012-11-02 16:05 , Dodji Seketeli wrote:

> +static bool
> +maybe_instrument_builtin_call (gimple_stmt_iterator *iter)
> +{
> +  gimple call = gsi_stmt (*iter);
> +  location_t loc = gimple_location (call);
> +
> +  if (!is_gimple_call (call))
> +    return false;

Nit.  Why not factor this out and change the caller to:

if (is_builtin_call (stmt))
    instrument_builtin_call (stmt);

I don't much like functions that do many combined things.


OK, otherwise.


Diego.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 10/10] Import the asan runtime library into GCC tree
       [not found]   ` <87fw4r7g8w.fsf_-_@redhat.com>
@ 2012-11-06 17:41     ` Diego Novillo
  2012-11-12 11:47       ` Dodji Seketeli
  0 siblings, 1 reply; 80+ messages in thread
From: Diego Novillo @ 2012-11-06 17:41 UTC (permalink / raw)
  To: Dodji Seketeli; +Cc: gcc-patches, jakub, wmi, davidxl, konstantin.s.serebryany

On 2012-11-02 16:10 , Dodji Seketeli wrote:

>          * configure.ac: Add libsanitizer to target_libraries.
> 	* Makefile.def: Ditto.
> 	* configure: Regenerate.
> 	* Makefile.in: Regenerate.
> 	* libsanitizer: New directory for asan runtime.  Contains an empty
> 	tsan directory.
>
> gcc:
> 	* gcc.c (LINK_COMMAND_SPEC): Add -lasan to link command
> 	if -faddress-sanitizer is on.

OK with Jakub's comments addressed.

References to -fasan in diagnostics should be replaced.  But there's 
been another flag name change upstream, so let's do it together with the 
new flag names.


Diego.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 01/10] Initial import of asan from the Google branch into trunk
  2012-11-02 22:56   ` [PATCH 01/10] Initial import of asan from the Google branch into trunk Dodji Seketeli
  2012-11-06 17:04     ` Diego Novillo
@ 2012-11-09 13:14     ` Tobias Burnus
  2012-11-09 13:58       ` Jakub Jelinek
                         ` (3 more replies)
  1 sibling, 4 replies; 80+ messages in thread
From: Tobias Burnus @ 2012-11-09 13:14 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: gcc-patches, dnovillo, jakub, wmi, davidxl, konstantin.s.serebryany

[-- Attachment #1: Type: text/plain, Size: 1417 bytes --]

Dodji Seketeli wrote:
> This patch imports the initial state of asan as it was in the
> Google branch.
>
> It provides basic infrastructure for asan to instrument memory
> accesses on the heap, at -O3.  Note that it supports neither stack nor
> global variable protection.

I tried the 01/10 to 10/10 patch series but it doesn't trigger for the 
following test case:

#include <stdlib.h>
#include <stdio.h>

int
main() {
   int *i;
   i = malloc(10*sizeof(*i));
   free(i);  /* <<< Free memory. */
   i[10] = 5;  /* <<< out of boundary even if not freed. */
   printf("%d\n", i[11]);  /* <<< out of boundary even if not freed. */
   return 0;
}

(All of them are reported by Clang.) If I look at the dump (or 
assembler), I see the call to __asan_init, __asan_report_store4 and 
__asan_report_load4. However, when running the program ltrace only shows 
the calls to: __libc_start_main, __asan_init, malloc, free and printf. I 
haven't debugged why the condition is false [see attachment for the dump].


Other issues:

* libasan does not seem to be a multilib, at least I only find the 64bit 
version on x86-64-gnu-linux such that "-m32" compilation fails.

* -fno-address-sanitizer doesn't work (it does in Clang); it is 
explicitly disabled via RejectNegative in gcc/common.opt

* Probably fixed on the branch: gcc/gcc.c still has "fasan" instead of 
"faddress-sanitizer" for the spec:
+    %{fasan:-lasan}

Tobias

[-- Attachment #2: hjf.c --]
[-- Type: text/x-csrc, Size: 271 bytes --]

#include <stdlib.h>
#include <stdio.h>

int
main() {
  int *i;
  i = malloc(10*sizeof(*i));
  free(i);  /* <<< Free memory. */
  i[10] = 5;  /* <<< out of boundary even if not freed. */
  printf("%d\n", i[11]);  /* <<< out of boundary even if not freed. */
  return 0;
}

[-- Attachment #3: hjf.c.156t.asan0 --]
[-- Type: text/plain, Size: 1649 bytes --]


;; Function main (main, funcdef_no=2, decl_uid=2680, cgraph_uid=2)

main ()
{
  int * i;
  int D.2687;
  int D.2686;
  int * D.2685;
  int * D.2684;
  int * _2;
  int * _3;
  int _4;
  int _5;
  unsigned long _6;
  unsigned long _7;
  unsigned long _8;
  unsigned char * _9;
  unsigned char _10;
  _Bool _11;
  unsigned long _12;
  unsigned char _13;
  unsigned char _14;
  _Bool _15;
  _Bool _16;
  unsigned long _17;
  unsigned long _18;
  unsigned long _19;
  unsigned char * _20;
  unsigned char _21;
  _Bool _22;
  unsigned long _23;
  unsigned char _24;
  unsigned char _25;
  _Bool _26;
  _Bool _27;

  <bb 2>:
  i_1 = malloc (40);
  free (i_1);
  _2 = i_1 + 40;
  _6 = (unsigned long) _2;
  _7 = _6 >> 3;
  _8 = _7 + 17592186044416;
  _9 = (unsigned char *) _8;
  _10 = *_9;
  _11 = _10 != 0;
  _12 = _6 & 7;
  _13 = (unsigned char) _12;
  _14 = _13 + 3;
  _15 = _14 >= _10;
  _16 = _11 & _15;
  if (_16 != 0)
    goto <bb 5>;
  else
    goto <bb 4>;

  <bb 5>:
  __asan_report_store4 (_6);

  <bb 4>:
  *_2 = 5;
  _3 = i_1 + 44;
  _17 = (unsigned long) _3;
  _18 = _17 >> 3;
  _19 = _18 + 17592186044416;
  _20 = (unsigned char *) _19;
  _21 = *_20;
  _22 = _21 != 0;
  _23 = _17 & 7;
  _24 = (unsigned char) _23;
  _25 = _24 + 3;
  _26 = _25 >= _21;
  _27 = _22 & _26;
  if (_27 != 0)
    goto <bb 7>;
  else
    goto <bb 6>;

  <bb 7>:
  __asan_report_load4 (_17);

  <bb 6>:
  _4 = *_3;
  printf ("%d\n", _4);
  _5 = 0;

<L0>:
  return _5;

}



;; Function _GLOBAL__sub_I_00099_0_main (_GLOBAL__sub_I_00099_0_main, funcdef_no=3, decl_uid=2700, cgraph_uid=0)

_GLOBAL__sub_I_00099_0_main ()
{
  <bb 2>:
  __asan_init ();
  return;

}



^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 01/10] Initial import of asan from the Google branch into trunk
  2012-11-09 13:14     ` Tobias Burnus
@ 2012-11-09 13:58       ` Jakub Jelinek
  2012-11-09 16:53         ` Xinliang David Li
  2012-11-09 17:13         ` Tobias Burnus
  2012-11-09 17:18       ` Wei Mi
                         ` (2 subsequent siblings)
  3 siblings, 2 replies; 80+ messages in thread
From: Jakub Jelinek @ 2012-11-09 13:58 UTC (permalink / raw)
  To: Tobias Burnus
  Cc: Dodji Seketeli, gcc-patches, dnovillo, wmi, davidxl,
	konstantin.s.serebryany

On Fri, Nov 09, 2012 at 02:14:04PM +0100, Tobias Burnus wrote:
> Dodji Seketeli wrote:
> >This patch imports the initial state of asan as it was in the
> >Google branch.
> >
> >It provides basic infrastructure for asan to instrument memory
> >accesses on the heap, at -O3.  Note that it supports neither stack nor
> >global variable protection.
> 
> I tried the 01/10 to 10/10 patch series but it doesn't trigger for
> the following test case:
> 
> #include <stdlib.h>
> #include <stdio.h>
> 
> int
> main() {
>   int *i;
>   i = malloc(10*sizeof(*i));
>   free(i);  /* <<< Free memory. */
>   i[10] = 5;  /* <<< out of boundary even if not freed. */
>   printf("%d\n", i[11]);  /* <<< out of boundary even if not freed. */
>   return 0;
> }
> 
> (All of them are reported by Clang.) If I look at the dump (or
> assembler), I see the call to __asan_init, __asan_report_store4 and
> __asan_report_load4. However, when running the program ltrace only
> shows the calls to: __libc_start_main, __asan_init, malloc, free and
> printf. I haven't debugged why the condition is false [see
> attachment for the dump].

Can't reproduce that (admittedly with asan SVN branch rather than the
patchset):

./xgcc -B ./ -O2 -fasan -o a a.c -Wl,-rpath,/usr/src/gcc-asan/obj/x86_64-unknown-linux-gnu/libsanitizer/asan/.libs/ \
				 -L /usr/src/gcc-asan/obj/x86_64-unknown-linux-gnu/libsanitizer/asan/.libs/
./a
=================================================================
==20614== ERROR: AddressSanitizer heap-use-after-free on address
0x7f7d8245afec at pc 0x4006f8 bp 0x7fff9beda4c0 sp 0x7fff9beda4b8
READ of size 4 at 0x7f7d8245afec thread T0
    #0 0x4006f7 (/usr/src/gcc-asan/obj/gcc/a+0x4006f7)
0x7f7d8245afec is located 4 bytes to the right of 40-byte region
[0x7f7d8245afc0,0x7f7d8245afe8)
freed by thread T0 here:
    #0 0x7f7d82796585
    #(/usr/src/gcc-asan/obj/x86_64-unknown-linux-gnu/libsanitizer/asan/.libs/libasan.so.0.0.0+0xf585)
    #1 0x4006b5 (/usr/src/gcc-asan/obj/gcc/a+0x4006b5)
previously allocated by thread T0 here:
    #0 0x7f7d82796645
    #(/usr/src/gcc-asan/obj/x86_64-unknown-linux-gnu/libsanitizer/asan/.libs/libasan.so.0.0.0+0xf645)
    #1 0x4006aa (/usr/src/gcc-asan/obj/gcc/a+0x4006aa)
Shadow byte and word:
  0x1fefb048b5fd: fd
  0x1fefb048b5f8: fd fd fd fd fd fd fd fd
More shadow bytes:
  0x1fefb048b5d8: fa fa fa fa fa fa fa fa
  0x1fefb048b5e0: fa fa fa fa fa fa fa fa
  0x1fefb048b5e8: fa fa fa fa fa fa fa fa
  0x1fefb048b5f0: fa fa fa fa fa fa fa fa
=>0x1fefb048b5f8: fd fd fd fd fd fd fd fd
  0x1fefb048b600: fa fa fa fa fa fa fa fa
  0x1fefb048b608: fa fa fa fa fa fa fa fa
  0x1fefb048b610: fa fa fa fa fa fa fa fa
  0x1fefb048b618: fa fa fa fa fa fa fa fa
Stats: 0M malloced (0M for red zones) by 1 calls
Stats: 0M realloced by 0 calls
Stats: 0M freed by 1 calls
Stats: 0M really freed by 0 calls
Stats: 0M (128 full pages) mmaped in 1 calls
  mmaps   by size class: 7:4095; 
  mallocs by size class: 7:1; 
  frees   by size class: 7:1; 
  rfrees  by size class: 
Stats: malloc large: 0 small slow: 1
==20614== ABORTING
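
For reference, the shadow address printed in this report follows from
the faulting address using the same shift and offset that appear in the
asan0 dump earlier in the thread (>> 3, then + 17592186044416).  A
minimal sketch, assuming those two x86_64 constants; the function and
macro names are hypothetical and not taken from asan.c:

```c
#include <assert.h>
#include <stdint.h>

/* Values seen in the dump: "_7 = _6 >> 3" and
   "_8 = _7 + 17592186044416".  */
#define SHADOW_SHIFT 3
#define SHADOW_OFFSET_X86_64 (UINT64_C (1) << 44)

/* Map an application address to the address of its shadow byte.  */
static uint64_t
shadow_address (uint64_t addr)
{
  return (addr >> SHADOW_SHIFT) + SHADOW_OFFSET_X86_64;
}

/* Check the mapping against the numbers in the report.  */
static void
self_check (void)
{
  uint64_t shadow = shadow_address (UINT64_C (0x7f7d8245afec));

  /* The decimal constant in the dump is 2^44.  */
  assert (SHADOW_OFFSET_X86_64 == UINT64_C (17592186044416));
  /* "Shadow byte and word: 0x1fefb048b5fd: fd".  */
  assert (shadow == UINT64_C (0x1fefb048b5fd));
  /* The row marked "=>" starts at the enclosing 8-byte boundary.  */
  assert ((shadow & ~UINT64_C (7)) == UINT64_C (0x1fefb048b5f8));
}
```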

	Jakub

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 01/10] Initial import of asan from the Google branch into trunk
  2012-11-09 13:58       ` Jakub Jelinek
@ 2012-11-09 16:53         ` Xinliang David Li
  2012-11-09 17:13         ` Tobias Burnus
  1 sibling, 0 replies; 80+ messages in thread
From: Xinliang David Li @ 2012-11-09 16:53 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Tobias Burnus, Dodji Seketeli, GCC Patches, Diego Novillo,
	Wei Mi, Konstantin Serebryany

It seems that my one line fix in asan branch (r192605) is not included
in Dodji's patch set.

David


On Fri, Nov 9, 2012 at 5:58 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Fri, Nov 09, 2012 at 02:14:04PM +0100, Tobias Burnus wrote:
>> Dodji Seketeli wrote:
>> >This patch imports the initial state of asan as it was in the
>> >Google branch.
>> >
>> >It provides basic infrastructure for asan to instrument memory
>> >accesses on the heap, at -O3.  Note that it supports neither stack nor
>> >global variable protection.
>>
>> I tried the 01/10 to 10/10 patch series but it doesn't trigger for
>> the following test case:
>>
>> #include <stdlib.h>
>> #include <stdio.h>
>>
>> int
>> main() {
>>   int *i;
>>   i = malloc(10*sizeof(*i));
>>   free(i);  /* <<< Free memory. */
>>   i[10] = 5;  /* <<< out of boundary even if not freed. */
>>   printf("%d\n", i[11]);  /* <<< out of boundary even if not freed. */
>>   return 0;
>> }
>>
>> (All of them are reported by Clang.) If I look at the dump (or
>> assembler), I see the call to __asan_init, __asan_report_store4 and
>> __asan_report_load4. However, when running the program ltrace only
>> shows the calls to: __libc_start_main, __asan_init, malloc, free and
>> printf. I haven't debugged why the condition is false [see
>> attachment for the dump].
>
> Can't reproduce that (admittedly with asan SVN branch rather than the
> patchset):
>
> ./xgcc -B ./ -O2 -fasan -o a a.c -Wl,-rpath,/usr/src/gcc-asan/obj/x86_64-unknown-linux-gnu/libsanitizer/asan/.libs/ \
>                                  -L /usr/src/gcc-asan/obj/x86_64-unknown-linux-gnu/libsanitizer/asan/.libs/
> ./a
> =================================================================
> ==20614== ERROR: AddressSanitizer heap-use-after-free on address
> 0x7f7d8245afec at pc 0x4006f8 bp 0x7fff9beda4c0 sp 0x7fff9beda4b8
> READ of size 4 at 0x7f7d8245afec thread T0
>     #0 0x4006f7 (/usr/src/gcc-asan/obj/gcc/a+0x4006f7)
> 0x7f7d8245afec is located 4 bytes to the right of 40-byte region
> [0x7f7d8245afc0,0x7f7d8245afe8)
> freed by thread T0 here:
>     #0 0x7f7d82796585
>     #(/usr/src/gcc-asan/obj/x86_64-unknown-linux-gnu/libsanitizer/asan/.libs/libasan.so.0.0.0+0xf585)
>     #1 0x4006b5 (/usr/src/gcc-asan/obj/gcc/a+0x4006b5)
> previously allocated by thread T0 here:
>     #0 0x7f7d82796645
>     #(/usr/src/gcc-asan/obj/x86_64-unknown-linux-gnu/libsanitizer/asan/.libs/libasan.so.0.0.0+0xf645)
>     #1 0x4006aa (/usr/src/gcc-asan/obj/gcc/a+0x4006aa)
> Shadow byte and word:
>   0x1fefb048b5fd: fd
>   0x1fefb048b5f8: fd fd fd fd fd fd fd fd
> More shadow bytes:
>   0x1fefb048b5d8: fa fa fa fa fa fa fa fa
>   0x1fefb048b5e0: fa fa fa fa fa fa fa fa
>   0x1fefb048b5e8: fa fa fa fa fa fa fa fa
>   0x1fefb048b5f0: fa fa fa fa fa fa fa fa
> =>0x1fefb048b5f8: fd fd fd fd fd fd fd fd
>   0x1fefb048b600: fa fa fa fa fa fa fa fa
>   0x1fefb048b608: fa fa fa fa fa fa fa fa
>   0x1fefb048b610: fa fa fa fa fa fa fa fa
>   0x1fefb048b618: fa fa fa fa fa fa fa fa
> Stats: 0M malloced (0M for red zones) by 1 calls
> Stats: 0M realloced by 0 calls
> Stats: 0M freed by 1 calls
> Stats: 0M really freed by 0 calls
> Stats: 0M (128 full pages) mmaped in 1 calls
>   mmaps   by size class: 7:4095;
>   mallocs by size class: 7:1;
>   frees   by size class: 7:1;
>   rfrees  by size class:
> Stats: malloc large: 0 small slow: 1
> ==20614== ABORTING
>
>         Jakub

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 01/10] Initial import of asan from the Google branch into trunk
  2012-11-09 13:58       ` Jakub Jelinek
  2012-11-09 16:53         ` Xinliang David Li
@ 2012-11-09 17:13         ` Tobias Burnus
  1 sibling, 0 replies; 80+ messages in thread
From: Tobias Burnus @ 2012-11-09 17:13 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Dodji Seketeli, gcc-patches, dnovillo, wmi, davidxl,
	konstantin.s.serebryany

Jakub Jelinek wrote:
> On Fri, Nov 09, 2012 at 02:14:04PM +0100, Tobias Burnus wrote:
>> I tried the 01/10 to 10/10 patch series but it doesn't trigger for
>> the following test case:
[...]
> Can't reproduce that (admittedly with asan SVN branch rather than the patchset):

I can reproduce both; comparing the asan0 dumps and the two asan.c 
files, I came up with the following patch. If I apply it on top of 
Dodji's branch, it now also aborts.


--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -1536,10 +1536,10 @@ static void
asan_init_shadow_ptr_types (void)
{
asan_shadow_set = new_alias_set ();
- shadow_ptr_types[0] = build_distinct_type_copy (unsigned_char_type_node);
+ shadow_ptr_types[0] = build_distinct_type_copy (signed_char_type_node);
TYPE_ALIAS_SET (shadow_ptr_types[0]) = asan_shadow_set;
shadow_ptr_types[0] = build_pointer_type (shadow_ptr_types[0]);
- shadow_ptr_types[1] = build_distinct_type_copy (short_unsigned_type_node);
+ shadow_ptr_types[1] = build_distinct_type_copy (short_integer_type_node);
TYPE_ALIAS_SET (shadow_ptr_types[1]) = asan_shadow_set;
shadow_ptr_types[1] = build_pointer_type (shadow_ptr_types[1]);
}
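
The effect of the signedness change can be seen in a standalone sketch
of the check from the asan0 dump (shadow != 0 && (addr & 7) + size - 1
>= shadow).  This is an illustration only, with hypothetical function
names, and it assumes accesses of at most 8 bytes.  Poisoned shadow
bytes such as 0xfd and 0xfa are large when read as unsigned char, so
the comparison can never be true; read as signed char they are
negative, and the comparison holds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The check as emitted with an unsigned char shadow type.  */
static bool
check_unsigned (uint8_t shadow, uint64_t addr, int size)
{
  assert (size >= 1 && size <= 8);
  uint8_t last = (uint8_t) ((addr & 7) + (uint64_t) (size - 1));
  return shadow != 0 && last >= shadow;
}

/* The same check with a signed char shadow type, as in the patch.  */
static bool
check_signed (int8_t shadow, uint64_t addr, int size)
{
  assert (size >= 1 && size <= 8);
  int8_t last = (int8_t) ((addr & 7) + (uint64_t) (size - 1));
  return shadow != 0 && last >= shadow;
}
```

For the 4-byte read at 0x7f7d8245afec with shadow byte 0xfd (which is
-3 as a signed char), last is 7: check_unsigned compares 7 >= 253 and
does not report, while check_signed compares 7 >= -3 and does.  Both
variants agree on shadow values from 1 to 7, which describe a partly
addressable granule.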



Other remarks:

* libsanitizer: It currently installs under "lib" even on an
x86-64-gnu-linux system, where it should be "lib64"; that's probably
fixed automatically by enabling the multilib support. Maybe removing the
"#" before "#AM_ENABLE_MULTILIB" in libsanitizer/configure.ac is sufficient.

* invoke.texi: The fbranch-target-load-optimize2 and fauto-inc-dec lost 
the "-", additionally, there are several "--f…" items.

Tobias

PS: Does anyone have an idea why I cannot run the -faddress-sanitizer
executable on gcc20 of the GCC compile farm? It fails with the following
message, but "ulimit -v" is set to "unlimited".

==4784== ReserveShadowMemoryRange failed while trying to map 
0x20000001000 bytes. Perhaps you're using ulimit -v

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 01/10] Initial import of asan from the Google branch into trunk
  2012-11-09 13:14     ` Tobias Burnus
  2012-11-09 13:58       ` Jakub Jelinek
@ 2012-11-09 17:18       ` Wei Mi
  2012-11-12 11:09       ` [PATCH 03/11] Emit GIMPLE directly instead of gimplifying GENERIC Dodji Seketeli
  2012-11-12 11:20       ` [PATCH 01/10] Initial import of asan from the Google branch into trunk Dodji Seketeli
  3 siblings, 0 replies; 80+ messages in thread
From: Wei Mi @ 2012-11-09 17:18 UTC (permalink / raw)
  To: Tobias Burnus
  Cc: Dodji Seketeli, gcc-patches, dnovillo, jakub, davidxl,
	konstantin.s.serebryany

> Other issues:
>
> * libasan does not seem to be a multilib, at least I only find the 64bit
> version on x86-64-gnu-linux such that "-m32" compilation fails.
>

That is because the configure file was originally shared between asan and
tsan (tsan doesn't support 32-bit). Diego has suggested that I split
the configure file, so we will send a patch to support a 32-bit asan
once Dodji's patches are checked in to trunk.

Thanks,
Wei.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 03/11] Emit GIMPLE directly instead of gimplifying GENERIC.
  2012-11-09 13:14     ` Tobias Burnus
  2012-11-09 13:58       ` Jakub Jelinek
  2012-11-09 17:18       ` Wei Mi
@ 2012-11-12 11:09       ` Dodji Seketeli
  2012-11-12 11:20       ` [PATCH 01/10] Initial import of asan from the Google branch into trunk Dodji Seketeli
  3 siblings, 0 replies; 80+ messages in thread
From: Dodji Seketeli @ 2012-11-12 11:09 UTC (permalink / raw)
  To: Tobias Burnus; +Cc: gcc-patches, dnovillo, jakub, wmi, davidxl

[I am replying to several emails at once as I believe they are logically
connected]

Tobias Burnus <burnus@net-b.de> writes:

> I tried the 01/10 to 10/10 patch series but it doesn't trigger for the
> following test case:
>
> #include <stdlib.h>
> #include <stdio.h>
>
> int
> main() {
>   int *i;
>   i = malloc(10*sizeof(*i));
>   free(i);  /* <<< Free memory. */
>   i[10] = 5;  /* <<< out of boundary even if not freed. */
>   printf("%d\n", i[11]);  /* <<< out of boundary even if not freed. */
>   return 0;
> }
>
> (All of them are reported by Clang.) If I look at the dump (or
> assembler), I see the call to __asan_init, __asan_report_store4 and
> __asan_report_load4. However, when running the program ltrace only
> shows the calls to: __libc_start_main, __asan_init, malloc, free and
> printf. I haven't debugged why the condition is false [see attachment
> for the dump].

Right. As David Xinliang says ...

Xinliang David Li <davidxl@google.com> writes:

> It seems that my one line fix in asan branch (r192605) is not included
> in Dodji's patch set.

That's right.

... I meant to 'squash' this patch into ...

    From 72ed51c18dd269ee10fa1b70a919d77de498fc06 Mon Sep 17 00:00:00 2001
    From: Jakub Jelinek <jakub@redhat.com>
    Date: Fri, 2 Nov 2012 23:29:30 +0100
    Subject: [PATCH 03/11] Emit GIMPLE directly instead of gimplifying GENERIC.

    This patch cleans up the instrumentation code generation by emitting
    GIMPLE directly, as opposed to emitting GENERIC trees and then
    gimplifying them.  It also does some cleanups here and there.

            * Makefile.in (GTFILES): Add $(srcdir)/asan.c.
            (asan.o): Update the dependencies of asan.o.
            * asan.c (tm.h, tree.h, tm_p.h, basic-block.h, flags.h
            function.h, tree-inline.h, tree-dump.h, diagnostic.h, demangle.h,
            langhooks.h, ggc.h, cgraph.h, gimple.h): Remove these unused but
            included headers.
            (shadow_ptr_types): New variable.
            (report_error_func): Change is_store argument to bool, don't append
            newline to function name.
            (PROB_VERY_UNLIKELY, PROB_ALWAYS): Define.
            (build_check_stmt): Change is_store argument to bool.  Emit GIMPLE
            directly instead of creating trees and gimplifying them.  Mark
            the error reporting function as very unlikely.
            (instrument_derefs): Change is_store argument to bool.  Use
            int_size_in_bytes to compute size_in_bytes, simplify size check.
            Use build_fold_addr_expr instead of build_addr.
            (transform_statements): Adjust instrument_derefs caller.
            Use gimple_assign_single_p as stmt test.  Don't look at MEM refs
            in rhs2.
            (asan_init_shadow_ptr_types): New function.
            (asan_instrument): Don't push/pop gimplify context.
            Call asan_init_shadow_ptr_types if not yet initialized.
            * asan.h (ASAN_SHADOW_SHIFT): Adjust comment.

... this one, to comply with Joseph's comment (in another sub-thread)
which was requesting to avoid sending patches that are knowingly broken
for the sake of sending them chronologically; but I obviously forgot to
do so.

It's now fixed and here is the updated patch that combines both commits:

	* Makefile.in (GTFILES): Add $(srcdir)/asan.c.
	(asan.o): Update the dependencies of asan.o.
	* asan.c (tm.h, tree.h, tm_p.h, basic-block.h, flags.h
	function.h, tree-inline.h, tree-dump.h, diagnostic.h, demangle.h,
	langhooks.h, ggc.h, cgraph.h, gimple.h): Remove these unused but
	included headers.
	(shadow_ptr_types): New variable.
	(report_error_func): Change is_store argument to bool, don't append
	newline to function name.
	(PROB_VERY_UNLIKELY, PROB_ALWAYS): Define.
	(build_check_stmt): Change is_store argument to bool.  Emit GIMPLE
	directly instead of creating trees and gimplifying them.  Mark
	the error reporting function as very unlikely.
	(instrument_derefs): Change is_store argument to bool.  Use
	int_size_in_bytes to compute size_in_bytes, simplify size check.
	Use build_fold_addr_expr instead of build_addr.
	(transform_statements): Adjust instrument_derefs caller.
	Use gimple_assign_single_p as stmt test.  Don't look at MEM refs
	in rhs2.
	(asan_init_shadow_ptr_types): New function.
	(asan_instrument): Don't push/pop gimplify context.
	Call asan_init_shadow_ptr_types if not yet initialized.
	* asan.h (ASAN_SHADOW_SHIFT): Adjust comment.
---
 gcc/ChangeLog.asan |  28 ++++++
 gcc/Makefile.in    |   9 +-
 gcc/asan.c         | 284 +++++++++++++++++++++++++++++++----------------------
 gcc/asan.h         |   2 +-
 4 files changed, 200 insertions(+), 123 deletions(-)

diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan
index d13a584..0345ac7 100644
--- a/gcc/ChangeLog.asan
+++ b/gcc/ChangeLog.asan
@@ -1,4 +1,32 @@
 2012-10-11  Jakub Jelinek  <jakub@redhat.com>
+	    Xinliang David Li  <davidxl@google.com>
+	    Dodji Seketeli <dodji@redhat.com>
+
+	* Makefile.in (GTFILES): Add $(srcdir)/asan.c.
+	(asan.o): Update the dependencies of asan.o.
+	* asan.c (tm.h, tree.h, tm_p.h, basic-block.h, flags.h
+	function.h, tree-inline.h, tree-dump.h, diagnostic.h, demangle.h,
+	langhooks.h, ggc.h, cgraph.h, gimple.h): Remove these unused but
+	included headers.
+	(shadow_ptr_types): New variable.
+	(report_error_func): Change is_store argument to bool, don't append
+	newline to function name.
+	(PROB_VERY_UNLIKELY, PROB_ALWAYS): Define.
+	(build_check_stmt): Change is_store argument to bool.  Emit GIMPLE
+	directly instead of creating trees and gimplifying them.  Mark
+	the error reporting function as very unlikely.
+	(instrument_derefs): Change is_store argument to bool.  Use
+	int_size_in_bytes to compute size_in_bytes, simplify size check.
+	Use build_fold_addr_expr instead of build_addr.
+	(transform_statements): Adjust instrument_derefs caller.
+	Use gimple_assign_single_p as stmt test.  Don't look at MEM refs
+	in rhs2.
+	(asan_init_shadow_ptr_types): New function.
+	(asan_instrument): Don't push/pop gimplify context.
+	Call asan_init_shadow_ptr_types if not yet initialized.
+	* asan.h (ASAN_SHADOW_SHIFT): Adjust comment.
+
+2012-10-11  Jakub Jelinek  <jakub@redhat.com>
 
 	* toplev.c (process_options): Warn and turn off
 	-faddress-sanitizer if not supported by target.
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 469c72f..b9a1e74 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2208,10 +2208,10 @@ stor-layout.o : stor-layout.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
    $(TREE_H) $(PARAMS_H) $(FLAGS_H) $(FUNCTION_H) $(EXPR_H) $(RTL_H) \
    $(GGC_H) $(TM_P_H) $(TARGET_H) langhooks.h $(REGS_H) gt-stor-layout.h \
    $(DIAGNOSTIC_CORE_H) $(CGRAPH_H) $(TREE_INLINE_H) $(TREE_DUMP_H) $(GIMPLE_H)
-asan.o : asan.c asan.h $(CONFIG_H) pointer-set.h \
-   $(SYSTEM_H) $(TREE_H) $(GIMPLE_H) \
-   output.h $(DIAGNOSTIC_H) coretypes.h $(TREE_DUMP_H) $(FLAGS_H) \
-   tree-pretty-print.h $(TARGET_H)
+asan.o : asan.c asan.h $(CONFIG_H) $(SYSTEM_H) $(GIMPLE_H) \
+   output.h coretypes.h $(GIMPLE_PRETTY_PRINT_H) \
+   tree-iterator.h $(TREE_FLOW_H) $(TREE_PASS_H) \
+   $(TARGET_H)
 tree-ssa-tail-merge.o: tree-ssa-tail-merge.c \
    $(SYSTEM_H) $(CONFIG_H) coretypes.h $(TM_H) $(BITMAP_H) \
    $(FLAGS_H) $(TM_P_H) $(BASIC_BLOCK_H) \
@@ -3725,6 +3725,7 @@ GTFILES = $(CPP_ID_DATA_H) $(srcdir)/input.h $(srcdir)/coretypes.h \
   $(srcdir)/lto-streamer.h \
   $(srcdir)/target-globals.h \
   $(srcdir)/ipa-inline.h \
+  $(srcdir)/asan.c \
   @all_gtfiles@
 
 # Compute the list of GT header files from the corresponding C sources,
diff --git a/gcc/asan.c b/gcc/asan.c
index 9655b11..04b11e5 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -22,24 +22,10 @@ along with GCC; see the file COPYING3.  If not see
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
-#include "tm.h"
-#include "tree.h"
-#include "tm_p.h"
-#include "basic-block.h"
-#include "flags.h"
-#include "function.h"
-#include "tree-inline.h"
 #include "gimple.h"
 #include "tree-iterator.h"
 #include "tree-flow.h"
-#include "tree-dump.h"
 #include "tree-pass.h"
-#include "diagnostic.h"
-#include "demangle.h"
-#include "langhooks.h"
-#include "ggc.h"
-#include "cgraph.h"
-#include "gimple.h"
 #include "asan.h"
 #include "gimple-pretty-print.h"
 #include "target.h"
@@ -79,18 +65,22 @@ along with GCC; see the file COPYING3.  If not see
  to create redzones for stack and global object and poison them.
 */
 
+/* Pointer types to 1 resp. 2 byte integers in shadow memory.  A separate
+   alias set is used for all shadow memory accesses.  */
+static GTY(()) tree shadow_ptr_types[2];
+
 /* Construct a function tree for __asan_report_{load,store}{1,2,4,8,16}.
    IS_STORE is either 1 (for a store) or 0 (for a load).
    SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
 
 static tree
-report_error_func (int is_store, int size_in_bytes)
+report_error_func (bool is_store, int size_in_bytes)
 {
   tree fn_type;
   tree def;
   char name[100];
 
-  sprintf (name, "__asan_report_%s%d\n",
+  sprintf (name, "__asan_report_%s%d",
            is_store ? "store" : "load", size_in_bytes);
   fn_type = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE);
   def = build_fn_decl (name, fn_type);
@@ -118,6 +108,9 @@ asan_init_func (void)
 }
 
 
+#define PROB_VERY_UNLIKELY	(REG_BR_PROB_BASE / 2000 - 1)
+#define PROB_ALWAYS		(REG_BR_PROB_BASE)
+
 /* Instrument the memory access instruction BASE.
    Insert new statements before ITER.
    LOCATION is source code location.
@@ -127,21 +120,17 @@ asan_init_func (void)
 static void
 build_check_stmt (tree base,
                   gimple_stmt_iterator *iter,
-                  location_t location, int is_store, int size_in_bytes)
+                  location_t location, bool is_store, int size_in_bytes)
 {
   gimple_stmt_iterator gsi;
   basic_block cond_bb, then_bb, join_bb;
   edge e;
-  tree cond, t, u;
-  tree base_addr;
-  tree shadow_value;
+  tree t, base_addr, shadow;
   gimple g;
-  gimple_seq seq, stmts;
-  tree shadow_type = size_in_bytes == 16 ?
-      short_integer_type_node : char_type_node;
-  tree shadow_ptr_type = build_pointer_type (shadow_type);
-  tree uintptr_type = lang_hooks.types.type_for_mode (ptr_mode,
-                                                      /*unsignedp=*/true);
+  tree shadow_ptr_type = shadow_ptr_types[size_in_bytes == 16 ? 1 : 0];
+  tree shadow_type = TREE_TYPE (shadow_ptr_type);
+  tree uintptr_type
+    = build_nonstandard_integer_type (TYPE_PRECISION (TREE_TYPE (base)), 1);
 
   /* We first need to split the current basic block, and start altering
      the CFG.  This allows us to insert the statements we're about to
@@ -166,14 +155,15 @@ build_check_stmt (tree base,
 
   /* Create the bb that contains the crash block.  */
   then_bb = create_empty_bb (cond_bb);
-  make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE);
+  e = make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE);
+  e->probability = PROB_VERY_UNLIKELY;
   make_single_succ_edge (then_bb, join_bb, EDGE_FALLTHRU);
 
   /* Mark the pseudo-fallthrough edge from cond_bb to join_bb.  */
   e = find_edge (cond_bb, join_bb);
   e->flags = EDGE_FALSE_VALUE;
   e->count = cond_bb->count;
-  e->probability = REG_BR_PROB_BASE;
+  e->probability = PROB_ALWAYS - PROB_VERY_UNLIKELY;
 
   /* Update dominance info.  Note that bb_join's data was
      updated by split_block.  */
@@ -183,75 +173,125 @@ build_check_stmt (tree base,
       set_immediate_dominator (CDI_DOMINATORS, join_bb, cond_bb);
     }
 
-  base_addr = create_tmp_reg (uintptr_type, "__asan_base_addr");
+  base = unshare_expr (base);
 
-  seq = NULL; 
-  t = fold_convert_loc (location, uintptr_type,
-                        unshare_expr (base));
-  t = force_gimple_operand (t, &stmts, false, NULL_TREE);
-  gimple_seq_add_seq (&seq, stmts);
-  g = gimple_build_assign (base_addr, t);
+  gsi = gsi_last_bb (cond_bb);
+  g = gimple_build_assign_with_ops (TREE_CODE (base),
+				    make_ssa_name (TREE_TYPE (base), NULL),
+				    base, NULL_TREE);
   gimple_set_location (g, location);
-  gimple_seq_add_stmt (&seq, g);
+  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
 
-  /* Build
-     (base_addr >> ASAN_SHADOW_SHIFT) | targetm.asan_shadow_offset ().  */
-
-  t = build2 (RSHIFT_EXPR, uintptr_type, base_addr,
-	      build_int_cst (uintptr_type, ASAN_SHADOW_SHIFT));
-  t = build2 (PLUS_EXPR, uintptr_type, t,
-	      build_int_cst (uintptr_type, targetm.asan_shadow_offset ()));
-  t = build1 (INDIRECT_REF, shadow_type,
-              build1 (VIEW_CONVERT_EXPR, shadow_ptr_type, t));
-  t = force_gimple_operand (t, &stmts, false, NULL_TREE);
-  gimple_seq_add_seq (&seq, stmts);
-  shadow_value = create_tmp_reg (shadow_type, "__asan_shadow");
-  g = gimple_build_assign (shadow_value, t);
+  g = gimple_build_assign_with_ops (NOP_EXPR,
+				    make_ssa_name (uintptr_type, NULL),
+				    gimple_assign_lhs (g), NULL_TREE);
   gimple_set_location (g, location);
-  gimple_seq_add_stmt (&seq, g);
-  t = build2 (NE_EXPR, boolean_type_node, shadow_value,
-              build_int_cst (shadow_type, 0));
-  if (size_in_bytes < 8)
-    {
+  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+  base_addr = gimple_assign_lhs (g);
 
-      /* Slow path for 1-, 2- and 4- byte accesses.
-         Build ((base_addr & 7) + (size_in_bytes - 1)) >= shadow_value.  */
+  /* Build
+     (base_addr >> ASAN_SHADOW_SHIFT) + targetm.asan_shadow_offset ().  */
 
-      u = build2 (BIT_AND_EXPR, uintptr_type,
-                  base_addr,
-                  build_int_cst (uintptr_type, 7));
-      u = build1 (CONVERT_EXPR, shadow_type, u);
-      u = build2 (PLUS_EXPR, shadow_type, u,
-                  build_int_cst (shadow_type, size_in_bytes - 1));
-      u = build2 (GE_EXPR, uintptr_type, u, shadow_value);
-    }
-  else
-      u = build_int_cst (boolean_type_node, 1);
-  t = build2 (TRUTH_AND_EXPR, boolean_type_node, t, u);
-  t = force_gimple_operand (t, &stmts, false, NULL_TREE);
-  gimple_seq_add_seq (&seq, stmts);
-  cond = create_tmp_reg (boolean_type_node, "__asan_crash_cond");
-  g = gimple_build_assign  (cond, t);
+  t = build_int_cst (uintptr_type, ASAN_SHADOW_SHIFT);
+  g = gimple_build_assign_with_ops (RSHIFT_EXPR,
+				    make_ssa_name (uintptr_type, NULL),
+				    base_addr, t);
   gimple_set_location (g, location);
-  gimple_seq_add_stmt (&seq, g);
-  g = gimple_build_cond (NE_EXPR, cond, boolean_false_node, NULL_TREE,
-                         NULL_TREE);
+  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+
+  t = build_int_cst (uintptr_type, targetm.asan_shadow_offset ());
+  g = gimple_build_assign_with_ops (PLUS_EXPR,
+				    make_ssa_name (uintptr_type, NULL),
+				    gimple_assign_lhs (g), t);
   gimple_set_location (g, location);
-  gimple_seq_add_stmt (&seq, g);
+  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
 
-  /* Generate call to the run-time library (e.g. __asan_report_load8).  */
+  g = gimple_build_assign_with_ops (NOP_EXPR,
+				    make_ssa_name (shadow_ptr_type, NULL),
+				    gimple_assign_lhs (g), NULL_TREE);
+  gimple_set_location (g, location);
+  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
 
-  gsi = gsi_last_bb (cond_bb);
-  gsi_insert_seq_after (&gsi, seq, GSI_CONTINUE_LINKING);
-  seq = NULL; 
-  g = gimple_build_call (report_error_func (is_store, size_in_bytes),
-                         1, base_addr);
-  gimple_seq_add_stmt (&seq, g);
+  t = build2 (MEM_REF, shadow_type, gimple_assign_lhs (g),
+	      build_int_cst (shadow_ptr_type, 0));
+  g = gimple_build_assign_with_ops (MEM_REF,
+				    make_ssa_name (shadow_type, NULL),
+				    t, NULL_TREE);
+  gimple_set_location (g, location);
+  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+  shadow = gimple_assign_lhs (g);
+
+  if (size_in_bytes < 8)
+    {
+      /* Slow path for 1, 2 and 4 byte accesses.
+	 Test (shadow != 0)
+	      & ((base_addr & 7) + (size_in_bytes - 1)) >= shadow).  */
+      g = gimple_build_assign_with_ops (NE_EXPR,
+					make_ssa_name (boolean_type_node,
+						       NULL),
+					shadow,
+					build_int_cst (shadow_type, 0));
+      gimple_set_location (g, location);
+      gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+      t = gimple_assign_lhs (g);
+
+      g = gimple_build_assign_with_ops (BIT_AND_EXPR,
+					make_ssa_name (uintptr_type,
+						       NULL),
+					base_addr,
+					build_int_cst (uintptr_type, 7));
+      gimple_set_location (g, location);
+      gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+
+      g = gimple_build_assign_with_ops (NOP_EXPR,
+					make_ssa_name (shadow_type,
+						       NULL),
+					gimple_assign_lhs (g), NULL_TREE);
+      gimple_set_location (g, location);
+      gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+
+      if (size_in_bytes > 1)
+	{
+	  g = gimple_build_assign_with_ops (PLUS_EXPR,
+					    make_ssa_name (shadow_type,
+							   NULL),
+					    gimple_assign_lhs (g),
+					    build_int_cst (shadow_type,
+							   size_in_bytes - 1));
+	  gimple_set_location (g, location);
+	  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+	}
+
+      g = gimple_build_assign_with_ops (GE_EXPR,
+					make_ssa_name (boolean_type_node,
+						       NULL),
+					gimple_assign_lhs (g),
+					shadow);
+      gimple_set_location (g, location);
+      gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+
+      g = gimple_build_assign_with_ops (BIT_AND_EXPR,
+					make_ssa_name (boolean_type_node,
+						       NULL),
+					t, gimple_assign_lhs (g));
+      gimple_set_location (g, location);
+      gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+      t = gimple_assign_lhs (g);
+    }
+  else
+    t = shadow;
 
-  /* Insert the check code in the THEN block.  */
+  g = gimple_build_cond (NE_EXPR, t, build_int_cst (TREE_TYPE (t), 0),
+			 NULL_TREE, NULL_TREE);
+  gimple_set_location (g, location);
+  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
 
+  /* Generate call to the run-time library (e.g. __asan_report_load8).  */
   gsi = gsi_start_bb (then_bb);
-  gsi_insert_seq_after (&gsi, seq, GSI_CONTINUE_LINKING);
+  g = gimple_build_call (report_error_func (is_store, size_in_bytes),
+			 1, base_addr);
+  gimple_set_location (g, location);
+  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
 
   *iter = gsi_start_bb (join_bb);
 }
@@ -262,14 +302,12 @@ build_check_stmt (tree base,
 
 static void
 instrument_derefs (gimple_stmt_iterator *iter, tree t,
-                  location_t location, int is_store)
+                  location_t location, bool is_store)
 {
   tree type, base;
-  int size_in_bytes;
+  HOST_WIDE_INT size_in_bytes;
 
   type = TREE_TYPE (t);
-  if (type == error_mark_node)
-    return;
   switch (TREE_CODE (t))
     {
     case ARRAY_REF:
@@ -280,25 +318,25 @@ instrument_derefs (gimple_stmt_iterator *iter, tree t,
     default:
       return;
     }
-  size_in_bytes = tree_low_cst (TYPE_SIZE (type), 0) / BITS_PER_UNIT;
-  if (size_in_bytes != 1 && size_in_bytes != 2 &&
-      size_in_bytes != 4 && size_in_bytes != 8 && size_in_bytes != 16)
-      return;
-  {
-    /* For now just avoid instrumenting bit field acceses.
+
+  size_in_bytes = int_size_in_bytes (type);
+  if ((size_in_bytes & (size_in_bytes - 1)) != 0
+      || (unsigned HOST_WIDE_INT) size_in_bytes - 1 >= 16)
+    return;
+
+  /* For now just avoid instrumenting bit field acceses.
      Fixing it is doable, but expected to be messy.  */
 
-    HOST_WIDE_INT bitsize, bitpos;
-    tree offset;
-    enum machine_mode mode;
-    int volatilep = 0, unsignedp = 0;
-    get_inner_reference (t, &bitsize, &bitpos, &offset,
-                         &mode, &unsignedp, &volatilep, false);
-    if (bitpos != 0 || bitsize != size_in_bytes * BITS_PER_UNIT)
-        return;
-  }
-
-  base = build_addr (t, current_function_decl);
+  HOST_WIDE_INT bitsize, bitpos;
+  tree offset;
+  enum machine_mode mode;
+  int volatilep = 0, unsignedp = 0;
+  get_inner_reference (t, &bitsize, &bitpos, &offset,
+		       &mode, &unsignedp, &volatilep, false);
+  if (bitpos != 0 || bitsize != size_in_bytes * BITS_PER_UNIT)
+    return;
+
+  base = build_fold_addr_expr (t);
   build_check_stmt (base, iter, location, is_store, size_in_bytes);
 }
 
@@ -314,7 +352,6 @@ transform_statements (void)
   basic_block bb;
   gimple_stmt_iterator i;
   int saved_last_basic_block = last_basic_block;
-  enum gimple_rhs_class grhs_class;
 
   FOR_EACH_BB (bb)
     {
@@ -322,16 +359,12 @@ transform_statements (void)
       for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
         {
           gimple s = gsi_stmt (i);
-          if (gimple_code (s) != GIMPLE_ASSIGN)
-              continue;
+          if (!gimple_assign_single_p (s))
+	    continue;
           instrument_derefs (&i, gimple_assign_lhs (s),
-                             gimple_location (s), 1);
+                             gimple_location (s), true);
           instrument_derefs (&i, gimple_assign_rhs1 (s),
-                             gimple_location (s), 0);
-          grhs_class = get_gimple_rhs_class (gimple_assign_rhs_code (s));
-          if (grhs_class == GIMPLE_BINARY_RHS)
-            instrument_derefs (&i, gimple_assign_rhs2 (s),
-                               gimple_location (s), 0);
+                             gimple_location (s), false);
         }
     }
 }
@@ -351,15 +384,28 @@ asan_finish_file (void)
                              MAX_RESERVED_INIT_PRIORITY - 1);
 }
 
+/* Initialize shadow_ptr_types array.  */
+
+static void
+asan_init_shadow_ptr_types (void)
+{
+  alias_set_type set = new_alias_set ();
+  shadow_ptr_types[0] = build_distinct_type_copy (signed_char_type_node);
+  TYPE_ALIAS_SET (shadow_ptr_types[0]) = set;
+  shadow_ptr_types[0] = build_pointer_type (shadow_ptr_types[0]);
+  shadow_ptr_types[1] = build_distinct_type_copy (short_integer_type_node);
+  TYPE_ALIAS_SET (shadow_ptr_types[1]) = set;
+  shadow_ptr_types[1] = build_pointer_type (shadow_ptr_types[1]);
+}
+
 /* Instrument the current function.  */
 
 static unsigned int
 asan_instrument (void)
 {
-  struct gimplify_ctx gctx;
-  push_gimplify_context (&gctx);
+  if (shadow_ptr_types[0] == NULL_TREE)
+    asan_init_shadow_ptr_types ();
   transform_statements ();
-  pop_gimplify_context (NULL);
   return 0;
 }
 
@@ -386,6 +432,8 @@ struct gimple_opt_pass pass_asan =
   0,                                    /* properties_destroyed  */
   0,                                    /* todo_flags_start  */
   TODO_verify_flow | TODO_verify_stmts
-  | TODO_update_ssa    /* todo_flags_finish  */
+  | TODO_update_ssa			/* todo_flags_finish  */
  }
 };
+
+#include "gt-asan.h"
diff --git a/gcc/asan.h b/gcc/asan.h
index 699820b..0d9ab8b 100644
--- a/gcc/asan.h
+++ b/gcc/asan.h
@@ -24,7 +24,7 @@ along with GCC; see the file COPYING3.  If not see
 extern void asan_finish_file(void);
 
 /* Shadow memory is found at
-   (address >> ASAN_SHADOW_SHIFT) | targetm.asan_shadow_offset ().  */
+   (address >> ASAN_SHADOW_SHIFT) + targetm.asan_shadow_offset ().  */
 #define ASAN_SHADOW_SHIFT	3
 
 #endif /* TREE_ASAN */
-- 
1.7.11.7


-- 
		Dodji

^ permalink raw reply	[flat|nested] 80+ messages in thread
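For readers following the thread, the check that the reworked
build_check_stmt emits as GIMPLE can be modelled in plain C.  This is
only a sketch under stated assumptions: the names shadow_address,
size_is_instrumentable and access_is_reported are made up for
illustration, and the shadow value is passed in as a parameter instead
of being loaded from shadow memory.

```c
#include <assert.h>
#include <stdint.h>

#define SHADOW_SHIFT 3	/* Mirrors ASAN_SHADOW_SHIFT from asan.h.  */

/* Shadow address for ADDR, given a target-specific OFFSET:
   (addr >> SHADOW_SHIFT) + offset.  */
static uint64_t
shadow_address (uint64_t addr, uint64_t offset)
{
  return (addr >> SHADOW_SHIFT) + offset;
}

/* Size filter modelled on the one in instrument_derefs: accept only
   1, 2, 4, 8 and 16.  The unsigned comparison also rejects zero and
   negative sizes, because (uint64_t) size - 1 wraps to a huge value.  */
static int
size_is_instrumentable (int64_t size)
{
  if ((size & (size - 1)) != 0)
    return 0;
  return (uint64_t) size - 1 < 16;
}

/* Return nonzero if an access of SIZE bytes at ADDR would be reported,
   given the SHADOW value covering it (hypothetical helper).  */
static int
access_is_reported (uint64_t addr, int size, int shadow)
{
  assert (size_is_instrumentable (size));

  /* 8 and 16 byte accesses: any nonzero shadow value is an error.  */
  if (size >= 8)
    return shadow != 0;

  /* Slow path for 1, 2 and 4 byte accesses:
     (shadow != 0) & ((addr & 7) + (size - 1) >= shadow).  */
  int last = (int) (addr & 7) + (size - 1);
  return (shadow != 0) & (last >= shadow);
}
```

A shadow value of 4 here stands for "only the first 4 bytes of this
8-byte granule are addressable", so a 4-byte access at offset 0 passes
and the same access at offset 4 is reported.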

* Re: [PATCH 02/10] Initial asan cleanups
  2012-11-06 17:04     ` Diego Novillo
@ 2012-11-12 11:12       ` Dodji Seketeli
  0 siblings, 0 replies; 80+ messages in thread
From: Dodji Seketeli @ 2012-11-12 11:12 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc-patches, jakub, wmi, davidxl

> On 2012-11-02 15:57 , Dodji Seketeli wrote:
> 
> >   /* AddressSanitizer, a fast memory error detector.
> > -   Copyright (C) 2011 Free Software Foundation, Inc.
> > +   Copyright (C) 2011, 2012 Free Software Foundation, Inc.
> 
> I *think* we should only mention 2012, but I don't know if code in
> branches counts for the copyright years.

I have updated this to 2012 only.

> > +  /* Address Sanitizer needs porting to each target architecture.  */
> > +  if (flag_asan && targetm.asan_shadow_offset == NULL)
> > +    {
> > +      warning (0, "-fasan not supported for this target");
> 
> Hm, ASAN's flag is now -fsanitizer=[asan,tsan,memory] or some such.

Oh, right.  Sorry, this flew under my radar.  In any case, -fasan is
wrong.  It should at least be -faddress-sanitizer, as in the rest of the
current patch set.  So I am updating the tree accordingly.

> We will need to make that change.  But it can wait until after the
> initial port is in trunk.

I took the opportunity to just update the message to
-faddress-sanitizer for now.  If that is not OK, please let me know.

> This patch is OK.

Thanks.  Below is what I have in my tree.


	* toplev.c (process_options): Warn and turn off
	-faddress-sanitizer if not supported by target.
	* asan.c: Include target.h.
	(asan_scale, asan_offset_log_32, asan_offset_log_64,
	asan_offset_log): Removed.
	(build_check_stmt): Use ASAN_SHADOW_SHIFT and
	targetm.asan_shadow_offset ().
	(asan_instrument): Don't initialize asan_offset_log.
	* asan.h (ASAN_SHADOW_SHIFT): Define.
	* target.def (TARGET_ASAN_SHADOW_OFFSET): New hook.
	* doc/tm.texi.in (TARGET_ASAN_SHADOW_OFFSET): Add it.
	* doc/tm.texi: Regenerated.
	* Makefile.in (asan.o): Depend on $(TARGET_H).
	* config/i386/i386.c (ix86_asan_shadow_offset): New function.
	(TARGET_ASAN_SHADOW_OFFSET): Define.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192372 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.asan     | 18 ++++++++++++++++++
 gcc/Makefile.in        |  2 +-
 gcc/asan.c             | 25 ++++++-------------------
 gcc/asan.h             |  6 +++++-
 gcc/config/i386/i386.c | 11 +++++++++++
 gcc/doc/tm.texi        |  6 ++++++
 gcc/doc/tm.texi.in     |  2 ++
 gcc/target.def         | 11 +++++++++++
 gcc/toplev.c           |  7 +++++++
 9 files changed, 67 insertions(+), 21 deletions(-)

diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan
index 704aa61..d13a584 100644
--- a/gcc/ChangeLog.asan
+++ b/gcc/ChangeLog.asan
@@ -1,3 +1,21 @@
+2012-10-11  Jakub Jelinek  <jakub@redhat.com>
+
+	* toplev.c (process_options): Warn and turn off
+	-faddress-sanitizer if not supported by target.
+	* asan.c: Include target.h.
+	(asan_scale, asan_offset_log_32, asan_offset_log_64,
+	asan_offset_log): Removed.
+	(build_check_stmt): Use ASAN_SHADOW_SHIFT and
+	targetm.asan_shadow_offset ().
+	(asan_instrument): Don't initialize asan_offset_log.
+	* asan.h (ASAN_SHADOW_SHIFT): Define.
+	* target.def (TARGET_ASAN_SHADOW_OFFSET): New hook.
+	* doc/tm.texi.in (TARGET_ASAN_SHADOW_OFFSET): Add it.
+	* doc/tm.texi: Regenerated.
+	* Makefile.in (asan.o): Depend on $(TARGET_H).
+	* config/i386/i386.c (ix86_asan_shadow_offset): New function.
+	(TARGET_ASAN_SHADOW_OFFSET): Define.
+
 2012-10-10  Wei Mi <wmi@google.com>
 	    Diego Novillo <dnovillo@google.com>
 	    Dodji Seketeli <dodji@redhat.com>
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index dde9b50..469c72f 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2211,7 +2211,7 @@ stor-layout.o : stor-layout.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
 asan.o : asan.c asan.h $(CONFIG_H) pointer-set.h \
    $(SYSTEM_H) $(TREE_H) $(GIMPLE_H) \
    output.h $(DIAGNOSTIC_H) coretypes.h $(TREE_DUMP_H) $(FLAGS_H) \
-   tree-pretty-print.h
+   tree-pretty-print.h $(TARGET_H)
 tree-ssa-tail-merge.o: tree-ssa-tail-merge.c \
    $(SYSTEM_H) $(CONFIG_H) coretypes.h $(TM_H) $(BITMAP_H) \
    $(FLAGS_H) $(TM_P_H) $(BASIC_BLOCK_H) \
diff --git a/gcc/asan.c b/gcc/asan.c
index 4b07c96..9655b11 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -1,5 +1,5 @@
 /* AddressSanitizer, a fast memory error detector.
-   Copyright (C) 2011 Free Software Foundation, Inc.
+   Copyright (C) 2012 Free Software Foundation, Inc.
    Contributed by Kostya Serebryany <kcc@google.com>
 
 This file is part of GCC.
@@ -42,6 +42,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple.h"
 #include "asan.h"
 #include "gimple-pretty-print.h"
+#include "target.h"
 
 /*
  AddressSanitizer finds out-of-bounds and use-after-free bugs 
@@ -78,15 +79,6 @@ along with GCC; see the file COPYING3.  If not see
  to create redzones for stack and global object and poison them.
 */
 
-/* The shadow address is computed as (X>>asan_scale) + (1<<asan_offset_log).
- We may want to add command line flags to change these values.  */
-
-static const int asan_scale = 3;
-static const int asan_offset_log_32 = 29;
-static const int asan_offset_log_64 = 44;
-static int asan_offset_log;
-
-
 /* Construct a function tree for __asan_report_{load,store}{1,2,4,8,16}.
    IS_STORE is either 1 (for a store) or 0 (for a load).
    SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
@@ -202,15 +194,13 @@ build_check_stmt (tree base,
   gimple_set_location (g, location);
   gimple_seq_add_stmt (&seq, g);
 
-  /* Build (base_addr >> asan_scale) + (1 << asan_offset_log).  */
+  /* Build
+     (base_addr >> ASAN_SHADOW_SHIFT) | targetm.asan_shadow_offset ().  */
 
   t = build2 (RSHIFT_EXPR, uintptr_type, base_addr,
-              build_int_cst (uintptr_type, asan_scale));
+	      build_int_cst (uintptr_type, ASAN_SHADOW_SHIFT));
   t = build2 (PLUS_EXPR, uintptr_type, t,
-              build2 (LSHIFT_EXPR, uintptr_type,
-                      build_int_cst (uintptr_type, 1),
-                      build_int_cst (uintptr_type, asan_offset_log)
-                     ));
+	      build_int_cst (uintptr_type, targetm.asan_shadow_offset ()));
   t = build1 (INDIRECT_REF, shadow_type,
               build1 (VIEW_CONVERT_EXPR, shadow_ptr_type, t));
   t = force_gimple_operand (t, &stmts, false, NULL_TREE);
@@ -367,9 +357,6 @@ static unsigned int
 asan_instrument (void)
 {
   struct gimplify_ctx gctx;
-  tree uintptr_type = lang_hooks.types.type_for_mode (ptr_mode, true);
-  int is_64 = tree_low_cst (TYPE_SIZE (uintptr_type), 0) == 64;
-  asan_offset_log = is_64 ? asan_offset_log_64 : asan_offset_log_32;
   push_gimplify_context (&gctx);
   transform_statements ();
   pop_gimplify_context (NULL);
diff --git a/gcc/asan.h b/gcc/asan.h
index 590cf35..699820b 100644
--- a/gcc/asan.h
+++ b/gcc/asan.h
@@ -1,5 +1,5 @@
 /* AddressSanitizer, a fast memory error detector.
-   Copyright (C) 2011 Free Software Foundation, Inc.
+   Copyright (C) 2011, 2012 Free Software Foundation, Inc.
    Contributed by Kostya Serebryany <kcc@google.com>
 
 This file is part of GCC.
@@ -23,4 +23,8 @@ along with GCC; see the file COPYING3.  If not see
 
 extern void asan_finish_file(void);
 
+/* Shadow memory is found at
+   (address >> ASAN_SHADOW_SHIFT) | targetm.asan_shadow_offset ().  */
+#define ASAN_SHADOW_SHIFT	3
+
 #endif /* TREE_ASAN */
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 3a6f494..01c7a11 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5186,6 +5186,14 @@ ix86_legitimate_combined_insn (rtx insn)
   return true;
 }
 \f
+/* Implement the TARGET_ASAN_SHADOW_OFFSET hook.  */
+
+static unsigned HOST_WIDE_INT
+ix86_asan_shadow_offset (void)
+{
+  return (unsigned HOST_WIDE_INT) 1 << (TARGET_LP64 ? 44 : 29);
+}
+\f
 /* Argument support functions.  */
 
 /* Return true when register may be used to pass function parameters.  */
@@ -42012,6 +42020,9 @@ ix86_memmodel_check (unsigned HOST_WIDE_INT val)
 #undef TARGET_LEGITIMATE_COMBINED_INSN
 #define TARGET_LEGITIMATE_COMBINED_INSN ix86_legitimate_combined_insn
 
+#undef TARGET_ASAN_SHADOW_OFFSET
+#define TARGET_ASAN_SHADOW_OFFSET ix86_asan_shadow_offset
+
 #undef TARGET_GIMPLIFY_VA_ARG_EXPR
 #define TARGET_GIMPLIFY_VA_ARG_EXPR ix86_gimplify_va_arg
 
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index dbf6c20..eeb3f08 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -11357,6 +11357,12 @@ MIPS, where add-immediate takes a 16-bit signed value,
 is zero, which disables this optimization.
 @end deftypevr
 
+@deftypefn {Target Hook} {unsigned HOST_WIDE_INT} TARGET_ASAN_SHADOW_OFFSET (void)
+Return the offset bitwise ored into shifted address to get corresponding
+Address Sanitizer shadow memory address.  NULL if Address Sanitizer is not
+supported by the target.
+@end deftypefn
+
 @deftypefn {Target Hook} {unsigned HOST_WIDE_INT} TARGET_MEMMODEL_CHECK (unsigned HOST_WIDE_INT @var{val})
 Validate target specific memory model mask bits. When NULL no target specific
 memory model bits are allowed.
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 575cc73..2c9cb10 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -11199,6 +11199,8 @@ MIPS, where add-immediate takes a 16-bit signed value,
 is zero, which disables this optimization.
 @end deftypevr
 
+@hook TARGET_ASAN_SHADOW_OFFSET
+
 @hook TARGET_MEMMODEL_CHECK
 Validate target specific memory model mask bits. When NULL no target specific
 memory model bits are allowed.
diff --git a/gcc/target.def b/gcc/target.def
index 2801aea..6d00262 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -2056,6 +2056,17 @@ DEFHOOK
  "",
  unsigned HOST_WIDE_INT, (unsigned HOST_WIDE_INT val), NULL)
 
+/* Defines an offset bitwise ored into shifted address to get corresponding
+   Address Sanitizer shadow address, or -1 if Address Sanitizer is not
+   supported by the target.  */
+DEFHOOK
+(asan_shadow_offset,
+ "Return the offset bitwise ored into shifted address to get corresponding\n\
+Address Sanitizer shadow memory address.  NULL if Address Sanitizer is not\n\
+supported by the target.",
+ unsigned HOST_WIDE_INT, (void),
+ NULL)
+
 /* Functions relating to calls - argument passing, returns, etc.  */
 /* Members of struct call have no special macro prefix.  */
 HOOK_VECTOR (TARGET_CALLS, calls)
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 3ca0736..d9dfb2a 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1541,6 +1541,13 @@ process_options (void)
       flag_omit_frame_pointer = 0;
     }
 
+  /* Address Sanitizer needs porting to each target architecture.  */
+  if (flag_asan && targetm.asan_shadow_offset == NULL)
+    {
+      warning (0, "-faddress-sanitizer not supported for this target");
+      flag_asan = 0;
+    }
+
   /* Enable -Werror=coverage-mismatch when -Werror and -Wno-error
      have not been set.  */
   if (!global_options_set.x_warnings_are_errors
-- 
1.7.11.7


-- 
		Dodji

^ permalink raw reply	[flat|nested] 80+ messages in thread
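The hook arrangement in the patch above (a function pointer that is
NULL on targets without asan support, checked once in process_options)
can be sketched outside of GCC roughly as follows.  Everything here is
an illustrative stand-in, not GCC's actual target machinery: struct
target_hooks, pointers_are_64bit, x86_shadow_offset and
check_asan_support are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical miniature of the target vector: one optional hook.  */
typedef uint64_t (*shadow_offset_hook) (void);

struct target_hooks
{
  shadow_offset_hook asan_shadow_offset;	/* NULL if unsupported.  */
};

/* Stand-in for TARGET_LP64.  */
static int pointers_are_64bit = 1;

/* Modelled on ix86_asan_shadow_offset: 1 << 44 for 64-bit pointers,
   1 << 29 otherwise.  */
static uint64_t
x86_shadow_offset (void)
{
  return (uint64_t) 1 << (pointers_are_64bit ? 44 : 29);
}

/* Modelled on the new check in process_options: warn and turn the
   flag off when the target provides no hook.  Returns the possibly
   updated flag value.  */
static int
check_asan_support (const struct target_hooks *t, int flag_asan)
{
  assert (t != NULL);
  if (flag_asan && t->asan_shadow_offset == NULL)
    {
      fprintf (stderr,
	       "warning: -faddress-sanitizer not supported for this target\n");
      return 0;
    }
  return flag_asan;
}
```

Testing the pointer against NULL, rather than calling the hook and
looking for a sentinel return value, keeps the "unsupported" case out
of the offset's value range.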

* Re: [PATCH 01/10] Initial import of asan from the Google branch into trunk
  2012-11-09 13:14     ` Tobias Burnus
                         ` (2 preceding siblings ...)
  2012-11-12 11:09       ` [PATCH 03/11] Emit GIMPLE directly instead of gimplifying GENERIC Dodji Seketeli
@ 2012-11-12 11:20       ` Dodji Seketeli
  3 siblings, 0 replies; 80+ messages in thread
From: Dodji Seketeli @ 2012-11-12 11:20 UTC (permalink / raw)
  To: Tobias Burnus; +Cc: gcc-patches, dnovillo, jakub, wmi, davidxl

Tobias Burnus <burnus@net-b.de> writes:

> * -fno-address-sanitizer doesn't work (it does in Clang); it is
> explicitly disabled via RejectNegative in gcc/common.opt
>

Fixed in common.opt by removing the RejectNegative.

I am thus sending the updated patch.

	* common.opt: Add -faddress-sanitizer option.
	* invoke.texi: Document the new flag.
	* passes.c: Add the asan pass.
	* toplev.c (compile_file): Call asan_finish_file.
	* asan.c: New file.
	* asan.h: New file.
	* tree-pass.h: Declare pass_asan.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192360 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.asan  |  12 ++
 gcc/Makefile.in     |   5 +
 gcc/asan.c          | 404 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 gcc/asan.h          |  26 ++++
 gcc/common.opt      |   4 +
 gcc/doc/invoke.texi |  13 +-
 gcc/passes.c        |   1 +
 gcc/toplev.c        |   5 +
 gcc/tree-pass.h     |   1 +
 9 files changed, 468 insertions(+), 3 deletions(-)
 create mode 100644 gcc/ChangeLog.asan
 create mode 100644 gcc/asan.c
 create mode 100644 gcc/asan.h

diff --git a/gcc/ChangeLog.asan b/gcc/ChangeLog.asan
new file mode 100644
index 0000000..704aa61
--- /dev/null
+++ b/gcc/ChangeLog.asan
@@ -0,0 +1,12 @@
+2012-10-10  Wei Mi <wmi@google.com>
+	    Diego Novillo <dnovillo@google.com>
+	    Dodji Seketeli <dodji@redhat.com>
+
+	* Makefile.in: Add asan.c and its dependencies.
+	* common.opt: Add -faddress-sanitizer option.
+	* invoke.texi: Document the new flag.
+	* passes.c: Add the asan pass.
+	* toplev.c (compile_file): Call asan_finish_file.
+	* asan.c: New file.
+	* asan.h: New file.
+	* tree-pass.h: Declare pass_asan.
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 24791a4..dde9b50 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1349,6 +1349,7 @@ OBJS = \
 	tracer.o \
 	trans-mem.o \
 	tree-affine.o \
+	asan.o \
 	tree-call-cdce.o \
 	tree-cfg.o \
 	tree-cfgcleanup.o \
@@ -2207,6 +2208,10 @@ stor-layout.o : stor-layout.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
    $(TREE_H) $(PARAMS_H) $(FLAGS_H) $(FUNCTION_H) $(EXPR_H) $(RTL_H) \
    $(GGC_H) $(TM_P_H) $(TARGET_H) langhooks.h $(REGS_H) gt-stor-layout.h \
    $(DIAGNOSTIC_CORE_H) $(CGRAPH_H) $(TREE_INLINE_H) $(TREE_DUMP_H) $(GIMPLE_H)
+asan.o : asan.c asan.h $(CONFIG_H) pointer-set.h \
+   $(SYSTEM_H) $(TREE_H) $(GIMPLE_H) \
+   output.h $(DIAGNOSTIC_H) coretypes.h $(TREE_DUMP_H) $(FLAGS_H) \
+   tree-pretty-print.h
 tree-ssa-tail-merge.o: tree-ssa-tail-merge.c \
    $(SYSTEM_H) $(CONFIG_H) coretypes.h $(TM_H) $(BITMAP_H) \
    $(FLAGS_H) $(TM_P_H) $(BASIC_BLOCK_H) \
diff --git a/gcc/asan.c b/gcc/asan.c
new file mode 100644
index 0000000..4b07c96
--- /dev/null
+++ b/gcc/asan.c
@@ -0,0 +1,404 @@
+/* AddressSanitizer, a fast memory error detector.
+   Copyright (C) 2011 Free Software Foundation, Inc.
+   Contributed by Kostya Serebryany <kcc@google.com>
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "tree.h"
+#include "tm_p.h"
+#include "basic-block.h"
+#include "flags.h"
+#include "function.h"
+#include "tree-inline.h"
+#include "gimple.h"
+#include "tree-iterator.h"
+#include "tree-flow.h"
+#include "tree-dump.h"
+#include "tree-pass.h"
+#include "diagnostic.h"
+#include "demangle.h"
+#include "langhooks.h"
+#include "ggc.h"
+#include "cgraph.h"
+#include "gimple.h"
+#include "asan.h"
+#include "gimple-pretty-print.h"
+
+/*
+ AddressSanitizer finds out-of-bounds and use-after-free bugs 
+ with <2x slowdown on average.
+
+ The tool consists of two parts:
+ instrumentation module (this file) and a run-time library.
+ The instrumentation module adds a run-time check before every memory insn.
+   For a 8- or 16- byte load accessing address X:
+     ShadowAddr = (X >> 3) + Offset
+     ShadowValue = *(char*)ShadowAddr;  // *(short*) for 16-byte access.
+     if (ShadowValue)
+       __asan_report_load8(X);
+   For a load of N bytes (N=1, 2 or 4) from address X:
+     ShadowAddr = (X >> 3) + Offset
+     ShadowValue = *(char*)ShadowAddr;
+     if (ShadowValue)
+       if ((X & 7) + N - 1 > ShadowValue)
+         __asan_report_loadN(X);
+ Stores are instrumented similarly, but using __asan_report_storeN functions.
+ A call too __asan_init() is inserted to the list of module CTORs.
+
+ The run-time library redefines malloc (so that redzone are inserted around
+ the allocated memory) and free (so that reuse of free-ed memory is delayed),
+ provides __asan_report* and __asan_init functions.
+
+ Read more:
+ http://code.google.com/p/address-sanitizer/wiki/AddressSanitizerAlgorithm
+
+ Future work:
+ The current implementation supports only detection of out-of-bounds and
+ use-after-free bugs in heap.
+ In order to support out-of-bounds for stack and globals we will need
+ to create redzones for stack and global object and poison them.
+*/
+
+/* The shadow address is computed as (X>>asan_scale) + (1<<asan_offset_log).
+ We may want to add command line flags to change these values.  */
+
+static const int asan_scale = 3;
+static const int asan_offset_log_32 = 29;
+static const int asan_offset_log_64 = 44;
+static int asan_offset_log;
+
+
+/* Construct a function tree for __asan_report_{load,store}{1,2,4,8,16}.
+   IS_STORE is either 1 (for a store) or 0 (for a load).
+   SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
+
+static tree
+report_error_func (int is_store, int size_in_bytes)
+{
+  tree fn_type;
+  tree def;
+  char name[100];
+
+  sprintf (name, "__asan_report_%s%d",
+           is_store ? "store" : "load", size_in_bytes);
+  fn_type = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE);
+  def = build_fn_decl (name, fn_type);
+  TREE_NOTHROW (def) = 1;
+  TREE_THIS_VOLATILE (def) = 1;  /* Attribute noreturn. Surprise!  */
+  DECL_ATTRIBUTES (def) = tree_cons (get_identifier ("leaf"), 
+                                     NULL, DECL_ATTRIBUTES (def));
+  DECL_ASSEMBLER_NAME (def);
+  return def;
+}
+
+/* Construct a function tree for __asan_init().  */
+
+static tree
+asan_init_func (void)
+{
+  tree fn_type;
+  tree def;
+
+  fn_type = build_function_type_list (void_type_node, NULL_TREE);
+  def = build_fn_decl ("__asan_init", fn_type);
+  TREE_NOTHROW (def) = 1;
+  DECL_ASSEMBLER_NAME (def);
+  return def;
+}
+
+
+/* Instrument the memory access instruction BASE.
+   Insert new statements before ITER.
+   LOCATION is source code location.
+   IS_STORE is either 1 (for a store) or 0 (for a load).
+   SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
+
+static void
+build_check_stmt (tree base,
+                  gimple_stmt_iterator *iter,
+                  location_t location, int is_store, int size_in_bytes)
+{
+  gimple_stmt_iterator gsi;
+  basic_block cond_bb, then_bb, join_bb;
+  edge e;
+  tree cond, t, u;
+  tree base_addr;
+  tree shadow_value;
+  gimple g;
+  gimple_seq seq, stmts;
+  tree shadow_type = size_in_bytes == 16 ?
+      short_integer_type_node : char_type_node;
+  tree shadow_ptr_type = build_pointer_type (shadow_type);
+  tree uintptr_type = lang_hooks.types.type_for_mode (ptr_mode,
+                                                      /*unsignedp=*/true);
+
+  /* We first need to split the current basic block, and start altering
+     the CFG.  This allows us to insert the statements we're about to
+     construct into the right basic blocks.  */
+
+  cond_bb = gimple_bb (gsi_stmt (*iter));
+  gsi = *iter;
+  gsi_prev (&gsi);
+  if (!gsi_end_p (gsi))
+    e = split_block (cond_bb, gsi_stmt (gsi));
+  else
+    e = split_block_after_labels (cond_bb);
+  cond_bb = e->src;
+  join_bb = e->dest;
+
+  /* A recap at this point: join_bb is the basic block at whose head
+     is the gimple statement for which this check expression is being
+     built.  cond_bb is the (possibly new, synthetic) basic block the
+     end of which will contain the shadow-memory check, and a
+     conditional that jumps to the error-reporting code or, much more
+     likely, over to join_bb.  */
+
+  /* Create the bb that contains the crash block.  */
+  then_bb = create_empty_bb (cond_bb);
+  make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE);
+  make_single_succ_edge (then_bb, join_bb, EDGE_FALLTHRU);
+
+  /* Mark the pseudo-fallthrough edge from cond_bb to join_bb.  */
+  e = find_edge (cond_bb, join_bb);
+  e->flags = EDGE_FALSE_VALUE;
+  e->count = cond_bb->count;
+  e->probability = REG_BR_PROB_BASE;
+
+  /* Update dominance info.  Note that join_bb's data was
+     updated by split_block.  */
+  if (dom_info_available_p (CDI_DOMINATORS))
+    {
+      set_immediate_dominator (CDI_DOMINATORS, then_bb, cond_bb);
+      set_immediate_dominator (CDI_DOMINATORS, join_bb, cond_bb);
+    }
+
+  base_addr = create_tmp_reg (uintptr_type, "__asan_base_addr");
+
+  seq = NULL; 
+  t = fold_convert_loc (location, uintptr_type,
+                        unshare_expr (base));
+  t = force_gimple_operand (t, &stmts, false, NULL_TREE);
+  gimple_seq_add_seq (&seq, stmts);
+  g = gimple_build_assign (base_addr, t);
+  gimple_set_location (g, location);
+  gimple_seq_add_stmt (&seq, g);
+
+  /* Build (base_addr >> asan_scale) + (1 << asan_offset_log).  */
+
+  t = build2 (RSHIFT_EXPR, uintptr_type, base_addr,
+              build_int_cst (uintptr_type, asan_scale));
+  t = build2 (PLUS_EXPR, uintptr_type, t,
+              build2 (LSHIFT_EXPR, uintptr_type,
+                      build_int_cst (uintptr_type, 1),
+                      build_int_cst (uintptr_type, asan_offset_log)
+                     ));
+  t = build1 (INDIRECT_REF, shadow_type,
+              build1 (VIEW_CONVERT_EXPR, shadow_ptr_type, t));
+  t = force_gimple_operand (t, &stmts, false, NULL_TREE);
+  gimple_seq_add_seq (&seq, stmts);
+  shadow_value = create_tmp_reg (shadow_type, "__asan_shadow");
+  g = gimple_build_assign (shadow_value, t);
+  gimple_set_location (g, location);
+  gimple_seq_add_stmt (&seq, g);
+  t = build2 (NE_EXPR, boolean_type_node, shadow_value,
+              build_int_cst (shadow_type, 0));
+  if (size_in_bytes < 8)
+    {
+
+      /* Slow path for 1-, 2- and 4- byte accesses.
+         Build ((base_addr & 7) + (size_in_bytes - 1)) >= shadow_value.  */
+
+      u = build2 (BIT_AND_EXPR, uintptr_type,
+                  base_addr,
+                  build_int_cst (uintptr_type, 7));
+      u = build1 (CONVERT_EXPR, shadow_type, u);
+      u = build2 (PLUS_EXPR, shadow_type, u,
+                  build_int_cst (shadow_type, size_in_bytes - 1));
+      u = build2 (GE_EXPR, boolean_type_node, u, shadow_value);
+    }
+  else
+      u = build_int_cst (boolean_type_node, 1);
+  t = build2 (TRUTH_AND_EXPR, boolean_type_node, t, u);
+  t = force_gimple_operand (t, &stmts, false, NULL_TREE);
+  gimple_seq_add_seq (&seq, stmts);
+  cond = create_tmp_reg (boolean_type_node, "__asan_crash_cond");
+  g = gimple_build_assign  (cond, t);
+  gimple_set_location (g, location);
+  gimple_seq_add_stmt (&seq, g);
+  g = gimple_build_cond (NE_EXPR, cond, boolean_false_node, NULL_TREE,
+                         NULL_TREE);
+  gimple_set_location (g, location);
+  gimple_seq_add_stmt (&seq, g);
+
+  /* Generate call to the run-time library (e.g. __asan_report_load8).  */
+
+  gsi = gsi_last_bb (cond_bb);
+  gsi_insert_seq_after (&gsi, seq, GSI_CONTINUE_LINKING);
+  seq = NULL; 
+  g = gimple_build_call (report_error_func (is_store, size_in_bytes),
+                         1, base_addr);
+  gimple_seq_add_stmt (&seq, g);
+
+  /* Insert the error-reporting call in the THEN block.  */
+
+  gsi = gsi_start_bb (then_bb);
+  gsi_insert_seq_after (&gsi, seq, GSI_CONTINUE_LINKING);
+
+  *iter = gsi_start_bb (join_bb);
+}
+
+/* If T represents a memory access, add instrumentation code before ITER.
+   LOCATION is source code location.
+   IS_STORE is either 1 (for a store) or 0 (for a load).  */
+
+static void
+instrument_derefs (gimple_stmt_iterator *iter, tree t,
+                  location_t location, int is_store)
+{
+  tree type, base;
+  int size_in_bytes;
+
+  type = TREE_TYPE (t);
+  if (type == error_mark_node)
+    return;
+  switch (TREE_CODE (t))
+    {
+    case ARRAY_REF:
+    case COMPONENT_REF:
+    case INDIRECT_REF:
+    case MEM_REF:
+      break;
+    default:
+      return;
+    }
+  size_in_bytes = tree_low_cst (TYPE_SIZE (type), 0) / BITS_PER_UNIT;
+  if (size_in_bytes != 1 && size_in_bytes != 2 &&
+      size_in_bytes != 4 && size_in_bytes != 8 && size_in_bytes != 16)
+      return;
+  {
+    /* For now just avoid instrumenting bit field accesses.
+     Fixing it is doable, but expected to be messy.  */
+
+    HOST_WIDE_INT bitsize, bitpos;
+    tree offset;
+    enum machine_mode mode;
+    int volatilep = 0, unsignedp = 0;
+    get_inner_reference (t, &bitsize, &bitpos, &offset,
+                         &mode, &unsignedp, &volatilep, false);
+    if (bitpos != 0 || bitsize != size_in_bytes * BITS_PER_UNIT)
+        return;
+  }
+
+  base = build_addr (t, current_function_decl);
+  build_check_stmt (base, iter, location, is_store, size_in_bytes);
+}
+
+/* asan: this looks too complex. Can this be done simpler? */
+/* Transform
+   1) Memory references.
+   2) BUILTIN_ALLOCA calls.
+*/
+
+static void
+transform_statements (void)
+{
+  basic_block bb;
+  gimple_stmt_iterator i;
+  int saved_last_basic_block = last_basic_block;
+  enum gimple_rhs_class grhs_class;
+
+  FOR_EACH_BB (bb)
+    {
+      if (bb->index >= saved_last_basic_block) continue;
+      for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
+        {
+          gimple s = gsi_stmt (i);
+          if (gimple_code (s) != GIMPLE_ASSIGN)
+              continue;
+          instrument_derefs (&i, gimple_assign_lhs (s),
+                             gimple_location (s), 1);
+          instrument_derefs (&i, gimple_assign_rhs1 (s),
+                             gimple_location (s), 0);
+          grhs_class = get_gimple_rhs_class (gimple_assign_rhs_code (s));
+          if (grhs_class == GIMPLE_BINARY_RHS)
+            instrument_derefs (&i, gimple_assign_rhs2 (s),
+                               gimple_location (s), 0);
+        }
+    }
+}
+
+/* Module-level instrumentation.
+   - Insert __asan_init() into the list of CTORs.
+   - TODO: insert redzones around globals.
+ */
+
+void
+asan_finish_file (void)
+{
+  tree ctor_statements = NULL_TREE;
+  append_to_statement_list (build_call_expr (asan_init_func (), 0),
+                            &ctor_statements);
+  cgraph_build_static_cdtor ('I', ctor_statements,
+                             MAX_RESERVED_INIT_PRIORITY - 1);
+}
+
+/* Instrument the current function.  */
+
+static unsigned int
+asan_instrument (void)
+{
+  struct gimplify_ctx gctx;
+  tree uintptr_type = lang_hooks.types.type_for_mode (ptr_mode, true);
+  int is_64 = tree_low_cst (TYPE_SIZE (uintptr_type), 0) == 64;
+  asan_offset_log = is_64 ? asan_offset_log_64 : asan_offset_log_32;
+  push_gimplify_context (&gctx);
+  transform_statements ();
+  pop_gimplify_context (NULL);
+  return 0;
+}
+
+static bool
+gate_asan (void)
+{
+  return flag_asan != 0;
+}
+
+struct gimple_opt_pass pass_asan =
+{
+ {
+  GIMPLE_PASS,
+  "asan",                               /* name  */
+  OPTGROUP_NONE,                        /* optinfo_flags */
+  gate_asan,                            /* gate  */
+  asan_instrument,                      /* execute  */
+  NULL,                                 /* sub  */
+  NULL,                                 /* next  */
+  0,                                    /* static_pass_number  */
+  TV_NONE,                              /* tv_id  */
+  PROP_ssa | PROP_cfg | PROP_gimple_leh,/* properties_required  */
+  0,                                    /* properties_provided  */
+  0,                                    /* properties_destroyed  */
+  0,                                    /* todo_flags_start  */
+  TODO_verify_flow | TODO_verify_stmts
+  | TODO_update_ssa    /* todo_flags_finish  */
+ }
+};
diff --git a/gcc/asan.h b/gcc/asan.h
new file mode 100644
index 0000000..590cf35
--- /dev/null
+++ b/gcc/asan.h
@@ -0,0 +1,26 @@
+/* AddressSanitizer, a fast memory error detector.
+   Copyright (C) 2011 Free Software Foundation, Inc.
+   Contributed by Kostya Serebryany <kcc@google.com>
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef TREE_ASAN
+#define TREE_ASAN
+
+extern void asan_finish_file(void);
+
+#endif /* TREE_ASAN */
diff --git a/gcc/common.opt b/gcc/common.opt
index f947a72..6088d1a 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -837,6 +837,10 @@ fargument-noalias-anything
 Common Ignore
 Does nothing. Preserved for backward compatibility.
 
+faddress-sanitizer
+Common Report Var(flag_asan)
+Enable AddressSanitizer, a memory error detector
+
 fasynchronous-unwind-tables
 Common Report Var(flag_asynchronous_unwind_tables) Optimization
 Generate unwind tables that are exact at each instruction boundary
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 715f60a..83af4d4 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -353,9 +353,10 @@ Objective-C and Objective-C++ Dialects}.
 @item Optimization Options
 @xref{Optimize Options,,Options that Control Optimization}.
 @gccoptlist{-falign-functions[=@var{n}] -falign-jumps[=@var{n}] @gol
--falign-labels[=@var{n}] -falign-loops[=@var{n}] -fassociative-math @gol
--fauto-inc-dec -fbranch-probabilities -fbranch-target-load-optimize @gol
--fbranch-target-load-optimize2 -fbtr-bb-exclusive -fcaller-saves @gol
+-falign-labels[=@var{n}] -falign-loops[=@var{n}] -faddress-sanitizer @gol
+-fassociative-math -fauto-inc-dec -fbranch-probabilities @gol
+-fbranch-target-load-optimize -fbranch-target-load-optimize2 @gol
+-fbtr-bb-exclusive -fcaller-saves @gol
 -fcheck-data-deps -fcombine-stack-adjustments -fconserve-stack @gol
 -fcompare-elim -fcprop-registers -fcrossjumping @gol
 -fcse-follow-jumps -fcse-skip-blocks -fcx-fortran-rules @gol
@@ -6834,6 +6835,12 @@ assumptions based on that.
 
 The default is @option{-fzero-initialized-in-bss}.
 
+@item -faddress-sanitizer
+Enable AddressSanitizer, a fast memory error detector.
+Memory access instructions will be instrumented to detect
+out-of-bounds and use-after-free bugs. So far only heap bugs will be detected.
+See @uref{http://code.google.com/p/address-sanitizer/} for more details.
+
 @item -fmudflap -fmudflapth -fmudflapir
 @opindex fmudflap
 @opindex fmudflapth
diff --git a/gcc/passes.c b/gcc/passes.c
index 67aae52..66a2f74 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -1456,6 +1456,7 @@ init_optimization_passes (void)
       NEXT_PASS (pass_split_crit_edges);
       NEXT_PASS (pass_pre);
       NEXT_PASS (pass_sink_code);
+      NEXT_PASS (pass_asan);
       NEXT_PASS (pass_tree_loop);
 	{
 	  struct opt_pass **p = &pass_tree_loop.pass.sub;
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 5cbb364..3ca0736 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -72,6 +72,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "value-prof.h"
 #include "alloc-pool.h"
 #include "tree-mudflap.h"
+#include "asan.h"
 #include "gimple.h"
 #include "tree-ssa-alias.h"
 #include "plugin.h"
@@ -570,6 +571,10 @@ compile_file (void)
       if (flag_mudflap)
 	mudflap_finish_file ();
 
+      /* File-scope initialization for AddressSanitizer.  */
+      if (flag_asan)
+        asan_finish_file ();
+
       output_shared_constant_pool ();
       output_object_blocks ();
       finish_tm_clone_pairs ();
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 09ec531..0e61856 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -259,6 +259,7 @@ struct register_pass_info
 
 extern struct gimple_opt_pass pass_mudflap_1;
 extern struct gimple_opt_pass pass_mudflap_2;
+extern struct gimple_opt_pass pass_asan;
 extern struct gimple_opt_pass pass_lower_cf;
 extern struct gimple_opt_pass pass_refactor_eh;
 extern struct gimple_opt_pass pass_lower_eh;
-- 
1.7.11.7


-- 
		Dodji


* Re: [PATCH 05/10] Implement protection of stack variables
  2012-11-06 17:22     ` Diego Novillo
@ 2012-11-12 11:31       ` Dodji Seketeli
  2012-11-12 11:51         ` Jakub Jelinek
  0 siblings, 1 reply; 80+ messages in thread
From: Dodji Seketeli @ 2012-11-12 11:31 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc-patches, jakub, wmi, davidxl, konstantin.s.serebryany

Diego Novillo <dnovillo@google.com> writes:

> I believe they layout the stack from right to left (top is to the
> right).  Feels like reading a middle earth map.  Kostya, is my
> recollection correct?

Yes, Konstantin replied to this already but I forgot to update the
patch cover letter (that I keep in the commit log of my commits)
accordingly.  It's now updated to link to the paper in which the stack
is laid out from right to left.

> This is a great summary.  Please put it at the top of asan.c or in
> some other prominent place.

OK, I have put it at the top of asan.c.
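
As a rough illustration of that summary, here is how the shadow bytes
for one 32-byte slot holding a variable followed by its partial red
zone could be computed.  This is a simplified sketch, not the code in
the patch: partial_slot_shadow and shadow_word_le are hypothetical
helper names, the slot is assumed to start 32-byte aligned, and the
word is assembled in little-endian order as on x86_64.

```c
#include <stdint.h>

#define SHADOW_GRANULE 8     /* 1 << ASAN_SHADOW_SHIFT */
#define MAGIC_PARTIAL  0xF4  /* ASAN_STACK_MAGIC_PARTIAL */

/* Fill the 4 shadow bytes covering one 32-byte slot whose first
   VAR_BYTES bytes (0 < VAR_BYTES < 32) hold a variable and whose
   remainder is partial red zone.  */
void
partial_slot_shadow (int var_bytes, unsigned char shadow[4])
{
  for (int i = 0; i < 4; i++)
    {
      int start = i * SHADOW_GRANULE;

      if (start + SHADOW_GRANULE <= var_bytes)
        shadow[i] = 0;                  /* Granule fully addressable.  */
      else if (start < var_bytes)
        shadow[i] = var_bytes - start;  /* Only the first K bytes are.  */
      else
        shadow[i] = MAGIC_PARTIAL;      /* Granule is pure red zone.  */
    }
}

/* Assemble the 4 shadow bytes into one 32-bit word, lowest address in
   the least significant byte.  */
uint32_t
shadow_word_le (const unsigned char shadow[4])
{
  uint32_t val = 0;

  for (int i = 0; i < 4; i++)
    val |= (uint32_t) shadow[i] << (8 * i);
  return val;
}
```

For an 8-byte variable this gives 0xF4F4F400 and for a 24-byte one
0xF4000000, matching the low halves of the values quoted in the
comment.  Under the same assumptions a variable of exactly 23 bytes
would give 0xF4070000, since its last granule is only partly
addressable.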

> 
> 
> > -	  offset = alloc_stack_frame_space (stack_vars[i].size, alignb);
> > +	  if (flag_asan && pred)
> > +	    {
> > +	      HOST_WIDE_INT prev_offset = frame_offset;
> > +	      tree repr_decl = NULL_TREE;
> > +
> > +	      offset
> > +		= alloc_stack_frame_space (stack_vars[i].size
> > +					   + ASAN_RED_ZONE_SIZE,
> > +					   MAX (alignb, ASAN_RED_ZONE_SIZE));
> > +	      VEC_safe_push (HOST_WIDE_INT, heap, data->asan_vec,
> > +			     prev_offset);
> > +	      VEC_safe_push (HOST_WIDE_INT, heap, data->asan_vec,
> > +			     offset + stack_vars[i].size);
> 
> Oh, gee, thanks.  More VEC() code for me to convert ;)

Sorry.

> The patch is OK.

Thanks.  Below is the modified patch I have in my tree.

	* Makefile.in (asan.o): Depend on $(EXPR_H) $(OPTABS_H).
	(cfgexpand.o): Depend on asan.h.
	* asan.c: Include expr.h and optabs.h.
	(asan_shadow_set): New variable.
	(asan_shadow_cst, asan_emit_stack_protection): New functions.
	(asan_init_shadow_ptr_types): Initialize also asan_shadow_set.
	* cfgexpand.c: Include asan.h.  Define HOST_WIDE_INT heap vector.
	(partition_stack_vars): If i is large alignment and j small
	alignment or vice versa, break out of the loop instead of continue,
	and put the test earlier.  If flag_asan, break out of the loop
	if for small alignment size is different.
	(struct stack_vars_data): New type.
	(expand_stack_vars): Add DATA argument.  Change PRED type to
	function taking size_t argument instead of tree.  Adjust pred
	calls.  Fill DATA in and add needed padding in between variables
	if -faddress-sanitizer.
	(defer_stack_allocation): Defer everything for flag_asan.
	(stack_protect_decl_phase_1, stack_protect_decl_phase_2): Take
	size_t index into stack_vars array instead of the decl directly.
	(asan_decl_phase_3): New function.
	(expand_used_vars): Return var destruction sequence.  Adjust
	expand_stack_vars calls, add another one for flag_asan.  Call
	asan_emit_stack_protection if expand_stack_vars added anything
	to the vectors.
	(expand_gimple_basic_block): Add disable_tail_calls argument.
	(gimple_expand_cfg): Pass true to it if expand_used_vars returned
	non-NULL.  Emit the sequence returned by expand_used_vars after
	return_label.
	* asan.h (asan_emit_stack_protection): New prototype.
	(asan_shadow_set): New decl.
	(ASAN_RED_ZONE_SIZE, ASAN_STACK_MAGIC_LEFT, ASAN_STACK_MAGIC_MIDDLE,
	ASAN_STACK_MAGIC_RIGHT, ASAN_STACK_FRAME_MAGIC): Define.
	(asan_protect_stack_decl): New inline.
	* toplev.c (process_options): Also disable -faddress-sanitizer on
	!FRAME_GROWS_DOWNWARDS targets.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192540 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.asan |  38 +++++++
 gcc/Makefile.in    |   4 +-
 gcc/asan.c         | 306 +++++++++++++++++++++++++++++++++++++++++++++++++++--
 gcc/asan.h         |  31 +++++-
 gcc/cfgexpand.c    | 165 +++++++++++++++++++++++++----
 gcc/toplev.c       |   4 +-
 6 files changed, 514 insertions(+), 34 deletions(-)

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index b9a1e74..7a0b074 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2211,7 +2211,7 @@ stor-layout.o : stor-layout.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
 asan.o : asan.c asan.h $(CONFIG_H) $(SYSTEM_H) $(GIMPLE_H) \
    output.h coretypes.h $(GIMPLE_PRETTY_PRINT_H) \
    tree-iterator.h $(TREE_FLOW_H) $(TREE_PASS_H) \
-   $(TARGET_H)
+   $(TARGET_H) $(EXPR_H) $(OPTABS_H)
 tree-ssa-tail-merge.o: tree-ssa-tail-merge.c \
    $(SYSTEM_H) $(CONFIG_H) coretypes.h $(TM_H) $(BITMAP_H) \
    $(FLAGS_H) $(TM_P_H) $(BASIC_BLOCK_H) \
@@ -3082,7 +3082,7 @@ cfgexpand.o : cfgexpand.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \
    $(DIAGNOSTIC_H) toplev.h $(DIAGNOSTIC_CORE_H) $(BASIC_BLOCK_H) $(FLAGS_H) debug.h $(PARAMS_H) \
    value-prof.h $(TREE_INLINE_H) $(TARGET_H) $(SSAEXPAND_H) $(REGS_H) \
    $(GIMPLE_PRETTY_PRINT_H) $(BITMAP_H) sbitmap.h \
-   $(INSN_ATTR_H) $(CFGLOOP_H)
+   $(INSN_ATTR_H) $(CFGLOOP_H) asan.h
 cfgrtl.o : cfgrtl.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_ERROR_H) \
    $(FLAGS_H) insn-config.h $(BASIC_BLOCK_H) $(REGS_H) hard-reg-set.h \
    $(FUNCTION_H) $(EXCEPT_H) $(TM_P_H) $(INSN_ATTR_H) \
diff --git a/gcc/asan.c b/gcc/asan.c
index 398b837..a3bb47b 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -29,6 +29,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "asan.h"
 #include "gimple-pretty-print.h"
 #include "target.h"
+#include "expr.h"
+#include "optabs.h"
 
 /*
  AddressSanitizer finds out-of-bounds and use-after-free bugs 
@@ -58,17 +60,303 @@ along with GCC; see the file COPYING3.  If not see
  Read more:
  http://code.google.com/p/address-sanitizer/wiki/AddressSanitizerAlgorithm
 
- Future work:
- The current implementation supports only detection of out-of-bounds and
- use-after-free bugs in heap.
- In order to support out-of-bounds for stack and globals we will need
- to create redzones for stack and global object and poison them.
-*/
+ The current implementation supports detection of out-of-bounds and
+ use-after-free in the heap, on the stack and for global variables.
+
+ [Protection of stack variables]
+
+ To understand how detection of out-of-bounds and use-after-free works
+ for stack variables, let's look at this example on x86_64 where the
+ stack grows downward:
+
+     int
+     foo ()
+     {
+       char a[23] = {0};
+       int b[2] = {0};
+
+       a[5] = 1;
+       b[1] = 2;
+
+       return a[5] + b[1];
+     }
+
+ For this function, the stack protected by asan will be organized as
+ follows, from the top of the stack to the bottom:
+
+ Slot 1/ [red zone of 32 bytes called 'RIGHT RedZone']
+
+ Slot 2/ [24 bytes for variable 'a']
+
+ Slot 3/ [8 bytes of red zone, that adds up to the space of 'a' to make
+	  the next slot be 32 bytes aligned; this one is called Partial
+	  Redzone; this 32 bytes alignment is an asan constraint]
+
+ Slot 4/ [red zone of 32 bytes called 'Middle RedZone']
+
+ Slot 5/ [8 bytes for variable 'b']
+
+ Slot 6/ [24 bytes of Partial Red Zone (similar to slot 3)]
+
+ Slot 7/ [32 bytes of Red Zone at the bottom of the stack, called 'LEFT
+	  RedZone']
+
+ The 32 bytes of LEFT red zone at the bottom of the stack can be
+ decomposed as such:
+
+     1/ The first 8 bytes contain a magical asan number that is always
+     0x41B58AB3.
+
+     2/ The following 8 bytes contain a pointer to a string (to be
+     parsed at runtime by the runtime asan library), whose format is
+     the following:
+
+      "<function-name> <space> <num-of-variables-on-the-stack>
+      (<32-bytes-aligned-offset-in-bytes-of-variable> <space>
+      <length-of-var-in-bytes> ){n} "
+
+	where '(...){n}' means the content inside the parentheses occurs 'n'
+	times, with 'n' being the number of variables on the stack.
+
+      3/ The following 16 bytes of the red zone have no particular
+      format.
+
+ The shadow memory for that stack layout is going to look like this:
+
+     - content of shadow memory 8 bytes for slot 7: 0xFFFFFFFFF1F1F1F1.
+       The F1 byte pattern is a magic number called
+       ASAN_STACK_MAGIC_LEFT and is a way for the runtime to know that
+       the memory for that shadow byte is part of the LEFT red zone
+       intended to sit at the bottom of the variables on the stack.
+
+     - content of shadow memory 8 bytes for slots 6 and 5:
+       0xFFFFFFFFF4F4F400.  The F4 byte pattern is a magic number
+       called ASAN_STACK_MAGIC_PARTIAL.  It flags the fact that the
+       memory region for this shadow byte is a PARTIAL red zone
+       intended to pad a variable A, so that the slot following
+       {A,padding} is 32 bytes aligned.
+
+       Note that the least significant byte of this shadow memory
+       content being 00 means that the 8 bytes of its corresponding
+       memory (which corresponds to the memory of variable 'b') are
+       addressable.
+
+     - content of shadow memory 8 bytes for slot 4: 0xFFFFFFFFF2F2F2F2.
+       The F2 byte pattern is a magic number called
+       ASAN_STACK_MAGIC_MIDDLE.  It flags the fact that the memory
+       region for this shadow byte is a MIDDLE red zone intended to
+       sit between two 32-byte aligned slots of {variable,padding}.
+
+     - content of shadow memory 8 bytes for slot 3 and 2:
+       0xFFFFFFFFF4000000.  This represents the concatenation of
+       variable 'a' and the partial red zone following it, like what we
+       had for variable 'b'.  The least significant 3 bytes being 00
+       means that the 24 bytes reserved for variable 'a' are addressable.
+
+     - content of shadow memory 8 bytes for slot 1: 0xFFFFFFFFF3F3F3F3.
+       The F3 byte pattern is a magic number called
+       ASAN_STACK_MAGIC_RIGHT.  It flags the fact that the memory
+       region for this shadow byte is a RIGHT red zone intended to sit
+       at the top of the variables of the stack.
+
+ Note that the real variable layout is done in expand_used_vars in
+ cfgexpand.c.  As far as Address Sanitizer is concerned, it lays out
+ stack variables as well as the different red zones, emits some
+ prologue code to populate the shadow memory so as to poison (mark as
+ non-accessible) the regions of the red zones and mark the regions of
+ stack variables as accessible, and emits some epilogue code to
+ un-poison (mark as accessible) the regions of red zones right before
+ the function exits.  */
+
+alias_set_type asan_shadow_set = -1;
 
 /* Pointer types to 1 resp. 2 byte integers in shadow memory.  A separate
    alias set is used for all shadow memory accesses.  */
 static GTY(()) tree shadow_ptr_types[2];
 
+/* Return a CONST_INT representing 4 subsequent shadow memory bytes.  */
+
+static rtx
+asan_shadow_cst (unsigned char shadow_bytes[4])
+{
+  int i;
+  unsigned HOST_WIDE_INT val = 0;
+  gcc_assert (WORDS_BIG_ENDIAN == BYTES_BIG_ENDIAN);
+  for (i = 0; i < 4; i++)
+    val |= (unsigned HOST_WIDE_INT) shadow_bytes[BYTES_BIG_ENDIAN ? 3 - i : i]
+	   << (BITS_PER_UNIT * i);
+  return GEN_INT (trunc_int_for_mode (val, SImode));
+}
+
+/* Insert code to protect stack vars.  The prologue sequence should be emitted
+   directly, epilogue sequence returned.  BASE is the register holding the
+   stack base, to which the OFFSETS array offsets are relative; the OFFSETS
+   array contains pairs of offsets in reverse order, always the end offset
+   of some gap that needs protection followed by starting offset,
+   and DECLS is an array of representative decls for each var partition.
+   LENGTH is the length of the OFFSETS array, DECLS array is LENGTH / 2 - 1
+   elements long (OFFSETS include gap before the first variable as well
+   as gaps after each stack variable).  */
+
+rtx
+asan_emit_stack_protection (rtx base, HOST_WIDE_INT *offsets, tree *decls,
+			    int length)
+{
+  rtx shadow_base, shadow_mem, ret, mem;
+  unsigned char shadow_bytes[4];
+  HOST_WIDE_INT base_offset = offsets[length - 1], offset, prev_offset;
+  HOST_WIDE_INT last_offset, last_size;
+  int l;
+  unsigned char cur_shadow_byte = ASAN_STACK_MAGIC_LEFT;
+  static pretty_printer pp;
+  static bool pp_initialized;
+  const char *buf;
+  size_t len;
+  tree str_cst;
+
+  /* First of all, prepare the description string.  */
+  if (!pp_initialized)
+    {
+      pp_construct (&pp, /* prefix */NULL, /* line-width */0);
+      pp_initialized = true;
+    }
+  pp_clear_output_area (&pp);
+  if (DECL_NAME (current_function_decl))
+    pp_base_tree_identifier (&pp, DECL_NAME (current_function_decl));
+  else
+    pp_string (&pp, "<unknown>");
+  pp_space (&pp);
+  pp_decimal_int (&pp, length / 2 - 1);
+  pp_space (&pp);
+  for (l = length - 2; l; l -= 2)
+    {
+      tree decl = decls[l / 2 - 1];
+      pp_wide_integer (&pp, offsets[l] - base_offset);
+      pp_space (&pp);
+      pp_wide_integer (&pp, offsets[l - 1] - offsets[l]);
+      pp_space (&pp);
+      if (DECL_P (decl) && DECL_NAME (decl))
+	{
+	  pp_decimal_int (&pp, IDENTIFIER_LENGTH (DECL_NAME (decl)));
+	  pp_space (&pp);
+	  pp_base_tree_identifier (&pp, DECL_NAME (decl));
+	}
+      else
+	pp_string (&pp, "9 <unknown>");
+      pp_space (&pp);
+    }
+  buf = pp_base_formatted_text (&pp);
+  len = strlen (buf);
+  str_cst = build_string (len + 1, buf);
+  TREE_TYPE (str_cst)
+    = build_array_type (char_type_node, build_index_type (size_int (len)));
+  TREE_READONLY (str_cst) = 1;
+  TREE_STATIC (str_cst) = 1;
+  str_cst = build1 (ADDR_EXPR, build_pointer_type (char_type_node), str_cst);
+
+  /* Emit the prologue sequence.  */
+  base = expand_binop (Pmode, add_optab, base, GEN_INT (base_offset),
+		       NULL_RTX, 1, OPTAB_DIRECT);
+  mem = gen_rtx_MEM (ptr_mode, base);
+  emit_move_insn (mem, GEN_INT (ASAN_STACK_FRAME_MAGIC));
+  mem = adjust_address (mem, VOIDmode, GET_MODE_SIZE (ptr_mode));
+  emit_move_insn (mem, expand_normal (str_cst));
+  shadow_base = expand_binop (Pmode, lshr_optab, base,
+			      GEN_INT (ASAN_SHADOW_SHIFT),
+			      NULL_RTX, 1, OPTAB_DIRECT);
+  shadow_base = expand_binop (Pmode, add_optab, shadow_base,
+			      GEN_INT (targetm.asan_shadow_offset ()),
+			      NULL_RTX, 1, OPTAB_DIRECT);
+  gcc_assert (asan_shadow_set != -1
+	      && (ASAN_RED_ZONE_SIZE >> ASAN_SHADOW_SHIFT) == 4);
+  shadow_mem = gen_rtx_MEM (SImode, shadow_base);
+  set_mem_alias_set (shadow_mem, asan_shadow_set);
+  prev_offset = base_offset;
+  for (l = length; l; l -= 2)
+    {
+      if (l == 2)
+	cur_shadow_byte = ASAN_STACK_MAGIC_RIGHT;
+      offset = offsets[l - 1];
+      if ((offset - base_offset) & (ASAN_RED_ZONE_SIZE - 1))
+	{
+	  int i;
+	  HOST_WIDE_INT aoff
+	    = base_offset + ((offset - base_offset)
+			     & ~(ASAN_RED_ZONE_SIZE - HOST_WIDE_INT_1));
+	  shadow_mem = adjust_address (shadow_mem, VOIDmode,
+				       (aoff - prev_offset)
+				       >> ASAN_SHADOW_SHIFT);
+	  prev_offset = aoff;
+	  for (i = 0; i < 4; i++, aoff += (1 << ASAN_SHADOW_SHIFT))
+	    if (aoff < offset)
+	      {
+		if (aoff < offset - (1 << ASAN_SHADOW_SHIFT) + 1)
+		  shadow_bytes[i] = 0;
+		else
+		  shadow_bytes[i] = offset - aoff;
+	      }
+	    else
+	      shadow_bytes[i] = ASAN_STACK_MAGIC_PARTIAL;
+	  emit_move_insn (shadow_mem, asan_shadow_cst (shadow_bytes));
+	  offset = aoff;
+	}
+      while (offset <= offsets[l - 2] - ASAN_RED_ZONE_SIZE)
+	{
+	  shadow_mem = adjust_address (shadow_mem, VOIDmode,
+				       (offset - prev_offset)
+				       >> ASAN_SHADOW_SHIFT);
+	  prev_offset = offset;
+	  memset (shadow_bytes, cur_shadow_byte, 4);
+	  emit_move_insn (shadow_mem, asan_shadow_cst (shadow_bytes));
+	  offset += ASAN_RED_ZONE_SIZE;
+	}
+      cur_shadow_byte = ASAN_STACK_MAGIC_MIDDLE;
+    }
+  do_pending_stack_adjust ();
+
+  /* Construct epilogue sequence.  */
+  start_sequence ();
+
+  shadow_mem = gen_rtx_MEM (BLKmode, shadow_base);
+  set_mem_alias_set (shadow_mem, asan_shadow_set);
+  prev_offset = base_offset;
+  last_offset = base_offset;
+  last_size = 0;
+  for (l = length; l; l -= 2)
+    {
+      offset = base_offset + ((offsets[l - 1] - base_offset)
+			     & ~(ASAN_RED_ZONE_SIZE - HOST_WIDE_INT_1));
+      if (last_offset + last_size != offset)
+	{
+	  shadow_mem = adjust_address (shadow_mem, VOIDmode,
+				       (last_offset - prev_offset)
+				       >> ASAN_SHADOW_SHIFT);
+	  prev_offset = last_offset;
+	  clear_storage (shadow_mem, GEN_INT (last_size >> ASAN_SHADOW_SHIFT),
+			 BLOCK_OP_NORMAL);
+	  last_offset = offset;
+	  last_size = 0;
+	}
+      last_size += base_offset + ((offsets[l - 2] - base_offset)
+				  & ~(ASAN_RED_ZONE_SIZE - HOST_WIDE_INT_1))
+		   - offset;
+    }
+  if (last_size)
+    {
+      shadow_mem = adjust_address (shadow_mem, VOIDmode,
+				   (last_offset - prev_offset)
+				   >> ASAN_SHADOW_SHIFT);
+      clear_storage (shadow_mem, GEN_INT (last_size >> ASAN_SHADOW_SHIFT),
+		     BLOCK_OP_NORMAL);
+    }
+
+  do_pending_stack_adjust ();
+
+  ret = get_insns ();
+  end_sequence ();
+  return ret;
+}
+
 /* Construct a function tree for __asan_report_{load,store}{1,2,4,8,16}.
    IS_STORE is either 1 (for a store) or 0 (for a load).
    SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
@@ -389,12 +677,12 @@ asan_finish_file (void)
 static void
 asan_init_shadow_ptr_types (void)
 {
-  alias_set_type set = new_alias_set ();
+  asan_shadow_set = new_alias_set ();
   shadow_ptr_types[0] = build_distinct_type_copy (signed_char_type_node);
-  TYPE_ALIAS_SET (shadow_ptr_types[0]) = set;
+  TYPE_ALIAS_SET (shadow_ptr_types[0]) = asan_shadow_set;
   shadow_ptr_types[0] = build_pointer_type (shadow_ptr_types[0]);
   shadow_ptr_types[1] = build_distinct_type_copy (short_integer_type_node);
-  TYPE_ALIAS_SET (shadow_ptr_types[1]) = set;
+  TYPE_ALIAS_SET (shadow_ptr_types[1]) = asan_shadow_set;
   shadow_ptr_types[1] = build_pointer_type (shadow_ptr_types[1]);
 }
 
diff --git a/gcc/asan.h b/gcc/asan.h
index 0d9ab8b..6f0edbf 100644
--- a/gcc/asan.h
+++ b/gcc/asan.h
@@ -21,10 +21,39 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef TREE_ASAN
 #define TREE_ASAN
 
-extern void asan_finish_file(void);
+extern void asan_finish_file (void);
+extern rtx asan_emit_stack_protection (rtx, HOST_WIDE_INT *, tree *, int);
+
+/* Alias set for accessing the shadow memory.  */
+extern alias_set_type asan_shadow_set;
 
 /* Shadow memory is found at
    (address >> ASAN_SHADOW_SHIFT) + targetm.asan_shadow_offset ().  */
 #define ASAN_SHADOW_SHIFT	3
 
+/* Red zone size, stack and global variables are padded by ASAN_RED_ZONE_SIZE
+   up to 2 * ASAN_RED_ZONE_SIZE - 1 bytes.  */
+#define ASAN_RED_ZONE_SIZE	32
+
+/* Shadow memory values for stack protection.  Left is below protected vars,
+   the first pointer in stack corresponding to that offset contains
+   ASAN_STACK_FRAME_MAGIC word, the second pointer to a string describing
+   the frame.  Middle is for padding in between variables, right is
+   above the last protected variable and partial immediately after variables
+   up to ASAN_RED_ZONE_SIZE alignment.  */
+#define ASAN_STACK_MAGIC_LEFT		0xf1
+#define ASAN_STACK_MAGIC_MIDDLE		0xf2
+#define ASAN_STACK_MAGIC_RIGHT		0xf3
+#define ASAN_STACK_MAGIC_PARTIAL	0xf4
+
+#define ASAN_STACK_FRAME_MAGIC	0x41b58ab3
+
+/* Return true if DECL should be guarded on the stack.  */
+
+static inline bool
+asan_protect_stack_decl (tree decl)
+{
+  return DECL_P (decl) && !DECL_ARTIFICIAL (decl);
+}
+
 #endif /* TREE_ASAN */
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index e501b4b..16fd0fb 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -47,6 +47,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfgloop.h"
 #include "regs.h" /* For reg_renumber.  */
 #include "insn-attr.h" /* For INSN_SCHEDULING.  */
+#include "asan.h"
 
 /* This variable holds information helping the rewriting of SSA trees
    into RTL.  */
@@ -736,6 +737,7 @@ partition_stack_vars (void)
     {
       size_t i = stack_vars_sorted[si];
       unsigned int ialign = stack_vars[i].alignb;
+      HOST_WIDE_INT isize = stack_vars[i].size;
 
       /* Ignore objects that aren't partition representatives. If we
          see a var that is not a partition representative, it must
@@ -747,19 +749,28 @@ partition_stack_vars (void)
 	{
 	  size_t j = stack_vars_sorted[sj];
 	  unsigned int jalign = stack_vars[j].alignb;
+	  HOST_WIDE_INT jsize = stack_vars[j].size;
 
 	  /* Ignore objects that aren't partition representatives.  */
 	  if (stack_vars[j].representative != j)
 	    continue;
 
-	  /* Ignore conflicting objects.  */
-	  if (stack_var_conflict_p (i, j))
-	    continue;
-
 	  /* Do not mix objects of "small" (supported) alignment
 	     and "large" (unsupported) alignment.  */
 	  if ((ialign * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT)
 	      != (jalign * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT))
+	    break;
+
+	  /* For Address Sanitizer do not mix objects with different
+	     sizes, as the shorter vars wouldn't be adequately protected.
+	     Don't do that for "large" (unsupported) alignment objects,
+	     those aren't protected anyway.  */
+	  if (flag_asan && isize != jsize
+	      && ialign * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT)
+	    break;
+
+	  /* Ignore conflicting objects.  */
+	  if (stack_var_conflict_p (i, j))
 	    continue;
 
 	  /* UNION the objects, placing J at OFFSET.  */
@@ -837,12 +848,26 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
   set_rtl (decl, x);
 }
 
+DEF_VEC_I(HOST_WIDE_INT);
+DEF_VEC_ALLOC_I(HOST_WIDE_INT,heap);
+
+struct stack_vars_data
+{
+  /* Vector of offset pairs, always end of some padding followed
+     by start of the padding that needs Address Sanitizer protection.
+     The vector is in reverse order; the highest offset pairs come first.  */
+  VEC(HOST_WIDE_INT, heap) *asan_vec;
+
+  /* Vector of partition representative decls in between the paddings.  */
+  VEC(tree, heap) *asan_decl_vec;
+};
+
 /* A subroutine of expand_used_vars.  Give each partition representative
    a unique location within the stack frame.  Update each partition member
    with that location.  */
 
 static void
-expand_stack_vars (bool (*pred) (tree))
+expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
 {
   size_t si, i, j, n = stack_vars_num;
   HOST_WIDE_INT large_size = 0, large_alloc = 0;
@@ -913,13 +938,45 @@ expand_stack_vars (bool (*pred) (tree))
 
       /* Check the predicate to see whether this variable should be
 	 allocated in this pass.  */
-      if (pred && !pred (decl))
+      if (pred && !pred (i))
 	continue;
 
       alignb = stack_vars[i].alignb;
       if (alignb * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT)
 	{
-	  offset = alloc_stack_frame_space (stack_vars[i].size, alignb);
+	  if (flag_asan && pred)
+	    {
+	      HOST_WIDE_INT prev_offset = frame_offset;
+	      tree repr_decl = NULL_TREE;
+
+	      offset
+		= alloc_stack_frame_space (stack_vars[i].size
+					   + ASAN_RED_ZONE_SIZE,
+					   MAX (alignb, ASAN_RED_ZONE_SIZE));
+	      VEC_safe_push (HOST_WIDE_INT, heap, data->asan_vec,
+			     prev_offset);
+	      VEC_safe_push (HOST_WIDE_INT, heap, data->asan_vec,
+			     offset + stack_vars[i].size);
+	      /* Find best representative of the partition.
+		 Prefer those with DECL_NAME, even better
+		 satisfying asan_protect_stack_decl predicate.  */
+	      for (j = i; j != EOC; j = stack_vars[j].next)
+		if (asan_protect_stack_decl (stack_vars[j].decl)
+		    && DECL_NAME (stack_vars[j].decl))
+		  {
+		    repr_decl = stack_vars[j].decl;
+		    break;
+		  }
+		else if (repr_decl == NULL_TREE
+			 && DECL_P (stack_vars[j].decl)
+			 && DECL_NAME (stack_vars[j].decl))
+		  repr_decl = stack_vars[j].decl;
+	      if (repr_decl == NULL_TREE)
+		repr_decl = stack_vars[i].decl;
+	      VEC_safe_push (tree, heap, data->asan_decl_vec, repr_decl);
+	    }
+	  else
+	    offset = alloc_stack_frame_space (stack_vars[i].size, alignb);
 	  base = virtual_stack_vars_rtx;
 	  base_align = crtl->max_used_stack_slot_alignment;
 	}
@@ -1057,8 +1114,9 @@ static bool
 defer_stack_allocation (tree var, bool toplevel)
 {
   /* If stack protection is enabled, *all* stack variables must be deferred,
-     so that we can re-order the strings to the top of the frame.  */
-  if (flag_stack_protect)
+     so that we can re-order the strings to the top of the frame.
+     Similarly for Address Sanitizer.  */
+  if (flag_stack_protect || flag_asan)
     return true;
 
   /* We handle "large" alignment via dynamic allocation.  We want to handle
@@ -1329,15 +1387,31 @@ stack_protect_decl_phase (tree decl)
    as callbacks for expand_stack_vars.  */
 
 static bool
-stack_protect_decl_phase_1 (tree decl)
+stack_protect_decl_phase_1 (size_t i)
+{
+  return stack_protect_decl_phase (stack_vars[i].decl) == 1;
+}
+
+static bool
+stack_protect_decl_phase_2 (size_t i)
 {
-  return stack_protect_decl_phase (decl) == 1;
+  return stack_protect_decl_phase (stack_vars[i].decl) == 2;
 }
 
+/* A helper function that checks for the asan phase (with stack protector
+   it is phase 3).  This is used as callback for expand_stack_vars.
+   Returns true if any of the vars in the partition need to be protected.  */
+
 static bool
-stack_protect_decl_phase_2 (tree decl)
+asan_decl_phase_3 (size_t i)
 {
-  return stack_protect_decl_phase (decl) == 2;
+  while (i != EOC)
+    {
+      if (asan_protect_stack_decl (stack_vars[i].decl))
+	return true;
+      i = stack_vars[i].next;
+    }
+  return false;
 }
 
 /* Ensure that variables in different stack protection phases conflict
@@ -1448,11 +1522,12 @@ estimated_stack_frame_size (struct cgraph_node *node)
 
 /* Expand all variables used in the function.  */
 
-static void
+static rtx
 expand_used_vars (void)
 {
   tree var, outer_block = DECL_INITIAL (current_function_decl);
   VEC(tree,heap) *maybe_local_decls = NULL;
+  rtx var_end_seq = NULL_RTX;
   struct pointer_map_t *ssa_name_decls;
   unsigned i;
   unsigned len;
@@ -1603,6 +1678,11 @@ expand_used_vars (void)
   /* Assign rtl to each variable based on these partitions.  */
   if (stack_vars_num > 0)
     {
+      struct stack_vars_data data;
+
+      data.asan_vec = NULL;
+      data.asan_decl_vec = NULL;
+
       /* Reorder decls to be protected by iterating over the variables
 	 array multiple times, and allocating out of each phase in turn.  */
       /* ??? We could probably integrate this into the qsort we did
@@ -1611,14 +1691,41 @@ expand_used_vars (void)
       if (has_protected_decls)
 	{
 	  /* Phase 1 contains only character arrays.  */
-	  expand_stack_vars (stack_protect_decl_phase_1);
+	  expand_stack_vars (stack_protect_decl_phase_1, &data);
 
 	  /* Phase 2 contains other kinds of arrays.  */
 	  if (flag_stack_protect == 2)
-	    expand_stack_vars (stack_protect_decl_phase_2);
+	    expand_stack_vars (stack_protect_decl_phase_2, &data);
 	}
 
-      expand_stack_vars (NULL);
+      if (flag_asan)
+	/* Phase 3, any partitions that need asan protection
+	   in addition to phase 1 and 2.  */
+	expand_stack_vars (asan_decl_phase_3, &data);
+
+      if (!VEC_empty (HOST_WIDE_INT, data.asan_vec))
+	{
+	  HOST_WIDE_INT prev_offset = frame_offset;
+	  HOST_WIDE_INT offset
+	    = alloc_stack_frame_space (ASAN_RED_ZONE_SIZE,
+				       ASAN_RED_ZONE_SIZE);
+	  VEC_safe_push (HOST_WIDE_INT, heap, data.asan_vec, prev_offset);
+	  VEC_safe_push (HOST_WIDE_INT, heap, data.asan_vec, offset);
+
+	  var_end_seq
+	    = asan_emit_stack_protection (virtual_stack_vars_rtx,
+					  VEC_address (HOST_WIDE_INT,
+						       data.asan_vec),
+					  VEC_address (tree,
+						       data.asan_decl_vec),
+					  VEC_length (HOST_WIDE_INT,
+						      data.asan_vec));
+	}
+
+      expand_stack_vars (NULL, &data);
+
+      VEC_free (HOST_WIDE_INT, heap, data.asan_vec);
+      VEC_free (tree, heap, data.asan_decl_vec);
     }
 
   fini_vars_expansion ();
@@ -1645,6 +1752,8 @@ expand_used_vars (void)
 	frame_offset += align - 1;
       frame_offset &= -align;
     }
+
+  return var_end_seq;
 }
 
 
@@ -3662,7 +3771,7 @@ expand_debug_locations (void)
 /* Expand basic block BB from GIMPLE trees to RTL.  */
 
 static basic_block
-expand_gimple_basic_block (basic_block bb)
+expand_gimple_basic_block (basic_block bb, bool disable_tail_calls)
 {
   gimple_stmt_iterator gsi;
   gimple_seq stmts;
@@ -3950,6 +4059,11 @@ expand_gimple_basic_block (basic_block bb)
 	}
       else
 	{
+	  if (is_gimple_call (stmt)
+	      && gimple_call_tail_p (stmt)
+	      && disable_tail_calls)
+	    gimple_call_set_tail (stmt, false);
+
 	  if (is_gimple_call (stmt) && gimple_call_tail_p (stmt))
 	    {
 	      bool can_fallthru;
@@ -4309,7 +4423,7 @@ gimple_expand_cfg (void)
   sbitmap blocks;
   edge_iterator ei;
   edge e;
-  rtx var_seq;
+  rtx var_seq, var_ret_seq;
   unsigned i;
 
   timevar_push (TV_OUT_OF_SSA);
@@ -4369,7 +4483,7 @@ gimple_expand_cfg (void)
   timevar_push (TV_VAR_EXPAND);
   start_sequence ();
 
-  expand_used_vars ();
+  var_ret_seq = expand_used_vars ();
 
   var_seq = get_insns ();
   end_sequence ();
@@ -4495,7 +4609,7 @@ gimple_expand_cfg (void)
 
   lab_rtx_for_bb = pointer_map_create ();
   FOR_BB_BETWEEN (bb, init_block->next_bb, EXIT_BLOCK_PTR, next_bb)
-    bb = expand_gimple_basic_block (bb);
+    bb = expand_gimple_basic_block (bb, var_ret_seq != NULL_RTX);
 
   if (MAY_HAVE_DEBUG_INSNS)
     expand_debug_locations ();
@@ -4523,6 +4637,15 @@ gimple_expand_cfg (void)
   construct_exit_block ();
   insn_locations_finalize ();
 
+  if (var_ret_seq)
+    {
+      rtx after = return_label;
+      rtx next = NEXT_INSN (after);
+      if (next && NOTE_INSN_BASIC_BLOCK_P (next))
+	after = next;
+      emit_insn_after (var_ret_seq, after);
+    }
+
   /* Zap the tree EH table.  */
   set_eh_throw_stmt_table (cfun, NULL);
 
diff --git a/gcc/toplev.c b/gcc/toplev.c
index d9dfb2a..8911ca3 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1542,7 +1542,9 @@ process_options (void)
     }
 
   /* Address Sanitizer needs porting to each target architecture.  */
-  if (flag_asan && targetm.asan_shadow_offset == NULL)
+  if (flag_asan
+      && (targetm.asan_shadow_offset == NULL
+	  || !FRAME_GROWS_DOWNWARD))
     {
       warning (0, "-faddress-sanitizer not supported for this target");
       flag_asan = 0;
-- 
1.7.11.7


-- 
		Dodji


* Re: [PATCH 06/10] Implement protection of global variables
  2012-11-06 17:27     ` Diego Novillo
@ 2012-11-12 11:32       ` Dodji Seketeli
  0 siblings, 0 replies; 80+ messages in thread
From: Dodji Seketeli @ 2012-11-12 11:32 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc-patches, jakub, wmi, davidxl, konstantin.s.serebryany

Diego Novillo <dnovillo@google.com> writes:

> On 2012-11-02 16:01 , Dodji Seketeli wrote:
> 
> > 	* varasm.c: Include asan.h.
> > 	(assemble_noswitch_variable): Grow size by asan_red_zone_size
> > 	if decl is asan protected.
> > 	(place_block_symbol): Likewise.
> > 	(assemble_variable): If decl is asan protected, increase
> > 	DECL_ALIGN if needed, and for decls emitted using
> > 	assemble_variable_contents append padding zeros after it.
> > 	* Makefile.in (varasm.o): Depend on asan.h.
> > 	* asan.c: Include output.h.
> > 	(asan_pp, asan_pp_initialized, asan_ctor_statements): New variables.
> > 	(asan_pp_initialize, asan_pp_string): New functions.
> > 	(asan_emit_stack_protection): Use asan_pp{,_initialized}
> > 	instead of local pp{,_initialized} vars, use asan_pp_initialize
> > 	and asan_pp_string helpers.
> > 	(asan_needs_local_alias, asan_protect_global,
> > 	asan_global_struct, asan_add_global): New functions.
> > 	(asan_finish_file): Protect global vars that can be protected. Use
> > 	asan_ctor_statements instead of ctor_statements
> > 	* asan.h (asan_protect_global): New prototype.
> > 	(asan_red_zone_size): New inline function.
> 
> OK.

Thanks.

> Please, also put the high-level description in asan.c's documentation.

Done.  Below is the updated patch.

This patch implements the protection of global variables.  See the
overview comment near the top of asan.c, which this patch extends.
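
For illustration only (this is not part of the patch), here is a small
standalone sketch of the padding arithmetic used for protected globals.
red_zone_size repeats the logic of asan_red_zone_size from asan.h;
RED_ZONE_SIZE stands in for ASAN_RED_ZONE_SIZE, and size_with_redzone
and check_red_zone_invariants are hypothetical helpers added here to
show what ends up in the __size_with_redzone field.

```c
#include <assert.h>

/* Stand-in for ASAN_RED_ZONE_SIZE from asan.h.  */
#define RED_ZONE_SIZE 32u

/* Same arithmetic as asan_red_zone_size in the patch: pad to the next
   32-byte boundary, then make sure at least 32 bytes of red zone
   follow the variable.  */
static unsigned int
red_zone_size (unsigned int size)
{
  unsigned int c = size & (RED_ZONE_SIZE - 1);
  return c ? 2 * RED_ZONE_SIZE - c : RED_ZONE_SIZE;
}

/* Hypothetical helper: the value recorded as __size_with_redzone.  */
static unsigned int
size_with_redzone (unsigned int size)
{
  return size + red_zone_size (size);
}

/* Hypothetical self-check: for every small size the red zone is
   between 32 and 63 bytes and the padded size is 32-byte aligned.  */
static void
check_red_zone_invariants (void)
{
  unsigned int size;

  for (size = 0; size < 4096; size++)
    {
      unsigned int rz = red_zone_size (size);
      assert (rz >= RED_ZONE_SIZE && rz < 2 * RED_ZONE_SIZE);
      assert (size_with_redzone (size) % RED_ZONE_SIZE == 0);
    }
}
```

With these definitions a 1-byte global gets 63 bytes of padding and a
32-byte global gets exactly 32, so both occupy 64 bytes in total.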

	* varasm.c: Include asan.h.
	(assemble_noswitch_variable): Grow size by asan_red_zone_size
	if decl is asan protected.
	(place_block_symbol): Likewise.
	(assemble_variable): If decl is asan protected, increase
	DECL_ALIGN if needed, and for decls emitted using
	assemble_variable_contents append padding zeros after it.
	* Makefile.in (varasm.o): Depend on asan.h.
	* asan.c: Include output.h.
	(asan_pp, asan_pp_initialized, asan_ctor_statements): New variables.
	(asan_pp_initialize, asan_pp_string): New functions.
	(asan_emit_stack_protection): Use asan_pp{,_initialized}
	instead of local pp{,_initialized} vars, use asan_pp_initialize
	and asan_pp_string helpers.
	(asan_needs_local_alias, asan_protect_global,
	asan_global_struct, asan_add_global): New functions.
	(asan_finish_file): Protect global vars that can be protected.  Use
	asan_ctor_statements instead of ctor_statements.
	* asan.h (asan_protect_global): New prototype.
	(asan_red_zone_size): New inline function.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192541 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.asan |  24 ++++
 gcc/Makefile.in    |   2 +-
 gcc/asan.c         | 344 +++++++++++++++++++++++++++++++++++++++++++++++------
 gcc/asan.h         |  11 ++
 gcc/varasm.c       |  22 ++++
 5 files changed, 365 insertions(+), 38 deletions(-)

 	* Makefile.in (asan.o): Depend on $(EXPR_H) $(OPTABS_H).
 	(cfgexpand.o): Depend on asan.h.
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 7a0b074..83a424e 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2719,7 +2719,7 @@ varasm.o : varasm.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \
    output.h $(DIAGNOSTIC_CORE_H) xcoffout.h debug.h $(GGC_H) $(TM_P_H) \
    $(HASHTAB_H) $(TARGET_H) langhooks.h gt-varasm.h $(BASIC_BLOCK_H) \
    $(CGRAPH_H) $(TARGET_DEF_H) tree-mudflap.h \
-   pointer-set.h $(COMMON_TARGET_H)
+   pointer-set.h $(COMMON_TARGET_H) asan.h
 function.o : function.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_ERROR_H) \
    $(TREE_H) $(GIMPLE_H) $(FLAGS_H) $(FUNCTION_H) $(EXPR_H) \
    $(OPTABS_H) $(LIBFUNCS_H) $(REGS_H) hard-reg-set.h insn-config.h $(RECOG_H) \
diff --git a/gcc/asan.c b/gcc/asan.c
index a3bb47b..c88f59d 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -31,6 +31,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "target.h"
 #include "expr.h"
 #include "optabs.h"
+#include "output.h"
 
 /*
  AddressSanitizer finds out-of-bounds and use-after-free bugs 
@@ -166,7 +167,43 @@ along with GCC; see the file COPYING3.  If not see
  non-accessible) the regions of the red zones and mark the regions of
  stack variables as accessible, and emit some epilogue code to
  un-poison (mark as accessible) the regions of red zones right before
- the function exits.  */
+ the function exits.
+
+ [Protection of global variables]
+
+ The basic idea is to insert a red zone between two global variables
+ and install a constructor function that calls the asan runtime to do
+ the populating of the relevant shadow memory regions at load time.
+
+ The global variables are laid out so that a red zone separates
+ them.  The red zones are sized so that each variable starts on a
+ 32-byte boundary.
+
+ Then a constructor function is installed so that, for each global
+ variable, it calls the runtime asan library function
+ __asan_register_globals with an instance of this type:
+
+     struct __asan_global
+     {
+       // Address of the beginning of the global variable.
+       const void *__beg;
+
+       // Initial size of the global variable.
+       uptr __size;
+
+       // Size of the global variable + size of the red zone.  This
+       //   size is 32 bytes aligned.
+       uptr __size_with_redzone;
+
+       // Name of the global variable.
+       const void *__name;
+
+       // This is always set to NULL for now.
+       uptr __has_dynamic_init;
+     }
+
+ A destructor function that calls the runtime asan library function
+ __asan_unregister_globals is also installed.  */
 
 alias_set_type asan_shadow_set = -1;
 
@@ -174,6 +211,34 @@ alias_set_type asan_shadow_set = -1;
    alias set is used for all shadow memory accesses.  */
 static GTY(()) tree shadow_ptr_types[2];
 
+/* Asan pretty-printer, used for building of the description STRING_CSTs.  */
+static pretty_printer asan_pp;
+static bool asan_pp_initialized;
+
+/* Initialize asan_pp.  */
+
+static void
+asan_pp_initialize (void)
+{
+  pp_construct (&asan_pp, /* prefix */NULL, /* line-width */0);
+  asan_pp_initialized = true;
+}
+
+/* Create ADDR_EXPR of STRING_CST with asan_pp text.  */
+
+static tree
+asan_pp_string (void)
+{
+  const char *buf = pp_base_formatted_text (&asan_pp);
+  size_t len = strlen (buf);
+  tree ret = build_string (len + 1, buf);
+  TREE_TYPE (ret)
+    = build_array_type (char_type_node, build_index_type (size_int (len)));
+  TREE_READONLY (ret) = 1;
+  TREE_STATIC (ret) = 1;
+  return build1 (ADDR_EXPR, build_pointer_type (char_type_node), ret);
+}
+
 /* Return a CONST_INT representing 4 subsequent shadow memory bytes.  */
 
 static rtx
@@ -208,51 +273,38 @@ asan_emit_stack_protection (rtx base, HOST_WIDE_INT *offsets, tree *decls,
   HOST_WIDE_INT last_offset, last_size;
   int l;
   unsigned char cur_shadow_byte = ASAN_STACK_MAGIC_LEFT;
-  static pretty_printer pp;
-  static bool pp_initialized;
-  const char *buf;
-  size_t len;
   tree str_cst;
 
   /* First of all, prepare the description string.  */
-  if (!pp_initialized)
-    {
-      pp_construct (&pp, /* prefix */NULL, /* line-width */0);
-      pp_initialized = true;
-    }
-  pp_clear_output_area (&pp);
+  if (!asan_pp_initialized)
+    asan_pp_initialize ();
+
+  pp_clear_output_area (&asan_pp);
   if (DECL_NAME (current_function_decl))
-    pp_base_tree_identifier (&pp, DECL_NAME (current_function_decl));
+    pp_base_tree_identifier (&asan_pp, DECL_NAME (current_function_decl));
   else
-    pp_string (&pp, "<unknown>");
-  pp_space (&pp);
-  pp_decimal_int (&pp, length / 2 - 1);
-  pp_space (&pp);
+    pp_string (&asan_pp, "<unknown>");
+  pp_space (&asan_pp);
+  pp_decimal_int (&asan_pp, length / 2 - 1);
+  pp_space (&asan_pp);
   for (l = length - 2; l; l -= 2)
     {
       tree decl = decls[l / 2 - 1];
-      pp_wide_integer (&pp, offsets[l] - base_offset);
-      pp_space (&pp);
-      pp_wide_integer (&pp, offsets[l - 1] - offsets[l]);
-      pp_space (&pp);
+      pp_wide_integer (&asan_pp, offsets[l] - base_offset);
+      pp_space (&asan_pp);
+      pp_wide_integer (&asan_pp, offsets[l - 1] - offsets[l]);
+      pp_space (&asan_pp);
       if (DECL_P (decl) && DECL_NAME (decl))
 	{
-	  pp_decimal_int (&pp, IDENTIFIER_LENGTH (DECL_NAME (decl)));
-	  pp_space (&pp);
-	  pp_base_tree_identifier (&pp, DECL_NAME (decl));
+	  pp_decimal_int (&asan_pp, IDENTIFIER_LENGTH (DECL_NAME (decl)));
+	  pp_space (&asan_pp);
+	  pp_base_tree_identifier (&asan_pp, DECL_NAME (decl));
 	}
       else
-	pp_string (&pp, "9 <unknown>");
-      pp_space (&pp);
+	pp_string (&asan_pp, "9 <unknown>");
+      pp_space (&asan_pp);
     }
-  buf = pp_base_formatted_text (&pp);
-  len = strlen (buf);
-  str_cst = build_string (len + 1, buf);
-  TREE_TYPE (str_cst)
-    = build_array_type (char_type_node, build_index_type (size_int (len)));
-  TREE_READONLY (str_cst) = 1;
-  TREE_STATIC (str_cst) = 1;
-  str_cst = build1 (ADDR_EXPR, build_pointer_type (char_type_node), str_cst);
+  str_cst = asan_pp_string ();
 
   /* Emit the prologue sequence.  */
   base = expand_binop (Pmode, add_optab, base, GEN_INT (base_offset),
@@ -357,6 +409,75 @@ asan_emit_stack_protection (rtx base, HOST_WIDE_INT *offsets, tree *decls,
   return ret;
 }
 
+/* Return true if DECL, a global var, might be overridden and needs
+   therefore a local alias.  */
+
+static bool
+asan_needs_local_alias (tree decl)
+{
+  return DECL_WEAK (decl) || !targetm.binds_local_p (decl);
+}
+
+/* Return true if DECL is a VAR_DECL that should be protected
+   by Address Sanitizer, by appending a red zone with protected
+   shadow memory after it and aligning it to at least
+   ASAN_RED_ZONE_SIZE bytes.  */
+
+bool
+asan_protect_global (tree decl)
+{
+  rtx rtl, symbol;
+  section *sect;
+
+  if (TREE_CODE (decl) != VAR_DECL
+      /* TLS vars aren't statically protectable.  */
+      || DECL_THREAD_LOCAL_P (decl)
+      /* Externs will be protected elsewhere.  */
+      || DECL_EXTERNAL (decl)
+      || !TREE_ASM_WRITTEN (decl)
+      || !DECL_RTL_SET_P (decl)
+      /* Comdat vars pose an ABI problem, we can't know if
+	 the var that is selected by the linker will have
+	 padding or not.  */
+      || DECL_ONE_ONLY (decl)
+      /* Similarly for common vars.  People can use -fno-common.  */
+      || DECL_COMMON (decl)
+      /* Don't protect if using user section, often vars placed
+	 into user section from multiple TUs are then assumed
+	 to be an array of such vars, putting padding in there
+	 breaks this assumption.  */
+      || (DECL_SECTION_NAME (decl) != NULL_TREE
+	  && !DECL_HAS_IMPLICIT_SECTION_NAME_P (decl))
+      || DECL_SIZE (decl) == 0
+      || ASAN_RED_ZONE_SIZE * BITS_PER_UNIT > MAX_OFILE_ALIGNMENT
+      || !valid_constant_size_p (DECL_SIZE_UNIT (decl))
+      || DECL_ALIGN_UNIT (decl) > 2 * ASAN_RED_ZONE_SIZE)
+    return false;
+
+  rtl = DECL_RTL (decl);
+  if (!MEM_P (rtl) || GET_CODE (XEXP (rtl, 0)) != SYMBOL_REF)
+    return false;
+  symbol = XEXP (rtl, 0);
+
+  if (CONSTANT_POOL_ADDRESS_P (symbol)
+      || TREE_CONSTANT_POOL_ADDRESS_P (symbol))
+    return false;
+
+  sect = get_variable_section (decl, false);
+  if (sect->common.flags & SECTION_COMMON)
+    return false;
+
+  if (lookup_attribute ("weakref", DECL_ATTRIBUTES (decl)))
+    return false;
+
+#ifndef ASM_OUTPUT_DEF
+  if (asan_needs_local_alias (decl))
+    return false;
+#endif
+
+  return true;
+}
+
 /* Construct a function tree for __asan_report_{load,store}{1,2,4,8,16}.
    IS_STORE is either 1 (for a store) or 0 (for a load).
    SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
@@ -657,6 +778,105 @@ transform_statements (void)
     }
 }
 
+/* Build
+   struct __asan_global
+   {
+     const void *__beg;
+     uptr __size;
+     uptr __size_with_redzone;
+     const void *__name;
+     uptr __has_dynamic_init;
+   } type.  */
+
+static tree
+asan_global_struct (void)
+{
+  static const char *field_names[5]
+    = { "__beg", "__size", "__size_with_redzone",
+	"__name", "__has_dynamic_init" };
+  tree fields[5], ret;
+  int i;
+
+  ret = make_node (RECORD_TYPE);
+  for (i = 0; i < 5; i++)
+    {
+      fields[i]
+	= build_decl (UNKNOWN_LOCATION, FIELD_DECL,
+		      get_identifier (field_names[i]),
+		      (i == 0 || i == 3) ? const_ptr_type_node
+		      : build_nonstandard_integer_type (POINTER_SIZE, 1));
+      DECL_CONTEXT (fields[i]) = ret;
+      if (i)
+	DECL_CHAIN (fields[i - 1]) = fields[i];
+    }
+  TYPE_FIELDS (ret) = fields[0];
+  TYPE_NAME (ret) = get_identifier ("__asan_global");
+  layout_type (ret);
+  return ret;
+}
+
+/* Append description of a single global DECL into vector V.
+   TYPE is __asan_global struct type as returned by asan_global_struct.  */
+
+static void
+asan_add_global (tree decl, tree type, VEC(constructor_elt, gc) *v)
+{
+  tree init, uptr = TREE_TYPE (DECL_CHAIN (TYPE_FIELDS (type)));
+  unsigned HOST_WIDE_INT size;
+  tree str_cst, refdecl = decl;
+  VEC(constructor_elt, gc) *vinner = NULL;
+
+  if (!asan_pp_initialized)
+    asan_pp_initialize ();
+
+  pp_clear_output_area (&asan_pp);
+  if (DECL_NAME (decl))
+    pp_base_tree_identifier (&asan_pp, DECL_NAME (decl));
+  else
+    pp_string (&asan_pp, "<unknown>");
+  pp_space (&asan_pp);
+  pp_left_paren (&asan_pp);
+  pp_string (&asan_pp, main_input_filename);
+  pp_right_paren (&asan_pp);
+  str_cst = asan_pp_string ();
+
+  if (asan_needs_local_alias (decl))
+    {
+      char buf[20];
+      ASM_GENERATE_INTERNAL_LABEL (buf, "LASAN",
+				   VEC_length (constructor_elt, v) + 1);
+      refdecl = build_decl (DECL_SOURCE_LOCATION (decl),
+			    VAR_DECL, get_identifier (buf), TREE_TYPE (decl));
+      TREE_ADDRESSABLE (refdecl) = TREE_ADDRESSABLE (decl);
+      TREE_READONLY (refdecl) = TREE_READONLY (decl);
+      TREE_THIS_VOLATILE (refdecl) = TREE_THIS_VOLATILE (decl);
+      DECL_GIMPLE_REG_P (refdecl) = DECL_GIMPLE_REG_P (decl);
+      DECL_ARTIFICIAL (refdecl) = DECL_ARTIFICIAL (decl);
+      DECL_IGNORED_P (refdecl) = DECL_IGNORED_P (decl);
+      TREE_STATIC (refdecl) = 1;
+      TREE_PUBLIC (refdecl) = 0;
+      TREE_USED (refdecl) = 1;
+      assemble_alias (refdecl, DECL_ASSEMBLER_NAME (decl));
+    }
+
+  CONSTRUCTOR_APPEND_ELT (vinner, NULL_TREE,
+			  fold_convert (const_ptr_type_node,
+					build_fold_addr_expr (refdecl)));
+  size = tree_low_cst (DECL_SIZE_UNIT (decl), 1);
+  CONSTRUCTOR_APPEND_ELT (vinner, NULL_TREE, build_int_cst (uptr, size));
+  size += asan_red_zone_size (size);
+  CONSTRUCTOR_APPEND_ELT (vinner, NULL_TREE, build_int_cst (uptr, size));
+  CONSTRUCTOR_APPEND_ELT (vinner, NULL_TREE,
+			  fold_convert (const_ptr_type_node, str_cst));
+  CONSTRUCTOR_APPEND_ELT (vinner, NULL_TREE, build_int_cst (uptr, 0));
+  init = build_constructor (type, vinner);
+  CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, init);
+}
+
+/* Needs to be GTY(()), because cgraph_build_static_cdtor may
+   invoke ggc_collect.  */
+static GTY(()) tree asan_ctor_statements;
+
 /* Module-level instrumentation.
    - Insert __asan_init() into the list of CTORs.
    - TODO: insert redzones around globals.
@@ -665,11 +885,61 @@ transform_statements (void)
 void
 asan_finish_file (void)
 {
-  tree ctor_statements = NULL_TREE;
+  struct varpool_node *vnode;
+  unsigned HOST_WIDE_INT gcount = 0;
+
   append_to_statement_list (build_call_expr (asan_init_func (), 0),
-                            &ctor_statements);
-  cgraph_build_static_cdtor ('I', ctor_statements,
-                             MAX_RESERVED_INIT_PRIORITY - 1);
+			    &asan_ctor_statements);
+  FOR_EACH_DEFINED_VARIABLE (vnode)
+    if (asan_protect_global (vnode->symbol.decl))
+      ++gcount;
+  if (gcount)
+    {
+      tree type = asan_global_struct (), var, ctor, decl;
+      tree uptr = build_nonstandard_integer_type (POINTER_SIZE, 1);
+      tree dtor_statements = NULL_TREE;
+      VEC(constructor_elt, gc) *v;
+      char buf[20];
+
+      type = build_array_type_nelts (type, gcount);
+      ASM_GENERATE_INTERNAL_LABEL (buf, "LASAN", 0);
+      var = build_decl (UNKNOWN_LOCATION, VAR_DECL, get_identifier (buf),
+			type);
+      TREE_STATIC (var) = 1;
+      TREE_PUBLIC (var) = 0;
+      DECL_ARTIFICIAL (var) = 1;
+      DECL_IGNORED_P (var) = 1;
+      v = VEC_alloc (constructor_elt, gc, gcount);
+      FOR_EACH_DEFINED_VARIABLE (vnode)
+	if (asan_protect_global (vnode->symbol.decl))
+	  asan_add_global (vnode->symbol.decl, TREE_TYPE (type), v);
+      ctor = build_constructor (type, v);
+      TREE_CONSTANT (ctor) = 1;
+      TREE_STATIC (ctor) = 1;
+      DECL_INITIAL (var) = ctor;
+      varpool_assemble_decl (varpool_node_for_decl (var));
+
+      type = build_function_type_list (void_type_node,
+				       build_pointer_type (TREE_TYPE (type)),
+				       uptr, NULL_TREE);
+      decl = build_fn_decl ("__asan_register_globals", type);
+      TREE_NOTHROW (decl) = 1;
+      append_to_statement_list (build_call_expr (decl, 2,
+						 build_fold_addr_expr (var),
+						 build_int_cst (uptr, gcount)),
+				&asan_ctor_statements);
+
+      decl = build_fn_decl ("__asan_unregister_globals", type);
+      TREE_NOTHROW (decl) = 1;
+      append_to_statement_list (build_call_expr (decl, 2,
+						 build_fold_addr_expr (var),
+						 build_int_cst (uptr, gcount)),
+				&dtor_statements);
+      cgraph_build_static_cdtor ('D', dtor_statements,
+				 MAX_RESERVED_INIT_PRIORITY - 1);
+    }
+  cgraph_build_static_cdtor ('I', asan_ctor_statements,
+			     MAX_RESERVED_INIT_PRIORITY - 1);
 }
 
 /* Initialize shadow_ptr_types array.  */
diff --git a/gcc/asan.h b/gcc/asan.h
index 6f0edbf..d9368a8 100644
--- a/gcc/asan.h
+++ b/gcc/asan.h
@@ -23,6 +23,7 @@ along with GCC; see the file COPYING3.  If not see
 
 extern void asan_finish_file (void);
 extern rtx asan_emit_stack_protection (rtx, HOST_WIDE_INT *, tree *, int);
+extern bool asan_protect_global (tree);
 
 /* Alias set for accessing the shadow memory.  */
 extern alias_set_type asan_shadow_set;
@@ -56,4 +57,14 @@ asan_protect_stack_decl (tree decl)
   return DECL_P (decl) && !DECL_ARTIFICIAL (decl);
 }
 
+/* Return the size of padding needed to insert after a protected
+   decl of SIZE.  */
+
+static inline unsigned int
+asan_red_zone_size (unsigned int size)
+{
+  unsigned int c = size & (ASAN_RED_ZONE_SIZE - 1);
+  return c ? 2 * ASAN_RED_ZONE_SIZE - c : ASAN_RED_ZONE_SIZE;
+}
+
 #endif /* TREE_ASAN */
diff --git a/gcc/varasm.c b/gcc/varasm.c
index b300348..641ce0c 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -51,6 +51,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-mudflap.h"
 #include "cgraph.h"
 #include "pointer-set.h"
+#include "asan.h"
 
 #ifdef XCOFF_DEBUGGING_INFO
 #include "xcoffout.h"		/* Needed for external data
@@ -1831,6 +1832,9 @@ assemble_noswitch_variable (tree decl, const char *name, section *sect)
   size = tree_low_cst (DECL_SIZE_UNIT (decl), 1);
   rounded = size;
 
+  if (flag_asan && asan_protect_global (decl))
+    size += asan_red_zone_size (size);
+
   /* Don't allocate zero bytes of common,
      since that means "undefined external" in the linker.  */
   if (size == 0)
@@ -1897,6 +1901,7 @@ assemble_variable (tree decl, int top_level ATTRIBUTE_UNUSED,
   const char *name;
   rtx decl_rtl, symbol;
   section *sect;
+  bool asan_protected = false;
 
   /* This function is supposed to handle VARIABLES.  Ensure we have one.  */
   gcc_assert (TREE_CODE (decl) == VAR_DECL);
@@ -1984,6 +1989,15 @@ assemble_variable (tree decl, int top_level ATTRIBUTE_UNUSED,
   /* Compute the alignment of this data.  */
 
   align_variable (decl, dont_output_data);
+
+  if (flag_asan
+      && asan_protect_global (decl))
+    {
+      asan_protected = true;
+      DECL_ALIGN (decl) = MAX (DECL_ALIGN (decl), 
+                               ASAN_RED_ZONE_SIZE * BITS_PER_UNIT);
+    }
+
   set_mem_align (decl_rtl, DECL_ALIGN (decl));
 
   if (TREE_PUBLIC (decl))
@@ -2022,6 +2036,12 @@ assemble_variable (tree decl, int top_level ATTRIBUTE_UNUSED,
       if (DECL_ALIGN (decl) > BITS_PER_UNIT)
 	ASM_OUTPUT_ALIGN (asm_out_file, floor_log2 (DECL_ALIGN_UNIT (decl)));
       assemble_variable_contents (decl, name, dont_output_data);
+      if (asan_protected)
+	{
+	  unsigned HOST_WIDE_INT size
+	    = tree_low_cst (DECL_SIZE_UNIT (decl), 1);
+	  assemble_zeros (asan_red_zone_size (size));
+	}
     }
 }
 
@@ -6926,6 +6946,8 @@ place_block_symbol (rtx symbol)
       decl = SYMBOL_REF_DECL (symbol);
       alignment = DECL_ALIGN (decl);
       size = tree_low_cst (DECL_SIZE_UNIT (decl), 1);
+      if (flag_asan && asan_protect_global (decl))
+	size += asan_red_zone_size (size);
     }
 
   /* Calculate the object's offset from the start of the block.  */
-- 
1.7.11.7
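
For reference, the padding arithmetic of asan_red_zone_size in the patch
above can be checked in isolation.  What follows is a minimal standalone
sketch, not part of the patch.  It assumes a red zone granule of 32
bytes (the definition of ASAN_RED_ZONE_SIZE is not shown in these
hunks), and the toy_* names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Assumption: the red zone granule is 32 bytes.  The hunks above do
   not show the real definition of ASAN_RED_ZONE_SIZE.  */
#define TOY_RED_ZONE_SIZE 32u

/* Same arithmetic as asan_red_zone_size in the patch: pad SIZE up to
   the next granule boundary, then add one full granule, unless SIZE
   is already a multiple of the granule.  */
static unsigned int
toy_red_zone_size (unsigned int size)
{
  unsigned int c = size & (TOY_RED_ZONE_SIZE - 1);
  return c ? 2 * TOY_RED_ZONE_SIZE - c : TOY_RED_ZONE_SIZE;
}

/* Hypothetical helper: object size plus its trailing red zone.  */
static unsigned int
toy_padded_size (unsigned int size)
{
  /* Guard against unsigned wraparound in the sum below.  */
  assert (size <= UINT_MAX - 2 * TOY_RED_ZONE_SIZE);
  return size + toy_red_zone_size (size);
}
```

Under that assumption, a 40-byte object gets a 56-byte red zone (96
bytes in total) and a 32-byte object gets exactly 32 bytes, so the
padded size is always a multiple of the granule and the red zone is
never smaller than one granule.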


-- 
		Dodji


* Re: [PATCH 09/10] Instrument built-in memory access function calls
  2012-11-06 17:37     ` Diego Novillo
@ 2012-11-12 11:40       ` Dodji Seketeli
  0 siblings, 0 replies; 80+ messages in thread
From: Dodji Seketeli @ 2012-11-12 11:40 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc-patches, jakub, wmi, davidxl, konstantin.s.serebryany

Diego Novillo <dnovillo@google.com> writes:

> On 2012-11-02 16:05 , Dodji Seketeli wrote:
>
>> +static bool
>> +maybe_instrument_builtin_call (gimple_stmt_iterator *iter)
>> +{
>> +  gimple call = gsi_stmt (*iter);
>> +  location_t loc = gimple_location (call);
>> +
>> +  if (!is_gimple_call (call))
>> +    return false;
>
> Nit.  Why not factor this out and change the caller to:
>
> if (is_builtin_call (stmt))
>    instrument_builtin_call (stmt);
>
> I don't much like functions that do many combined things.

OK, I have done that in the first patch below.

The second patch applies on top of the first one, and is an update in
reply to the thread started by Tobias, whose title is:

    09-nov. [Tobias Burnus     ] [asan] Patch - fix an ICE in asan.c

It's a patch that Jakub posted in that sub-thread that I have rebased
(and slightly changed some comments) on top of the first patch.

I think both patches should be squashed to make just one patch.

gcc/
	* gimple.h (is_gimple_builtin_call): Declare ...
	* gimple.c (is_gimple_builtin_call): ... New public function.
	* asan.c (insert_if_then_before_iter, instrument_mem_region_access,
	instrument_strlen_call, instrument_builtin_call,
	maybe_instrument_call): New static functions.
	(create_cond_insert_point): Renamed
	create_cond_insert_point_before_iter into this.  Add a new
	parameter to decide whether to insert the condition before or
	after the statement iterator.
	(build_check_stmt): Adjust for the new create_cond_insert_point.
	Add a new parameter to decide whether to add the instrumentation
	code before or after the statement iterator.
	(instrument_assignment): Factorize from ...
	(transform_statements): ... here.  Use maybe_instrument_call to
	instrument builtin function calls as well.
	(instrument_derefs): Adjust for the new parameter of
	build_check_stmt.  Fix detection of bit-field access.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/asan@192845 138bc75d-0d04-0410-961f-82ee72b054a4

fixup! Instrument built-in memory access function calls
---
 gcc/ChangeLog.asan |  20 ++
 gcc/asan.c         | 604 ++++++++++++++++++++++++++++++++++++++++++++++++++---
 gcc/gimple.c       |  16 ++
 gcc/gimple.h       |   3 +
 4 files changed, 614 insertions(+), 29 deletions(-)
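
As a reading aid for instrument_mem_region_access in the patch below:
the generated instrumentation does not check every byte of a
memcpy-style access.  Going by the comments in that function, it checks
the first byte and the byte at base + len - 1, and skips both checks
when len is zero.  Here is a minimal model of that strategy in plain C,
outside the compiler.  The toy_poisoned array is only a stand-in for
shadow memory, and every name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for shadow memory: one flag per byte of a small
   address space.  Real asan shadow memory is more compact; this is
   only an illustration.  */
#define TOY_MEM_SIZE 64
static bool toy_poisoned[TOY_MEM_SIZE];

/* Return true if the single byte at ADDR may be accessed.  */
static bool
toy_byte_ok (size_t addr)
{
  return addr < TOY_MEM_SIZE && !toy_poisoned[addr];
}

/* Model of the checks emitted for a region access of LEN bytes
   starting at BASE: nothing for an empty region, otherwise the first
   and the last byte only.  */
static bool
toy_region_ok (size_t base, size_t len)
{
  if (len == 0)
    return true;
  /* Guard against wraparound when computing the last byte.  */
  assert (base <= SIZE_MAX - len);
  return toy_byte_ok (base) && toy_byte_ok (base + len - 1);
}
```

With bytes 16..23 poisoned, toy_region_ok (8, 8) holds and
toy_region_ok (8, 9) fails on its last byte.  A region whose two
endpoints are valid but which spans the poisoned bytes, such as
toy_region_ok (8, 20), is not caught in this model; that is the
trade-off of checking only the endpoints.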


 	* asan.c (create_cond_insert_point_before_iter): Factorize out of ...
 	(build_check_stmt): ... here.
 
diff --git a/gcc/asan.c b/gcc/asan.c
index 527405b..ef855fb 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -521,9 +521,9 @@ asan_init_func (void)
 #define PROB_ALWAYS		(REG_BR_PROB_BASE)
 
 /* Split the current basic block and create a condition statement
-   insertion point right before the statement pointed to by ITER.
-   Return an iterator to the point at which the caller might safely
-   insert the condition statement.
+   insertion point right before or after the statement pointed to by
+   ITER.  Return an iterator to the point at which the caller might
+   safely insert the condition statement.
 
    THEN_BLOCK must be set to the address of an uninitialized instance
    of basic_block.  The function will then set *THEN_BLOCK to the
@@ -537,18 +537,21 @@ asan_init_func (void)
    statements starting from *ITER, and *THEN_BLOCK is a new empty
    block.
 
-   *ITER is adjusted to still point to the same statement it was
-   *pointing to initially.  */
+   *ITER is adjusted to always point to the first statement of the
+    basic block *FALLTHROUGH_BLOCK.  That statement is the
+    same as what ITER was pointing to prior to calling this function,
+    if BEFORE_P is true; otherwise, it is its following statement.  */
 
 static gimple_stmt_iterator
-create_cond_insert_point_before_iter (gimple_stmt_iterator *iter,
-				      bool then_more_likely_p,
-				      basic_block *then_block,
-				      basic_block *fallthrough_block)
+create_cond_insert_point (gimple_stmt_iterator *iter,
+			  bool before_p,
+			  bool then_more_likely_p,
+			  basic_block *then_block,
+			  basic_block *fallthrough_block)
 {
   gimple_stmt_iterator gsi = *iter;
 
-  if (!gsi_end_p (gsi))
+  if (!gsi_end_p (gsi) && before_p)
     gsi_prev (&gsi);
 
   basic_block cur_bb = gsi_bb (*iter);
@@ -589,18 +592,58 @@ create_cond_insert_point_before_iter (gimple_stmt_iterator *iter,
   return gsi_last_bb (cond_bb);
 }
 
+/* Insert an if condition followed by a 'then block' right before the
+   statement pointed to by ITER.  The fallthrough block -- which is the
+   else block of the condition as well as the destination of the
+   outgoing edge of the 'then block' -- starts with the statement
+   pointed to by ITER.
+
+   COND is the condition of the if.  
+
+   If THEN_MORE_LIKELY_P is true, the probability of the edge to the
+   'then block' is higher than the probability of the edge to the
+   fallthrough block.
+
+   Upon completion of the function, *THEN_BB is set to the newly
+   inserted 'then block' and similarly, *FALLTHROUGH_BB is set to the
+   fallthrough block.
+
+   *ITER is adjusted to still point to the same statement it was
+   pointing to initially.  */
+
+static void
+insert_if_then_before_iter (gimple cond,
+			    gimple_stmt_iterator *iter,
+			    bool then_more_likely_p,
+			    basic_block *then_bb,
+			    basic_block *fallthrough_bb)
+{
+  gimple_stmt_iterator cond_insert_point =
+    create_cond_insert_point (iter,
+			      /*before_p=*/true,
+			      then_more_likely_p,
+			      then_bb,
+			      fallthrough_bb);
+  gsi_insert_after (&cond_insert_point, cond, GSI_NEW_STMT);
+}
+
 /* Instrument the memory access instruction BASE.  Insert new
-   statements before ITER.
+   statements before or after ITER.
 
    Note that the memory access represented by BASE can be either an
    SSA_NAME, or a non-SSA expression.  LOCATION is the source code
    location.  IS_STORE is TRUE for a store, FALSE for a load.
-   SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
+   BEFORE_P is TRUE for inserting the instrumentation code before
+   ITER, FALSE for inserting it after ITER.  SIZE_IN_BYTES is one of
+   1, 2, 4, 8, 16.
+
+   If BEFORE_P is TRUE, *ITER is arranged to still point to the
+   statement it was pointing to prior to calling this function,
+   otherwise, it points to the statement logically following it.  */
 
 static void
-build_check_stmt (tree base, gimple_stmt_iterator *iter,
-                  location_t location, bool is_store,
-		  int size_in_bytes)
+build_check_stmt (location_t location, tree base, gimple_stmt_iterator *iter,
+		  bool before_p, bool is_store, int size_in_bytes)
 {
   gimple_stmt_iterator gsi;
   basic_block then_bb, else_bb;
@@ -614,10 +657,10 @@ build_check_stmt (tree base, gimple_stmt_iterator *iter,
 
   /* Get an iterator on the point where we can add the condition
      statement for the instrumentation.  */
-  gsi = create_cond_insert_point_before_iter (iter,
-					      /*then_more_likely_p=*/false,
-					      &then_bb,
-					      &else_bb);
+  gsi = create_cond_insert_point (iter, before_p,
+				  /*then_more_likely_p=*/false,
+				  &then_bb,
+				  &else_bb);
 
   base = unshare_expr (base);
 
@@ -749,7 +792,7 @@ build_check_stmt (tree base, gimple_stmt_iterator *iter,
 
 /* If T represents a memory access, add instrumentation code before ITER.
    LOCATION is source code location.
-   IS_STORE is either 1 (for a store) or 0 (for a load).  */
+   IS_STORE is either TRUE (for a store) or FALSE (for a load).  */
 
 static void
 instrument_derefs (gimple_stmt_iterator *iter, tree t,
@@ -784,11 +827,515 @@ instrument_derefs (gimple_stmt_iterator *iter, tree t,
   int volatilep = 0, unsignedp = 0;
   get_inner_reference (t, &bitsize, &bitpos, &offset,
 		       &mode, &unsignedp, &volatilep, false);
-  if (bitpos != 0 || bitsize != size_in_bytes * BITS_PER_UNIT)
+  if (bitpos % (size_in_bytes * BITS_PER_UNIT)
+      || bitsize != size_in_bytes * BITS_PER_UNIT)
     return;
 
   base = build_fold_addr_expr (t);
-  build_check_stmt (base, iter, location, is_store, size_in_bytes);
+  build_check_stmt (location, base, iter, /*before_p=*/true,
+		    is_store, size_in_bytes);
+}
+
+/* Instrument an access to a contiguous memory region that starts at
+   the address pointed to by BASE, over a length of LEN (expressed in
+   the sizeof (*BASE) bytes).  ITER points to the instruction before
+   which the instrumentation instructions must be inserted.  LOCATION
+   is the source location that the instrumentation instructions must
+   have.  If IS_STORE is true, then the memory access is a store;
+   otherwise, it's a load.  */
+
+static void
+instrument_mem_region_access (tree base, tree len,
+			      gimple_stmt_iterator *iter,
+			      location_t location, bool is_store)
+{
+  if (integer_zerop (len))
+    return;
+
+  gimple_stmt_iterator gsi = *iter;
+
+  basic_block fallthrough_bb = NULL, then_bb = NULL;
+  if (!is_gimple_constant (len))
+    {
+      /* So, the length of the memory area to asan-protect is
+	 non-constant.  Let's guard the generated instrumentation code
+	 like:
+
+	 if (len != 0)
+	   {
+	     //asan instrumentation code goes here.
+	   }
+	 // fallthrough instructions, starting with *ITER.  */
+
+      gimple g = gimple_build_cond (NE_EXPR,
+				    len,
+				    build_int_cst (TREE_TYPE (len), 0),
+				    NULL_TREE, NULL_TREE);
+      gimple_set_location (g, location);
+      insert_if_then_before_iter (g, iter, /*then_more_likely_p=*/true,
+				  &then_bb, &fallthrough_bb);
+      /* Note that fallthrough_bb starts with the statement that was
+	 pointed to by ITER.  */
+
+      /* The 'then block' of the 'if (len != 0)' condition is where
+	 we'll generate the asan instrumentation code now.  */
+      gsi = gsi_start_bb (then_bb);
+    }
+
+  /* Instrument the beginning of the memory region to be accessed,
+     and arrange for the rest of the instrumentation code to be
+     inserted in the then block *after* the current gsi.  */
+  build_check_stmt (location, base, &gsi, /*before_p=*/true, is_store, 1);
+
+  if (then_bb)
+    /* We are in the case where the length of the region is not
+       constant; so instrumentation code is being generated in the
+       'then block' of the 'if (len != 0)' condition.  Let's arrange
+       for the subsequent instrumentation statements to go in the
+       'then block'.  */
+    gsi = gsi_last_bb (then_bb);
+  else
+    *iter = gsi;
+
+  /* We want to instrument the access at the end of the memory region,
+     which is at (base + len - 1).  */
+
+  /* offset = len - 1;  */
+  len = unshare_expr (len);
+  gimple offset =
+    gimple_build_assign_with_ops (TREE_CODE (len),
+				  make_ssa_name (TREE_TYPE (len), NULL),
+				  len, NULL);
+  gimple_set_location (offset, location);
+  gsi_insert_before (&gsi, offset, GSI_NEW_STMT);
+
+  offset =
+    gimple_build_assign_with_ops (MINUS_EXPR,
+				  make_ssa_name (size_type_node, NULL),
+				  gimple_assign_lhs (offset),
+				  build_int_cst (size_type_node, 1));
+  gimple_set_location (offset, location);
+  gsi_insert_after (&gsi, offset, GSI_NEW_STMT);
+
+  /* _1 = base;  */
+  base = unshare_expr (base);
+  gimple region_end =
+    gimple_build_assign_with_ops (TREE_CODE (base),
+				  make_ssa_name (TREE_TYPE (base), NULL),
+				  base, NULL);
+  gimple_set_location (region_end, location);
+  gsi_insert_after (&gsi, region_end, GSI_NEW_STMT);
+
+  /* _2 = _1 + offset;  */
+  region_end =
+    gimple_build_assign_with_ops (POINTER_PLUS_EXPR,
+				  make_ssa_name (TREE_TYPE (base), NULL),
+				  gimple_assign_lhs (region_end), 
+				  gimple_assign_lhs (offset));
+  gimple_set_location (region_end, location);
+  gsi_insert_after (&gsi, region_end, GSI_NEW_STMT);
+
+  /* instrument access at _2;  */
+  build_check_stmt (location, gimple_assign_lhs (region_end),
+		    &gsi, /*before_p=*/false, is_store, 1);
+}
+
+/* Instrument the strlen builtin call pointed to by ITER.
+
+   This function instruments the access to the first byte of the
+   argument, right before the call.  After the call it instruments the
+   access to the last byte of the argument; it uses the result of the
+   call to deduce the offset of that last byte.  */
+
+static void
+instrument_strlen_call (gimple_stmt_iterator *iter)
+{
+  gimple call = gsi_stmt (*iter);
+  gcc_assert (is_gimple_call (call));
+
+  tree callee = gimple_call_fndecl (call);
+  gcc_assert (is_builtin_fn (callee)
+	      && DECL_BUILT_IN_CLASS (callee) == BUILT_IN_NORMAL
+	      && DECL_FUNCTION_CODE (callee) == BUILT_IN_STRLEN);
+
+  tree len = gimple_call_lhs (call);
+  if (len == NULL)
+    /* Some passes might clear the return value of the strlen call;
+       bail out in that case.  */
+    return;
+  gcc_assert (INTEGRAL_TYPE_P (TREE_TYPE (len)));
+
+  location_t loc = gimple_location (call);
+  tree str_arg = gimple_call_arg (call, 0);
+
+  /* Instrument the access to the first byte of str_arg.  i.e:
+
+     _1 = str_arg; instrument (_1); */
+  gimple str_arg_ssa =
+    gimple_build_assign_with_ops (NOP_EXPR,
+				  make_ssa_name (build_pointer_type
+						 (char_type_node), NULL),
+				  str_arg, NULL);
+  gimple_set_location (str_arg_ssa, loc);
+  gimple_stmt_iterator gsi = *iter;
+  gsi_insert_before (&gsi, str_arg_ssa, GSI_NEW_STMT);
+  build_check_stmt (loc, gimple_assign_lhs (str_arg_ssa), &gsi,
+		    /*before_p=*/false, /*is_store=*/false, 1);
+
+  /* If we initially had an instruction like:
+
+	 int n = strlen (str)
+
+     we now want to instrument the access to str[n], after the
+     instruction above.  */
+
+  /* So let's build the access to str[n] that is, access through the
+     pointer_plus expr: (_1 + len).  */
+  gimple stmt =
+    gimple_build_assign_with_ops (POINTER_PLUS_EXPR,
+				  make_ssa_name (TREE_TYPE (str_arg),
+						 NULL),
+				  gimple_assign_lhs (str_arg_ssa),
+				  len);
+  gimple_set_location (stmt, loc);
+  gsi_insert_after (&gsi, stmt, GSI_NEW_STMT);
+
+  build_check_stmt (loc, gimple_assign_lhs (stmt), &gsi,
+		    /*before_p=*/false, /*is_store=*/false, 1);
+
+  /* Ensure that iter points to the statement logically following the
+     one it was initially pointing to.  */
+  *iter = gsi;
+}
+
+/* Instrument the call to a built-in memory access function that is
+   pointed to by the iterator ITER.  */
+
+static void
+instrument_builtin_call (gimple_stmt_iterator *iter)
+{
+  gimple call = gsi_stmt (*iter);
+
+  gcc_assert (is_gimple_builtin_call (call));
+
+  tree callee = gimple_call_fndecl (call);
+  location_t loc = gimple_location (call);
+  tree source0 = NULL_TREE, source1 = NULL_TREE,
+    dest = NULL_TREE, len = NULL_TREE;
+  bool is_store = true;
+
+  switch (DECL_FUNCTION_CODE (callee))
+    {
+      /* (s, s, n) style memops.  */
+    case BUILT_IN_BCMP:
+    case BUILT_IN_MEMCMP:
+      len = gimple_call_arg (call, 2);
+      source0 = gimple_call_arg (call, 0);
+      source1 = gimple_call_arg (call, 1);
+      break;
+
+      /* (src, dest, n) style memops.  */
+    case BUILT_IN_BCOPY:
+      len = gimple_call_arg (call, 2);
+      source0 = gimple_call_arg (call, 0);
+      dest = gimple_call_arg (call, 1);
+      break;
+
+      /* (dest, src, n) style memops.  */
+    case BUILT_IN_MEMCPY:
+    case BUILT_IN_MEMCPY_CHK:
+    case BUILT_IN_MEMMOVE:
+    case BUILT_IN_MEMMOVE_CHK:
+    case BUILT_IN_MEMPCPY:
+    case BUILT_IN_MEMPCPY_CHK:
+      dest = gimple_call_arg (call, 0);
+      source0 = gimple_call_arg (call, 1);
+      len = gimple_call_arg (call, 2);
+      break;
+
+      /* (dest, n) style memops.  */
+    case BUILT_IN_BZERO:
+      dest = gimple_call_arg (call, 0);
+      len = gimple_call_arg (call, 1);
+      break;
+
+      /* (dest, x, n) style memops.  */
+    case BUILT_IN_MEMSET:
+    case BUILT_IN_MEMSET_CHK:
+      dest = gimple_call_arg (call, 0);
+      len = gimple_call_arg (call, 2);
+      break;
+
+    case BUILT_IN_STRLEN:
+      instrument_strlen_call (iter);
+      return;
+
+    /* And now the __atomic* and __sync builtins.
+       These are handled differently from the classical memory
+       access builtins above.  */
+
+    case BUILT_IN_ATOMIC_LOAD:
+    case BUILT_IN_ATOMIC_LOAD_1:
+    case BUILT_IN_ATOMIC_LOAD_2:
+    case BUILT_IN_ATOMIC_LOAD_4:
+    case BUILT_IN_ATOMIC_LOAD_8:
+    case BUILT_IN_ATOMIC_LOAD_16:
+      is_store = false;
+      /* fall through.  */
+
+    case BUILT_IN_SYNC_FETCH_AND_ADD_1:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_2:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_4:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_8:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_SUB_1:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_2:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_4:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_8:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_OR_1:
+    case BUILT_IN_SYNC_FETCH_AND_OR_2:
+    case BUILT_IN_SYNC_FETCH_AND_OR_4:
+    case BUILT_IN_SYNC_FETCH_AND_OR_8:
+    case BUILT_IN_SYNC_FETCH_AND_OR_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_AND_1:
+    case BUILT_IN_SYNC_FETCH_AND_AND_2:
+    case BUILT_IN_SYNC_FETCH_AND_AND_4:
+    case BUILT_IN_SYNC_FETCH_AND_AND_8:
+    case BUILT_IN_SYNC_FETCH_AND_AND_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_XOR_1:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_2:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_4:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_8:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_NAND_1:
+    case BUILT_IN_SYNC_FETCH_AND_NAND_2:
+    case BUILT_IN_SYNC_FETCH_AND_NAND_4:
+    case BUILT_IN_SYNC_FETCH_AND_NAND_8:
+
+    case BUILT_IN_SYNC_ADD_AND_FETCH_1:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_2:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_4:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_8:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_SUB_AND_FETCH_1:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_2:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_4:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_8:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_OR_AND_FETCH_1:
+    case BUILT_IN_SYNC_OR_AND_FETCH_2:
+    case BUILT_IN_SYNC_OR_AND_FETCH_4:
+    case BUILT_IN_SYNC_OR_AND_FETCH_8:
+    case BUILT_IN_SYNC_OR_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_AND_AND_FETCH_1:
+    case BUILT_IN_SYNC_AND_AND_FETCH_2:
+    case BUILT_IN_SYNC_AND_AND_FETCH_4:
+    case BUILT_IN_SYNC_AND_AND_FETCH_8:
+    case BUILT_IN_SYNC_AND_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_XOR_AND_FETCH_1:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_2:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_4:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_8:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_NAND_AND_FETCH_1:
+    case BUILT_IN_SYNC_NAND_AND_FETCH_2:
+    case BUILT_IN_SYNC_NAND_AND_FETCH_4:
+    case BUILT_IN_SYNC_NAND_AND_FETCH_8:
+
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_1:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_2:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_4:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_8:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_16:
+
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_1:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_2:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_4:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_8:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_16:
+
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_1:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_2:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_4:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_8:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_16:
+
+    case BUILT_IN_SYNC_LOCK_RELEASE_1:
+    case BUILT_IN_SYNC_LOCK_RELEASE_2:
+    case BUILT_IN_SYNC_LOCK_RELEASE_4:
+    case BUILT_IN_SYNC_LOCK_RELEASE_8:
+    case BUILT_IN_SYNC_LOCK_RELEASE_16:
+
+    case BUILT_IN_ATOMIC_TEST_AND_SET:
+    case BUILT_IN_ATOMIC_CLEAR:
+    case BUILT_IN_ATOMIC_EXCHANGE:
+    case BUILT_IN_ATOMIC_EXCHANGE_1:
+    case BUILT_IN_ATOMIC_EXCHANGE_2:
+    case BUILT_IN_ATOMIC_EXCHANGE_4:
+    case BUILT_IN_ATOMIC_EXCHANGE_8:
+    case BUILT_IN_ATOMIC_EXCHANGE_16:
+
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_1:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_2:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_4:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_8:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_16:
+
+    case BUILT_IN_ATOMIC_STORE:
+    case BUILT_IN_ATOMIC_STORE_1:
+    case BUILT_IN_ATOMIC_STORE_2:
+    case BUILT_IN_ATOMIC_STORE_4:
+    case BUILT_IN_ATOMIC_STORE_8:
+    case BUILT_IN_ATOMIC_STORE_16:
+
+    case BUILT_IN_ATOMIC_ADD_FETCH_1:
+    case BUILT_IN_ATOMIC_ADD_FETCH_2:
+    case BUILT_IN_ATOMIC_ADD_FETCH_4:
+    case BUILT_IN_ATOMIC_ADD_FETCH_8:
+    case BUILT_IN_ATOMIC_ADD_FETCH_16:
+
+    case BUILT_IN_ATOMIC_SUB_FETCH_1:
+    case BUILT_IN_ATOMIC_SUB_FETCH_2:
+    case BUILT_IN_ATOMIC_SUB_FETCH_4:
+    case BUILT_IN_ATOMIC_SUB_FETCH_8:
+    case BUILT_IN_ATOMIC_SUB_FETCH_16:
+
+    case BUILT_IN_ATOMIC_AND_FETCH_1:
+    case BUILT_IN_ATOMIC_AND_FETCH_2:
+    case BUILT_IN_ATOMIC_AND_FETCH_4:
+    case BUILT_IN_ATOMIC_AND_FETCH_8:
+    case BUILT_IN_ATOMIC_AND_FETCH_16:
+
+    case BUILT_IN_ATOMIC_NAND_FETCH_1:
+    case BUILT_IN_ATOMIC_NAND_FETCH_2:
+    case BUILT_IN_ATOMIC_NAND_FETCH_4:
+    case BUILT_IN_ATOMIC_NAND_FETCH_8:
+    case BUILT_IN_ATOMIC_NAND_FETCH_16:
+
+    case BUILT_IN_ATOMIC_XOR_FETCH_1:
+    case BUILT_IN_ATOMIC_XOR_FETCH_2:
+    case BUILT_IN_ATOMIC_XOR_FETCH_4:
+    case BUILT_IN_ATOMIC_XOR_FETCH_8:
+    case BUILT_IN_ATOMIC_XOR_FETCH_16:
+
+    case BUILT_IN_ATOMIC_OR_FETCH_1:
+    case BUILT_IN_ATOMIC_OR_FETCH_2:
+    case BUILT_IN_ATOMIC_OR_FETCH_4:
+    case BUILT_IN_ATOMIC_OR_FETCH_8:
+    case BUILT_IN_ATOMIC_OR_FETCH_16:
+
+    case BUILT_IN_ATOMIC_FETCH_ADD_1:
+    case BUILT_IN_ATOMIC_FETCH_ADD_2:
+    case BUILT_IN_ATOMIC_FETCH_ADD_4:
+    case BUILT_IN_ATOMIC_FETCH_ADD_8:
+    case BUILT_IN_ATOMIC_FETCH_ADD_16:
+
+    case BUILT_IN_ATOMIC_FETCH_SUB_1:
+    case BUILT_IN_ATOMIC_FETCH_SUB_2:
+    case BUILT_IN_ATOMIC_FETCH_SUB_4:
+    case BUILT_IN_ATOMIC_FETCH_SUB_8:
+    case BUILT_IN_ATOMIC_FETCH_SUB_16:
+
+    case BUILT_IN_ATOMIC_FETCH_AND_1:
+    case BUILT_IN_ATOMIC_FETCH_AND_2:
+    case BUILT_IN_ATOMIC_FETCH_AND_4:
+    case BUILT_IN_ATOMIC_FETCH_AND_8:
+    case BUILT_IN_ATOMIC_FETCH_AND_16:
+
+    case BUILT_IN_ATOMIC_FETCH_NAND_1:
+    case BUILT_IN_ATOMIC_FETCH_NAND_2:
+    case BUILT_IN_ATOMIC_FETCH_NAND_4:
+    case BUILT_IN_ATOMIC_FETCH_NAND_8:
+    case BUILT_IN_ATOMIC_FETCH_NAND_16:
+
+    case BUILT_IN_ATOMIC_FETCH_XOR_1:
+    case BUILT_IN_ATOMIC_FETCH_XOR_2:
+    case BUILT_IN_ATOMIC_FETCH_XOR_4:
+    case BUILT_IN_ATOMIC_FETCH_XOR_8:
+    case BUILT_IN_ATOMIC_FETCH_XOR_16:
+
+    case BUILT_IN_ATOMIC_FETCH_OR_1:
+    case BUILT_IN_ATOMIC_FETCH_OR_2:
+    case BUILT_IN_ATOMIC_FETCH_OR_4:
+    case BUILT_IN_ATOMIC_FETCH_OR_8:
+    case BUILT_IN_ATOMIC_FETCH_OR_16:
+      {
+	dest = gimple_call_arg (call, 0);
+	/* So DEST represents the address of a memory location.
+	   instrument_derefs wants the memory location, so let's
+	   dereference the address DEST before handing it to
+	   instrument_derefs.  */
+	if (TREE_CODE (dest) == ADDR_EXPR)
+	  dest = TREE_OPERAND (dest, 0);
+	else if (TREE_CODE (dest) == SSA_NAME)
+	  dest = build2 (MEM_REF, TREE_TYPE (TREE_TYPE (dest)),
+			 dest, build_int_cst (TREE_TYPE (dest), 0));
+	else
+	  gcc_unreachable ();
+
+	instrument_derefs (iter, dest, loc, is_store);
+	return;
+      }
+
+    default:
+      /* The other memory access builtins are not instrumented in this
+	 function because they either don't have any length parameter,
+	 or their length parameter is just a limit.  */
+      break;
+    }
+
+  if (len != NULL_TREE)
+    {
+      if (source0 != NULL_TREE)
+	instrument_mem_region_access (source0, len, iter,
+				      loc, /*is_store=*/false);
+      if (source1 != NULL_TREE)
+	instrument_mem_region_access (source1, len, iter,
+				      loc, /*is_store=*/false);
+      else if (dest != NULL_TREE)
+	instrument_mem_region_access (dest, len, iter,
+				      loc, /*is_store=*/true);
+    }
+}
+
+/*  Instrument the assignment statement ITER if it is subject to
+    instrumentation.  */
+
+static void
+instrument_assignment (gimple_stmt_iterator *iter)
+{
+  gimple s = gsi_stmt (*iter);
+
+  gcc_assert (gimple_assign_single_p (s));
+
+  instrument_derefs (iter, gimple_assign_lhs (s),
+		     gimple_location (s), true);
+  instrument_derefs (iter, gimple_assign_rhs1 (s),
+		     gimple_location (s), false);
+}
+
+/* Instrument the function call pointed to by the iterator ITER, if it
+   is subject to instrumentation.  At the moment, the only function
+   calls that are instrumented are some built-in functions that access
+   memory.  Look at instrument_builtin_call to learn more.  */
+
+static void
+maybe_instrument_call (gimple_stmt_iterator *iter)
+{
+  if (is_gimple_builtin_call (gsi_stmt (*iter)))
+    instrument_builtin_call (iter);
 }
 
 /* asan: this looks too complex. Can this be done simpler? */
@@ -809,13 +1356,12 @@ transform_statements (void)
       if (bb->index >= saved_last_basic_block) continue;
       for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
         {
-          gimple s = gsi_stmt (i);
-          if (!gimple_assign_single_p (s))
-	    continue;
-          instrument_derefs (&i, gimple_assign_lhs (s),
-                             gimple_location (s), true);
-          instrument_derefs (&i, gimple_assign_rhs1 (s),
-                             gimple_location (s), false);
+	  gimple s = gsi_stmt (i);
+
+	  if (gimple_assign_single_p (s))
+	    instrument_assignment (&i);
+	  else if (is_gimple_call (s))
+	    maybe_instrument_call (&i);
         }
     }
 }
diff --git a/gcc/gimple.c b/gcc/gimple.c
index a5c16da..481a4d9 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -4121,6 +4121,22 @@ gimple_decl_printable_name (tree decl, int verbosity)
   return IDENTIFIER_POINTER (DECL_NAME (decl));
 }
 
+/* Return TRUE iff stmt is a call to a built-in function.  */
+
+bool
+is_gimple_builtin_call (gimple stmt)
+{
+  tree callee;
+
+  if (is_gimple_call (stmt)
+      && (callee = gimple_call_fndecl (stmt))
+      && is_builtin_fn (callee)
+      && DECL_BUILT_IN_CLASS (callee) == BUILT_IN_NORMAL)
+    return true;
+
+  return false;
+}
+
 /* Return true when STMT is builtins call to CODE.  */
 
 bool
diff --git a/gcc/gimple.h b/gcc/gimple.h
index 19d45d0..e73fe0d 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -875,6 +875,9 @@ extern bool is_gimple_condexpr (tree);
 /* Returns true iff T is a valid call address expression.  */
 extern bool is_gimple_call_addr (tree);
 
+/* Return TRUE iff stmt is a call to a built-in function.  */
+extern bool is_gimple_builtin_call (gimple stmt);
+
 extern void recalculate_side_effects (tree);
 extern bool gimple_compare_field_offset (tree, tree);
 extern tree gimple_register_canonical_type (tree);
-- 
1.7.11.7
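
A note on the instrument_derefs hunk in the patch above, which replaces
the test "bitpos != 0" with "bitpos % (size_in_bytes * BITS_PER_UNIT)".
The two predicates can be compared in isolation.  This is a sketch that
assumes 8-bit units and uses hypothetical function names; it is not
code from asan.c.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_BITS_PER_UNIT 8   /* Assumption: 8-bit bytes.  */

/* Predicate before the patch: true means "skip, do not instrument".
   Any access that does not start at bit offset 0 is skipped.  */
static bool
toy_skip_old (long bitpos, long bitsize, int size_in_bytes)
{
  return bitpos != 0 || bitsize != size_in_bytes * TOY_BITS_PER_UNIT;
}

/* Predicate after the patch: only accesses whose bit offset is not a
   multiple of the access width, or whose width is not a whole number
   of bytes, are skipped.  */
static bool
toy_skip_new (long bitpos, long bitsize, int size_in_bytes)
{
  assert (size_in_bytes > 0);
  long access_bits = (long) size_in_bytes * TOY_BITS_PER_UNIT;
  return bitpos % access_bits != 0 || bitsize != access_bits;
}
```

For a 4-byte access at bit offset 32 (which I would expect for the
second int member of a struct, though that depends on what
get_inner_reference reports), toy_skip_old (32, 32, 4) is true while
toy_skip_new (32, 32, 4) is false, so such an access is no longer
skipped.  A 3-bit field at bit offset 5 is still skipped by both.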


From: Jakub Jelinek <jakub@redhat.com>
Date: Mon, 12 Nov 2012 11:06:18 +0100
Subject: [PATCH 10/11] Avoid missing one statement when instrumenting strlen
calls

	* asan.c (instrument_strlen_call): Return bool whether the call has
	been instrumented.
	(instrument_builtin_call): Change return value to mean whether
	caller should avoid gsi_next before processing next statement.  Pass
	thru return value from instrument_strlen_call.  Set *iter to gsi for
	the call at the end.
	(maybe_instrument_call): Return bool whether caller should avoid
	gsi_next.
	(transform_statements): Don't do gsi_next if maybe_instrument_call
	returned true.
---
 gcc/asan.c | 59 ++++++++++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 44 insertions(+), 15 deletions(-)
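
The ChangeLog above boils down to a small protocol between the
statement walk and the instrumentation routine: the routine returns
true when it has already moved the iterator to the next statement, and
the loop must then not advance a second time.  Below is a minimal model
of that protocol, with an integer index standing in for
gimple_stmt_iterator.  All names are hypothetical and the
"instrumentation" is just a visit counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_kind { TOY_ASSIGN, TOY_STRLEN_CALL, TOY_OTHER };

/* How many times the walk has visited each statement index.  */
#define TOY_MAX 16
static int toy_visited[TOY_MAX];

/* Model of instrument_strlen_call after the fix: handling the call
   leaves *I on the following statement, and the return value tells
   the caller so.  */
static bool
toy_instrument_call (size_t *i)
{
  *i += 1;
  return true;
}

/* Model of the loop in transform_statements.  With HONOR_RETURN
   false, the loop ignores the return value and always advances,
   which is the behavior the patch fixes.  */
static void
toy_walk (const enum toy_kind *stmts, size_t n, bool honor_return)
{
  assert (n <= TOY_MAX);
  for (size_t i = 0; i < n;)
    {
      bool advanced = false;

      toy_visited[i]++;
      if (stmts[i] == TOY_STRLEN_CALL)
        advanced = toy_instrument_call (&i);
      /* Only advance if the callee has not done so already.  */
      if (!(advanced && honor_return))
        i++;
    }
}
```

On the sequence { assign, strlen call, assign, other }, the walk with
honor_return set visits all four statements, while the walk that
ignores the return value never visits the assignment right after the
strlen call.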

diff --git a/gcc/asan.c b/gcc/asan.c
index ef855fb..639dd9f 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -940,14 +940,21 @@ instrument_mem_region_access (tree base, tree len,
 		    &gsi, /*before_p=*/false, is_store, 1);
 }
 
-/* Instrument the strlen builtin call pointed to by ITER.
+/* Instrument the call (to the builtin strlen function) pointed to by
+   ITER.
 
    This function instruments the access to the first byte of the
    argument, right before the call.  After the call it instruments the
    access to the last byte of the argument; it uses the result of the
-   call to deduce the offset of that last byte.  */
+   call to deduce the offset of that last byte.
 
-static void
+   Upon completion, iff the call has actually been instrumented, this
+   function returns TRUE and *ITER points to the statement logically
+   following the built-in strlen function call *ITER was initially
+   pointing to.  Otherwise, the function returns FALSE and *ITER
+   remains unchanged.  */
+
+static bool
 instrument_strlen_call (gimple_stmt_iterator *iter)
 {
   gimple call = gsi_stmt (*iter);
@@ -961,8 +968,9 @@ instrument_strlen_call (gimple_stmt_iterator *iter)
   tree len = gimple_call_lhs (call);
   if (len == NULL)
     /* Some passes might clear the return value of the strlen call;
-       bail out in that case.  */
-    return;
+       bail out in that case.  Return FALSE as we are not advancing
+       *ITER.  */
+    return false;
   gcc_assert (INTEGRAL_TYPE_P (TREE_TYPE (len)));
 
   location_t loc = gimple_location (call);
@@ -1006,12 +1014,20 @@ instrument_strlen_call (gimple_stmt_iterator *iter)
   /* Ensure that iter points to the statement logically following the
      one it was initially pointing to.  */
   *iter = gsi;
+  /* As *ITER has been advanced to point to the next statement, let's
+     return true to inform transform_statements that it shouldn't
+     advance *ITER anymore; otherwise it will skip that next
+     statement, which wouldn't be instrumented.  */
+  return true;
 }
 
 /* Instrument the call to a built-in memory access function that is
-   pointed to by the iterator ITER.  */
+   pointed to by the iterator ITER.
 
-static void
+   Upon completion, return TRUE iff *ITER has been advanced to the
+   statement following the one it was originally pointing to.  */
+
+static bool
 instrument_builtin_call (gimple_stmt_iterator *iter)
 {
   gimple call = gsi_stmt (*iter);
@@ -1067,8 +1083,7 @@ instrument_builtin_call (gimple_stmt_iterator *iter)
       break;
 
     case BUILT_IN_STRLEN:
-      instrument_strlen_call (iter);
-      return;
+      return instrument_strlen_call (iter);
 
     /* And now the __atomic* and __sync builtins.
        These are handled differently from the classical memory memory
@@ -1286,7 +1301,7 @@ instrument_builtin_call (gimple_stmt_iterator *iter)
 	  gcc_unreachable ();
 
 	instrument_derefs (iter, dest, loc, is_store);
-	return;
+	return false;
       }
 
     default:
@@ -1307,7 +1322,11 @@ instrument_builtin_call (gimple_stmt_iterator *iter)
       else if (dest != NULL_TREE)
 	instrument_mem_region_access (dest, len, iter,
 				      loc, /*is_store=*/true);
+
+      *iter = gsi_for_stmt (call);
+      return false;
     }
+  return false;
 }
 
 /*  Instrument the assignment statement ITER if it is subject to
@@ -1329,13 +1348,17 @@ instrument_assignment (gimple_stmt_iterator *iter)
 /* Instrument the function call pointed to by the iterator ITER, if it
    is subject to instrumentation.  At the moment, the only function
    calls that are instrumented are some built-in functions that access
-   memory.  Look at instrument_builtin_call to learn more.  */
+   memory.  Look at instrument_builtin_call to learn more.
 
-static void
+   Upon completion return TRUE iff *ITER was advanced to the statement
+   following the one it was originally pointing to.  */
+
+static bool
 maybe_instrument_call (gimple_stmt_iterator *iter)
 {
   if (is_gimple_builtin_call (gsi_stmt (*iter)))
-    instrument_builtin_call (iter);
+    return instrument_builtin_call (iter);
+  return false;
 }
 
 /* asan: this looks too complex. Can this be done simpler? */
@@ -1354,14 +1377,20 @@ transform_statements (void)
   FOR_EACH_BB (bb)
     {
       if (bb->index >= saved_last_basic_block) continue;
-      for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
+      for (i = gsi_start_bb (bb); !gsi_end_p (i);)
         {
 	  gimple s = gsi_stmt (i);
 
 	  if (gimple_assign_single_p (s))
 	    instrument_assignment (&i);
 	  else if (is_gimple_call (s))
-	    maybe_instrument_call (&i);
+	    {
+	      if (maybe_instrument_call (&i))
+		/* Avoid gsi_next (&i), because maybe_instrument_call
+		   advanced the I iterator already.  */
+		continue;
+	    }
+	  gsi_next (&i);
         }
     }
 }
-- 
1.7.11.7
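
For readers following the iterator change above, here is a minimal standalone sketch of the same "callee reports whether it advanced the cursor" pattern.  It does not use GCC's gimple_stmt_iterator; the names stmt_kind, maybe_handle and walk are hypothetical stand-ins invented for illustration, and an array index plays the role of the iterator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a statement kind; not a GCC type.  */
enum stmt_kind { STMT_PLAIN, STMT_STRLEN_CALL };

/* Hypothetical handler.  Returns true iff it has already moved *POS
   to the statement following the one it was given.  */
static bool
maybe_handle (const enum stmt_kind *stmts, size_t *pos)
{
  if (stmts[*pos] == STMT_STRLEN_CALL)
    {
      /* Pretend a check was emitted after the call, then step to the
         statement that logically follows the call.  */
      ++*pos;
      return true;
    }
  return false;
}

/* Visit every statement exactly once and return the number of
   visits.  The loop header has no increment; the body decides.  */
static size_t
walk (const enum stmt_kind *stmts, size_t n)
{
  size_t visits = 0;
  for (size_t i = 0; i < n;)
    {
      ++visits;
      if (maybe_handle (stmts, &i))
        /* The handler advanced I already; stepping again would skip
           the next statement.  */
        continue;
      ++i;
    }
  return visits;
}
```

With an unconditional increment in the loop header, the statement after each strlen call would never be visited; moving the increment into the body is what lets the handler's return value suppress it.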


-- 
		Dodji

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 10/10] Import the asan runtime library into GCC tree
  2012-11-06 17:41     ` Diego Novillo
@ 2012-11-12 11:47       ` Dodji Seketeli
  2012-11-12 18:59         ` H.J. Lu
  0 siblings, 1 reply; 80+ messages in thread
From: Dodji Seketeli @ 2012-11-12 11:47 UTC (permalink / raw)
  To: Diego Novillo
  Cc: gcc-patches, jakub, wmi, davidxl, konstantin.s.serebryany, Tobias Burnus

Diego Novillo <dnovillo@google.com> writes:

> On 2012-11-02 16:10 , Dodji Seketeli wrote:
>
>>          * configure.ac: Add libsanitizer to target_libraries.
>> 	* Makefile.def: Ditto.
>> 	* configure: Regenerate.
>> 	* Makefile.in: Regenerate.
>> 	* libsanitizer: New directory for asan runtime.  Contains an empty
>> 	tsan directory.
>>
>> gcc:
>> 	* gcc.c (LINK_COMMAND_SPEC): Add -lasan to link command
>> 	if -faddress-sanitizer is on.
>
> OK with Jakub's comments addressed.
>
> References to -fasan in diagnostics should be replaced.  But there's
> been another flag name change upstream, so let's do it together with
> the new flag names.

Done.   This also addresses the comment later made by Tobias below:

Tobias Burnus <burnus@net-b.de> writes:

> Other issues:

> * Probably fixed on the branch: gcc/gcc.c still has "fasan" instead of
> "faddress-sanitizer" for the spec:
> +    %{fasan:-lasan}

Below is a link to the updated patch.

This patch imports the runtime library in the GCC tree, ensures that
-lasan is passed to the linker when -faddress-sanitizer is used and
sets up the build system accordingly.

	* configure.ac: Add libsanitizer to target_libraries.
	* Makefile.def: Ditto.
	* configure: Regenerate.
	* Makefile.in: Regenerate.
	* libsanitizer: New directory for asan runtime.  Contains an empty
	tsan directory.

gcc:
	* gcc.c (LINK_COMMAND_SPEC): Add -lasan to link command
	if -faddress-sanitizer is on.

libsanitizer:

	Initial checkin: migrate asan runtime from llvm.

http://people.redhat.com/~dseketel/gcc/patches/0011-Import-the-asan-runtime-library-into-GCC-tree.patch

-- 
		Dodji

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 05/10] Implement protection of stack variables
  2012-11-12 11:31       ` Dodji Seketeli
@ 2012-11-12 11:51         ` Jakub Jelinek
  2012-11-12 16:08           ` Dodji Seketeli
  0 siblings, 1 reply; 80+ messages in thread
From: Jakub Jelinek @ 2012-11-12 11:51 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: Diego Novillo, gcc-patches, wmi, davidxl, konstantin.s.serebryany

On Mon, Nov 12, 2012 at 12:30:37PM +0100, Dodji Seketeli wrote:
> + For this function, the stack protected by asan will be organized as
> + follows, from the top of the stack to the bottom:
> +
> + Slot 1/ [red zone of 32 bytes called 'RIGHT RedZone']
> +
> + Slot 2/ [24 bytes for variable 'a']
> +
> + Slot 3/ [8 bytes of red zone, that adds up to the space of 'a' to make
> +	  the next slot be 32 bytes aligned; this one is called Partial
> +	  Redzone; this 32 bytes alignment is an asan constraint]

If you are going from top to bottom, the padding (here Slot 3/) goes above
the variables, so you need to swap Slot 2/ and 3/, 5/ and 6/ and adjust
comment for former slot 6/.
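
The slot sizes in the quoted comment follow from the 32-byte alignment constraint it mentions.  As a rough sketch (partial_red_zone_size and SKETCH_RED_ZONE_ALIGN are hypothetical names for illustration, not functions or macros from asan.c), the partial red zone is whatever pads a variable up to the next multiple of 32:

```c
#include <assert.h>
#include <stddef.h>

/* Alignment taken from the quoted comment; the name is made up for
   this sketch.  */
enum { SKETCH_RED_ZONE_ALIGN = 32 };

/* Hypothetical helper: number of padding bytes needed so that a
   variable of VAR_SIZE bytes plus its partial red zone fills a whole
   number of 32-byte slots.  */
static size_t
partial_red_zone_size (size_t var_size)
{
  size_t rem = var_size % SKETCH_RED_ZONE_ALIGN;
  return rem == 0 ? 0 : SKETCH_RED_ZONE_ALIGN - rem;
}
```

For the 24-byte variable 'a' this gives 8 bytes of padding, and for the 8-byte variable 'b' it gives 24, matching the slot sizes listed in the comment.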

> +
> + Slot 4/ [red zone of 32 bytes called 'Middle RedZone']
> +
> + Slot 5/ [8 bytes for variable 'b']
> +
> + Slot 6/ [24 bytes of Partial Red Zone (similar to slot 3]
> +
> + Slot 7/ [32 bytes of Red Zone at the bottom of the stack, called 'LEFT
> +	  RedZone']
> +
...
> + The shadow memory for that stack layout is going to look like this:
> +
> +     - content of shadow memory 8 bytes for slot 7: 0xFFFFFFFFF1F1F1F1.

Please strip the extra leading FFFFFFFF from the constants, the stores are
all 32-bit and the constants are just sign-extended.
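
To illustrate the sign-extension point, here is a small sketch (sign_extend_32 is a hypothetical helper written for this note, not code from the patch) showing how a 32-bit constant such as 0xF1F1F1F1 acquires the leading FFFFFFFF when it is widened to 64 bits as a signed value:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: widen a 32-bit pattern to 64 bits the way a
   signed 32-bit value would be widened, i.e. by replicating bit 31
   into the upper half.  Written with explicit masking so that it does
   not rely on implementation-defined conversions.  */
static uint64_t
sign_extend_32 (uint32_t bits)
{
  uint64_t wide = bits;
  if (bits & UINT32_C (0x80000000))
    wide |= UINT64_C (0xFFFFFFFF00000000);
  return wide;
}
```

The low 32 bits, which are what a 32-bit store writes, are the same in both forms, so the extra FFFFFFFF carries no information about the shadow contents.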

	Jakub

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 00/13] Request to merge Address Sanitizer in
  2012-11-02 22:53 ` [PATCH 00/13] Request to merge Address Sanitizer in Dodji Seketeli
                     ` (10 preceding siblings ...)
       [not found]   ` <87fw4r7g8w.fsf_-_@redhat.com>
@ 2012-11-12 16:07   ` Dodji Seketeli
  2012-11-12 16:21     ` Jakub Jelinek
  2012-11-12 17:20     ` Jack Howarth
  11 siblings, 2 replies; 80+ messages in thread
From: Dodji Seketeli @ 2012-11-12 16:07 UTC (permalink / raw)
  To: gcc-patches
  Cc: dnovillo, jakub, wmi, davidxl, konstantin.s.serebryany, Tobias Burnus

Following a request from Jakub, and given the fact that the patch set
has been reviewed by Diego, I have committed the last set of patches I
have posted to trunk.

This will hopefully ease the polishing work that has started already.

I am of course watching for the fall-outs.

-- 
		Dodji

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 05/10] Implement protection of stack variables
  2012-11-12 11:51         ` Jakub Jelinek
@ 2012-11-12 16:08           ` Dodji Seketeli
  0 siblings, 0 replies; 80+ messages in thread
From: Dodji Seketeli @ 2012-11-12 16:08 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Diego Novillo, gcc-patches, wmi, davidxl, konstantin.s.serebryany

Jakub Jelinek <jakub@redhat.com> writes:

> On Mon, Nov 12, 2012 at 12:30:37PM +0100, Dodji Seketeli wrote:
>> + For this function, the stack protected by asan will be organized as
>> + follows, from the top of the stack to the bottom:
>> +
>> + Slot 1/ [red zone of 32 bytes called 'RIGHT RedZone']
>> +
>> + Slot 2/ [24 bytes for variable 'a']
>> +
>> + Slot 3/ [8 bytes of red zone, that adds up to the space of 'a' to make
>> +	  the next slot be 32 bytes aligned; this one is called Partial
>> +	  Redzone; this 32 bytes alignment is an asan constraint]
>
> If you are going from top to bottom, the padding (here Slot 3/) goes above
> the variables, so you need to swap Slot 2/ and 3/, 5/ and 6/ and adjust
> comment for former slot 6/.

Done, committed to trunk.

>
>> +
>> + Slot 4/ [red zone of 32 bytes called 'Middle RedZone']
>> +
>> + Slot 5/ [8 bytes for variable 'b']
>> +
>> + Slot 6/ [24 bytes of Partial Red Zone (similar to slot 3]
>> +
>> + Slot 7/ [32 bytes of Red Zone at the bottom of the stack, called 'LEFT
>> +	  RedZone']
>> +
> ...
>> + The shadow memory for that stack layout is going to look like this:
>> +
>> +     - content of shadow memory 8 bytes for slot 7: 0xFFFFFFFFF1F1F1F1.
>
> Please strip the extra leading FFFFFFFF from the constants, the stores are
> all 32-bit and the constants are just sign-extended.

Done, committed to trunk.

-- 
		Dodji

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 00/13] Request to merge Address Sanitizer in
  2012-11-12 16:07   ` [PATCH 00/13] Request to merge Address Sanitizer in Dodji Seketeli
@ 2012-11-12 16:21     ` Jakub Jelinek
  2012-11-12 16:45       ` Tobias Burnus
  2012-11-12 17:20     ` Jack Howarth
  1 sibling, 1 reply; 80+ messages in thread
From: Jakub Jelinek @ 2012-11-12 16:21 UTC (permalink / raw)
  To: Dodji Seketeli; +Cc: gcc-patches

On Mon, Nov 12, 2012 at 05:07:42PM +0100, Dodji Seketeli wrote:
> Following a request from Jakub, and given the fact that the patch set
> have been reviewed by Diego, I have committed the last set of patches I
> have posted to trunk.

Thanks, I've committed as obvious the following formatting cleanup.
Mostly whitespace changes, otherwise just removed two more occurrences of
FFFFFFFF that shouldn't be there.

--- ChangeLog	(revision 193441)
+++ ChangeLog	(working copy)
@@ -1,4 +1,8 @@
-2012-11-12  Wei Mi <wmi@google.com>
+2012-11-12  Jakub Jelinek  <jakub@redhat.com>
+
+	* asan.c: Formatting cleanups.
+
+2012-11-12  Wei Mi  <wmi@google.com>
 
 	* gcc.c (LINK_COMMAND_SPEC): Add -lasan to link command if
 	-faddress-sanitizer is on.
@@ -28,7 +32,6 @@
 	* asan.c (create_cond_insert_point_before_iter): Factorize out of ...
 	(build_check_stmt): ... here.
 
-
 2012-11-12  Dodji Seketeli  <dodji@redhat.com>
 
 	* asan.c (create_cond_insert_point_before_iter): Factorize out of ...
@@ -40,7 +43,7 @@
 	represented by an SSA_NAME.
 
 2012-11-12  Jakub Jelinek  <jakub@redhat.com>
-	    Wei Mi <wmi@google.com>
+	    Wei Mi  <wmi@google.com>
 
 	* varasm.c: Include asan.h.
 	(assemble_noswitch_variable): Grow size by asan_red_zone_size
@@ -111,7 +114,7 @@
 
 2012-11-12  Jakub Jelinek  <jakub@redhat.com>
 	    Xinliang David Li  <davidxl@google.com>
-	    Dodji Seketeli <dodji@redhat.com>
+	    Dodji Seketeli  <dodji@redhat.com>
 
 	* Makefile.in (GTFILES): Add $(srcdir)/asan.c.
 	(asan.o): Update the dependencies of asan.o.
@@ -155,9 +158,9 @@
 	* config/i386/i386.c (ix86_asan_shadow_offset): New function.
 	(TARGET_ASAN_SHADOW_OFFSET): Define.
 
-2012-11-12  Wei Mi <wmi@google.com>
-	    Diego Novillo <dnovillo@google.com>
-	    Dodji Seketeli <dodji@redhat.com>
+2012-11-12  Wei Mi  <wmi@google.com>
+	    Diego Novillo  <dnovillo@google.com>
+	    Dodji Seketeli  <dodji@redhat.com>
 
 	* Makefile.in: Add asan.c and its dependencies.
 	* common.opt: Add -faddress-sanitizer option.
--- asan.c	(revision 193441)
+++ asan.c	(working copy)
@@ -33,42 +33,41 @@ along with GCC; see the file COPYING3.
 #include "optabs.h"
 #include "output.h"
 
-/*
- AddressSanitizer finds out-of-bounds and use-after-free bugs 
- with <2x slowdown on average.
-
- The tool consists of two parts:
- instrumentation module (this file) and a run-time library.
- The instrumentation module adds a run-time check before every memory insn.
-   For a 8- or 16- byte load accessing address X:
-     ShadowAddr = (X >> 3) + Offset
-     ShadowValue = *(char*)ShadowAddr;  // *(short*) for 16-byte access.
-     if (ShadowValue)
-       __asan_report_load8(X);
-   For a load of N bytes (N=1, 2 or 4) from address X:
-     ShadowAddr = (X >> 3) + Offset
-     ShadowValue = *(char*)ShadowAddr;
-     if (ShadowValue)
-       if ((X & 7) + N - 1 > ShadowValue)
-         __asan_report_loadN(X);
- Stores are instrumented similarly, but using __asan_report_storeN functions.
- A call too __asan_init() is inserted to the list of module CTORs.
-
- The run-time library redefines malloc (so that redzone are inserted around
- the allocated memory) and free (so that reuse of free-ed memory is delayed),
- provides __asan_report* and __asan_init functions.
-
- Read more:
- http://code.google.com/p/address-sanitizer/wiki/AddressSanitizerAlgorithm
-
- The current implementation supports detection of out-of-bounds and
- use-after-free in the heap, on the stack and for global variables.
-
- [Protection of stack variables]
-
- To understand how detection of out-of-bounds and use-after-free works
- for stack variables, lets look at this example on x86_64 where the
- stack grows downward:
+/* AddressSanitizer finds out-of-bounds and use-after-free bugs
+   with <2x slowdown on average.
+
+   The tool consists of two parts:
+   instrumentation module (this file) and a run-time library.
+   The instrumentation module adds a run-time check before every memory insn.
+     For a 8- or 16- byte load accessing address X:
+       ShadowAddr = (X >> 3) + Offset
+       ShadowValue = *(char*)ShadowAddr;  // *(short*) for 16-byte access.
+       if (ShadowValue)
+	 __asan_report_load8(X);
+     For a load of N bytes (N=1, 2 or 4) from address X:
+       ShadowAddr = (X >> 3) + Offset
+       ShadowValue = *(char*)ShadowAddr;
+       if (ShadowValue)
+	 if ((X & 7) + N - 1 > ShadowValue)
+	   __asan_report_loadN(X);
+   Stores are instrumented similarly, but using __asan_report_storeN functions.
+   A call too __asan_init() is inserted to the list of module CTORs.
+
+   The run-time library redefines malloc (so that redzone are inserted around
+   the allocated memory) and free (so that reuse of free-ed memory is delayed),
+   provides __asan_report* and __asan_init functions.
+
+   Read more:
+   http://code.google.com/p/address-sanitizer/wiki/AddressSanitizerAlgorithm
+
+   The current implementation supports detection of out-of-bounds and
+   use-after-free in the heap, on the stack and for global variables.
+
+   [Protection of stack variables]
+
+   To understand how detection of out-of-bounds and use-after-free works
+   for stack variables, lets look at this example on x86_64 where the
+   stack grows downward:
 
      int
      foo ()
@@ -82,28 +81,28 @@ along with GCC; see the file COPYING3.
        return a[5] + b[1];
      }
 
- For this function, the stack protected by asan will be organized as
- follows, from the top of the stack to the bottom:
+   For this function, the stack protected by asan will be organized as
+   follows, from the top of the stack to the bottom:
 
- Slot 1/ [red zone of 32 bytes called 'RIGHT RedZone']
+   Slot 1/ [red zone of 32 bytes called 'RIGHT RedZone']
 
- Slot 2/ [8 bytes of red zone, that adds up to the space of 'a' to make
-	  the next slot be 32 bytes aligned; this one is called Partial
-	  Redzone; this 32 bytes alignment is an asan constraint]
+   Slot 2/ [8 bytes of red zone, that adds up to the space of 'a' to make
+	   the next slot be 32 bytes aligned; this one is called Partial
+	   Redzone; this 32 bytes alignment is an asan constraint]
 
- Slot 3/ [24 bytes for variable 'a']
+   Slot 3/ [24 bytes for variable 'a']
 
- Slot 4/ [red zone of 32 bytes called 'Middle RedZone']
+   Slot 4/ [red zone of 32 bytes called 'Middle RedZone']
 
- Slot 5/ [24 bytes of Partial Red Zone (similar to slot 2]
+   Slot 5/ [24 bytes of Partial Red Zone (similar to slot 2]
 
- Slot 6/ [8 bytes for variable 'b']
+   Slot 6/ [8 bytes for variable 'b']
 
- Slot 7/ [32 bytes of Red Zone at the bottom of the stack, called 'LEFT
-	  RedZone']
+   Slot 7/ [32 bytes of Red Zone at the bottom of the stack, called
+	    'LEFT RedZone']
 
- The 32 bytes of LEFT red zone at the bottom of the stack can be
- decomposed as such:
+   The 32 bytes of LEFT red zone at the bottom of the stack can be
+   decomposed as such:
 
      1/ The first 8 bytes contain a magical asan number that is always
      0x41B58AB3.
@@ -122,7 +121,7 @@ along with GCC; see the file COPYING3.
       3/ The following 16 bytes of the red zone have no particular
       format.
 
- The shadow memory for that stack layout is going to look like this:
+   The shadow memory for that stack layout is going to look like this:
 
      - content of shadow memory 8 bytes for slot 7: 0xF1F1F1F1.
        The F1 byte pattern is a magic number called
@@ -149,39 +148,39 @@ along with GCC; see the file COPYING3.
        seat between two 32 aligned slots of {variable,padding}.
 
      - content of shadow memory 8 bytes for slot 3 and 2:
-       0xFFFFFFFFF4000000.  This represents is the concatenation of
+       0xF4000000.  This represents is the concatenation of
        variable 'a' and the partial red zone following it, like what we
        had for variable 'b'.  The least significant 3 bytes being 00
        means that the 3 bytes of variable 'a' are addressable.
 
-     - content of shadow memory 8 bytes for slot 1: 0xFFFFFFFFF3F3F3F3.
+     - content of shadow memory 8 bytes for slot 1: 0xF3F3F3F3.
        The F3 byte pattern is a magic number called
        ASAN_STACK_MAGIC_RIGHT.  It flags the fact that the memory
        region for this shadow byte is a RIGHT red zone intended to seat
        at the top of the variables of the stack.
 
- Note that the real variable layout is done in expand_used_vars in
- cfgexpand.c.  As far as Address Sanitizer is concerned, it lays out
- stack variables as well as the different red zones, emits some
- prologue code to populate the shadow memory as to poison (mark as
- non-accessible) the regions of the red zones and mark the regions of
- stack variables as accessible, and emit some epilogue code to
- un-poison (mark as accessible) the regions of red zones right before
- the function exits.
-
- [Protection of global variables]
-
- The basic idea is to insert a red zone between two global variables
- and install a constructor function that calls the asan runtime to do
- the populating of the relevant shadow memory regions at load time.
-
- So the global variables are laid out as to insert a red zone between
- them. The size of the red zones is so that each variable starts on a
- 32 bytes boundary.
-
- Then a constructor function is installed so that, for each global
- variable, it calls the runtime asan library function
- __asan_register_globals_with an instance of this type:
+   Note that the real variable layout is done in expand_used_vars in
+   cfgexpand.c.  As far as Address Sanitizer is concerned, it lays out
+   stack variables as well as the different red zones, emits some
+   prologue code to populate the shadow memory as to poison (mark as
+   non-accessible) the regions of the red zones and mark the regions of
+   stack variables as accessible, and emit some epilogue code to
+   un-poison (mark as accessible) the regions of red zones right before
+   the function exits.
+
+   [Protection of global variables]
+
+   The basic idea is to insert a red zone between two global variables
+   and install a constructor function that calls the asan runtime to do
+   the populating of the relevant shadow memory regions at load time.
+
+   So the global variables are laid out as to insert a red zone between
+   them. The size of the red zones is so that each variable starts on a
+   32 bytes boundary.
+
+   Then a constructor function is installed so that, for each global
+   variable, it calls the runtime asan library function
+   __asan_register_globals_with an instance of this type:
 
      struct __asan_global
      {
@@ -202,8 +201,8 @@ along with GCC; see the file COPYING3.
        uptr __has_dynamic_init;
      }
 
- A destructor function that calls the runtime asan library function
- _asan_unregister_globals is also installed.  */
+   A destructor function that calls the runtime asan library function
+   _asan_unregister_globals is also installed.  */
 
 alias_set_type asan_shadow_set = -1;
 
@@ -475,7 +474,7 @@ asan_protect_global (tree decl)
     return false;
 #endif
 
-  return true;    
+  return true;
 }
 
 /* Construct a function tree for __asan_report_{load,store}{1,2,4,8,16}.
@@ -490,13 +489,13 @@ report_error_func (bool is_store, int si
   char name[100];
 
   sprintf (name, "__asan_report_%s%d",
-           is_store ? "store" : "load", size_in_bytes);
+	   is_store ? "store" : "load", size_in_bytes);
   fn_type = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE);
   def = build_fn_decl (name, fn_type);
   TREE_NOTHROW (def) = 1;
   TREE_THIS_VOLATILE (def) = 1;  /* Attribute noreturn. Surprise!  */
-  DECL_ATTRIBUTES (def) = tree_cons (get_identifier ("leaf"), 
-                                     NULL, DECL_ATTRIBUTES (def));
+  DECL_ATTRIBUTES (def) = tree_cons (get_identifier ("leaf"),
+				     NULL, DECL_ATTRIBUTES (def));
   DECL_ASSEMBLER_NAME (def);
   return def;
 }
@@ -598,7 +597,7 @@ create_cond_insert_point (gimple_stmt_it
    outcoming edge of the 'then block' -- starts with the statement
    pointed to by ITER.
 
-   COND is the condition of the if.  
+   COND is the condition of the if.
 
    If THEN_MORE_LIKELY_P is true, the probability of the edge to the
    'then block' is higher than the probability of the edge to the
@@ -796,7 +795,7 @@ build_check_stmt (location_t location, t
 
 static void
 instrument_derefs (gimple_stmt_iterator *iter, tree t,
-                  location_t location, bool is_store)
+		  location_t location, bool is_store)
 {
   tree type, base;
   HOST_WIDE_INT size_in_bytes;
@@ -864,7 +863,7 @@ instrument_mem_region_access (tree base,
 	 if (len != 0)
 	   {
 	     //asan instrumentation code goes here.
-           }
+	   }
 	   // falltrough instructions, starting with *ITER.  */
 
       gimple g = gimple_build_cond (NE_EXPR,
@@ -930,7 +929,7 @@ instrument_mem_region_access (tree base,
   region_end =
     gimple_build_assign_with_ops (POINTER_PLUS_EXPR,
 				  make_ssa_name (TREE_TYPE (base), NULL),
-				  gimple_assign_lhs (region_end), 
+				  gimple_assign_lhs (region_end),
 				  gimple_assign_lhs (offset));
   gimple_set_location (region_end, location);
   gsi_insert_after (&gsi, region_end, GSI_NEW_STMT);
@@ -1378,7 +1377,7 @@ transform_statements (void)
     {
       if (bb->index >= saved_last_basic_block) continue;
       for (i = gsi_start_bb (bb); !gsi_end_p (i);)
-        {
+	{
 	  gimple s = gsi_stmt (i);
 
 	  if (gimple_assign_single_p (s))
@@ -1391,7 +1390,7 @@ transform_statements (void)
 		continue;
 	    }
 	  gsi_next (&i);
-        }
+	}
     }
 }
 
@@ -1594,18 +1593,18 @@ struct gimple_opt_pass pass_asan =
 {
  {
   GIMPLE_PASS,
-  "asan",                               /* name  */
-  OPTGROUP_NONE,                        /* optinfo_flags */
-  gate_asan,                            /* gate  */
-  asan_instrument,                      /* execute  */
-  NULL,                                 /* sub  */
-  NULL,                                 /* next  */
-  0,                                    /* static_pass_number  */
-  TV_NONE,                              /* tv_id  */
+  "asan",				/* name  */
+  OPTGROUP_NONE,			/* optinfo_flags */
+  gate_asan,				/* gate  */
+  asan_instrument,			/* execute  */
+  NULL,					/* sub  */
+  NULL,					/* next  */
+  0,					/* static_pass_number  */
+  TV_NONE,				/* tv_id  */
   PROP_ssa | PROP_cfg | PROP_gimple_leh,/* properties_required  */
-  0,                                    /* properties_provided  */
-  0,                                    /* properties_destroyed  */
-  0,                                    /* todo_flags_start  */
+  0,					/* properties_provided  */
+  0,					/* properties_destroyed  */
+  0,					/* todo_flags_start  */
   TODO_verify_flow | TODO_verify_stmts
   | TODO_update_ssa			/* todo_flags_finish  */
  }
@@ -1622,7 +1621,7 @@ struct gimple_opt_pass pass_asan_O0 =
  {
   GIMPLE_PASS,
   "asan0",				/* name  */
-  OPTGROUP_NONE,                        /* optinfo_flags */
+  OPTGROUP_NONE,			/* optinfo_flags */
   gate_asan_O0,				/* gate  */
   asan_instrument,			/* execute  */
   NULL,					/* sub  */


	Jakub
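
As a reading aid for the header comment reformatted above, the mapping "ShadowAddr = (X >> 3) + Offset" can be sketched in plain C as follows.  The names SKETCH_SHADOW_SHIFT and shadow_address are hypothetical; in the compiler the shift is ASAN_SHADOW_SHIFT and the offset comes from the target hook, and any offset used with this sketch is an arbitrary illustrative value.

```c
#include <assert.h>
#include <stdint.h>

/* Shift of 3 as in the quoted comment: one shadow byte describes
   8 bytes of application memory.  The name is made up for this
   sketch.  */
enum { SKETCH_SHADOW_SHIFT = 3 };

/* Hypothetical helper mirroring "ShadowAddr = (X >> 3) + Offset".
   OFFSET is whatever base the target chooses for its shadow region;
   no real target value is assumed here.  */
static uintptr_t
shadow_address (uintptr_t x, uintptr_t offset)
{
  return (x >> SKETCH_SHADOW_SHIFT) + offset;
}
```

All eight addresses within an aligned 8-byte granule map to the same shadow byte, and the next granule maps to the next shadow byte, which is why the stack and global layouts above are described in terms of 32-byte slots covered by 4 shadow bytes each.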

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 00/13] Request to merge Address Sanitizer in
  2012-11-12 16:21     ` Jakub Jelinek
@ 2012-11-12 16:45       ` Tobias Burnus
  2012-11-12 16:51         ` Konstantin Serebryany
  0 siblings, 1 reply; 80+ messages in thread
From: Tobias Burnus @ 2012-11-12 16:45 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Dodji Seketeli, gcc patches

[-- Attachment #1: Type: text/plain, Size: 555 bytes --]

Jakub Jelinek:
> On Mon, Nov 12, 2012 at 05:07:42PM +0100, Dodji Seketeli wrote:
>> Following a request from Jakub, and given the fact that the patch set
>> have been reviewed by Diego, I have committed the last set of patches I
>> have posted to trunk.
>
> Thanks, I've committed as obvious the following formatting cleanup.

Also thanks from my side!

I have also committed a small patch, which restores the single "-f…"
(instead of "--f…" vs. "f…") and which moved -faddress-sanitizer from the
optimization to the debugging option section.

Tobias

[-- Attachment #2: committed.diff --]
[-- Type: text/x-patch, Size: 2254 bytes --]

Index: gcc/ChangeLog
===================================================================
--- gcc/ChangeLog	(Revision 193442)
+++ gcc/ChangeLog	(Arbeitskopie)
@@ -1,3 +1,8 @@
+2012-11-12  Tobias Burnus  <burnus@net-b.de>
+
+	* doc/invoke.texi: Move -faddress-sanitizer from Optimization
+	Options to Debugging Options.
+
 2012-11-12  Jakub Jelinek  <jakub@redhat.com>
 
 	* asan.c: Formatting cleanups.
@@ -164,7 +169,7 @@
 
 	* Makefile.in: Add asan.c and its dependencies.
 	* common.opt: Add -faddress-sanitizer option.
-	* invoke.texi: Document the new flag.
+	* doc/invoke.texi: Document the new flag.
 	* passes.c: Add the asan pass.
 	* toplev.c (compile_file): Call asan_finish_file.
 	* asan.c: New file.
Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi	(Revision 193442)
+++ gcc/doc/invoke.texi	(Arbeitskopie)
@@ -289,7 +289,7 @@ Objective-C and Objective-C++ Dialects}.
 @item Debugging Options
 @xref{Debugging Options,,Options for Debugging Your Program or GCC}.
 @gccoptlist{-d@var{letters}  -dumpspecs  -dumpmachine  -dumpversion @gol
--fdbg-cnt-list -fdbg-cnt=@var{counter-value-list} @gol
+-faddress-sanitizer -fdbg-cnt-list -fdbg-cnt=@var{counter-value-list} @gol
 -fdisable-ipa-@var{pass_name} @gol
 -fdisable-rtl-@var{pass_name} @gol
 -fdisable-rtl-@var{pass-name}=@var{range-list} @gol
@@ -354,10 +354,10 @@ Objective-C and Objective-C++ Dialects}.
 @item Optimization Options
 @xref{Optimize Options,,Options that Control Optimization}.
 @gccoptlist{-falign-functions[=@var{n}] -falign-jumps[=@var{n}] @gol
--falign-labels[=@var{n}] -falign-loops[=@var{n}] -faddress-sanitizer @gol
---fassociative-math fauto-inc-dec -fbranch-probabilities @gol
---fbranch-target-load-optimize fbranch-target-load-optimize2 @gol
---fbtr-bb-exclusive -fcaller-saves @gol
+-falign-labels[=@var{n}] -falign-loops[=@var{n}] @gol
+-fassociative-math -fauto-inc-dec -fbranch-probabilities @gol
+-fbranch-target-load-optimize -fbranch-target-load-optimize2 @gol
+-fbtr-bb-exclusive -fcaller-saves @gol
 -fcheck-data-deps -fcombine-stack-adjustments -fconserve-stack @gol
 -fcompare-elim -fcprop-registers -fcrossjumping @gol
 -fcse-follow-jumps -fcse-skip-blocks -fcx-fortran-rules @gol

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 00/13] Request to merge Address Sanitizer in
  2012-11-12 16:45       ` Tobias Burnus
@ 2012-11-12 16:51         ` Konstantin Serebryany
  0 siblings, 0 replies; 80+ messages in thread
From: Konstantin Serebryany @ 2012-11-12 16:51 UTC (permalink / raw)
  To: Tobias Burnus; +Cc: Jakub Jelinek, Dodji Seketeli, gcc patches

Folks, please remember that the Clang flag has recently changed to
-fsanitize=address (-fsanitize=thread).
This is hopefully the last syntax change there.

--kcc

On Mon, Nov 12, 2012 at 8:45 AM, Tobias Burnus <burnus@net-b.de> wrote:
> Jakub Jelinek:
>
>> On Mon, Nov 12, 2012 at 05:07:42PM +0100, Dodji Seketeli wrote:
>>>
>>> Following a request from Jakub, and given the fact that the patch set
>>> have been reviewed by Diego, I have committed the last set of patches I
>>> have posted to trunk.
>>
>>
>> Thanks, I've committed as obvious the following formatting cleanup.
>
>
> Also thanks from my side!
>
> I have also committed a small patch, which restores the single "-f…"
> (instead of "--f…" vs. "f…" and which moved -faddress-sanitizer from the
> optimization to the debugging option section.
>
> Tobias

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 00/13] Request to merge Address Sanitizer in
  2012-11-12 16:07   ` [PATCH 00/13] Request to merge Address Sanitizer in Dodji Seketeli
  2012-11-12 16:21     ` Jakub Jelinek
@ 2012-11-12 17:20     ` Jack Howarth
  2012-11-12 17:34       ` Jack Howarth
  1 sibling, 1 reply; 80+ messages in thread
From: Jack Howarth @ 2012-11-12 17:20 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: gcc-patches, dnovillo, jakub, wmi, davidxl,
	konstantin.s.serebryany, Tobias Burnus

On Mon, Nov 12, 2012 at 05:07:42PM +0100, Dodji Seketeli wrote:
> Following a request from Jakub, and given the fact that the patch set
> have been reviewed by Diego, I have committed the last set of patches I
> have posted to trunk.
> 
> This will hopefully ease the polishing work that has started already.
> 
> I am of course watching for the fall-outs.
> 
> -- 
> 		Dodji

Dodji,
    I am finding that at r193442 bootstrapping on x86_64-apple-darwin12 fails with...

Making all in interception
/bin/sh ../libtool --tag=CXX   --mode=compile /sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/./gcc/g++ -B/sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/./gcc/ -nostdinc++ -nostdinc++ -I/sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/x86_64-apple-darwin12.2.0/libstdc++-v3/include/x86_64-apple-darwin12.2.0 -I/sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/x86_64-apple-darwin12.2.0/libstdc++-v3/include -I/sw/src/fink.build/gcc48-4.8.0-1000/gcc-4.8-20121112/libstdc++-v3/libsupc++ -I/sw/src/fink.build/gcc48-4.8.0-1000/gcc-4.8-20121112/libstdc++-v3/include/backward -I/sw/src/fink.build/gcc48-4.8.0-1000/gcc-4.8-20121112/libstdc++-v3/testsuite/util -L/sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/x86_64-apple-darwin12.2.0/libstdc++-v3/src -L/sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/x86_64-apple-darwin12.2.0/libstdc++-v3/src/.libs -B/sw/lib/gcc4.8/x86_64-apple-darwin12.2.0/bin/ -B/sw/lib/gcc4.8/x86_64-apple-darwin12.2.0/lib/ -isystem /sw/lib/gcc4.8/x86_64-apple-darwin12.2.0/include -isystem /sw/lib/gcc4.8/x86_64-apple-darwin12.2.0/sys-include    -D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS  -I. -I../../../../gcc-4.8-20121112/libsanitizer/interception  -I ../../../../gcc-4.8-20121112/libsanitizer/include   -Wall -W -Wno-unused-parameter -Wwrite-strings -pedantic -Wno-long-long  -fPIC -fno-builtin -fno-exceptions -fomit-frame-pointer -funwind-tables -fvisibility=hidden -Wno-variadic-macros -Wno-c99-extensions  -g -O2 -MT interception_mac.lo -MD -MP -MF .deps/interception_mac.Tpo -c -o interception_mac.lo ../../../../gcc-4.8-20121112/libsanitizer/interception/interception_mac.cc
libtool: compile:  /sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/./gcc/g++ -B/sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/./gcc/ -nostdinc++ -nostdinc++ -I/sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/x86_64-apple-darwin12.2.0/libstdc++-v3/include/x86_64-apple-darwin12.2.0 -I/sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/x86_64-apple-darwin12.2.0/libstdc++-v3/include -I/sw/src/fink.build/gcc48-4.8.0-1000/gcc-4.8-20121112/libstdc++-v3/libsupc++ -I/sw/src/fink.build/gcc48-4.8.0-1000/gcc-4.8-20121112/libstdc++-v3/include/backward -I/sw/src/fink.build/gcc48-4.8.0-1000/gcc-4.8-20121112/libstdc++-v3/testsuite/util -L/sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/x86_64-apple-darwin12.2.0/libstdc++-v3/src -L/sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/x86_64-apple-darwin12.2.0/libstdc++-v3/src/.libs -B/sw/lib/gcc4.8/x86_64-apple-darwin12.2.0/bin/ -B/sw/lib/gcc4.8/x86_64-apple-darwin12.2.0/lib/ -isystem /sw/lib/gcc4.8/x86_64-apple-darwin12.2.0/include -isystem /sw/lib/gcc4.8/x86_64-apple-darwin12.2.0/sys-include -D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I. -I../../../../gcc-4.8-20121112/libsanitizer/interception -I ../../../../gcc-4.8-20121112/libsanitizer/include -Wall -W -Wno-unused-parameter -Wwrite-strings -pedantic -Wno-long-long -fPIC -fno-builtin -fno-exceptions -fomit-frame-pointer -funwind-tables -fvisibility=hidden -Wno-variadic-macros -Wno-c99-extensions -g -O2 -MT interception_mac.lo -MD -MP -MF .deps/interception_mac.Tpo -c ../../../../gcc-4.8-20121112/libsanitizer/interception/interception_mac.cc  -fno-common -DPIC -o .libs/interception_mac.o
../../../../gcc-4.8-20121112/libsanitizer/interception/interception_mac.cc:16:41: fatal error: mach_override/mach_override.h: No such file or directory
 #include "mach_override/mach_override.h"
                                         ^
compilation terminated.
make[3]: *** [interception_mac.lo] Error 1
make[2]: *** [all-recursive] Error 1
make[1]: *** [all-target-libsanitizer] Error 2
make: *** [all] Error 2

Is this just from a missing file in the merge or do we need to open a PR for this?
         Jack

* Re: [PATCH 00/13] Request to merge Address Sanitizer in
  2012-11-12 17:20     ` Jack Howarth
@ 2012-11-12 17:34       ` Jack Howarth
  2012-11-12 17:37         ` Tobias Burnus
  2012-11-12 17:55         ` Dodji Seketeli
  0 siblings, 2 replies; 80+ messages in thread
From: Jack Howarth @ 2012-11-12 17:34 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: gcc-patches, dnovillo, jakub, wmi, davidxl,
	konstantin.s.serebryany, Tobias Burnus

On Mon, Nov 12, 2012 at 12:20:08PM -0500, Jack Howarth wrote:
> On Mon, Nov 12, 2012 at 05:07:42PM +0100, Dodji Seketeli wrote:
> > Following a request from Jakub, and given the fact that the patch set
> > have been reviewed by Diego, I have committed the last set of patches I
> > have posted to trunk.
> > 
> > This will hopefully ease the polishing work that has started already.
> > 
> > I am of course watching for the fall-outs.
> > 
> > -- 
> > 		Dodji
> 
> Dodji,
>     I am finding that at r193442 bootstrapping on x86_64-apple-darwin12 fails with...
> 
> Making all in interception
> /bin/sh ../libtool --tag=CXX   --mode=compile /sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/./gcc/g++ -B/sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/./gcc/ -nostdinc++ -nostdinc++ -I/sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/x86_64-apple-darwin12.2.0/libstdc++-v3/include/x86_64-apple-darwin12.2.0 -I/sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/x86_64-apple-darwin12.2.0/libstdc++-v3/include -I/sw/src/fink.build/gcc48-4.8.0-1000/gcc-4.8-20121112/libstdc++-v3/libsupc++ -I/sw/src/fink.build/gcc48-4.8.0-1000/gcc-4.8-20121112/libstdc++-v3/include/backward -I/sw/src/fink.build/gcc48-4.8.0-1000/gcc-4.8-20121112/libstdc++-v3/testsuite/util -L/sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/x86_64-apple-darwin12.2.0/libstdc++-v3/src -L/sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/x86_64-apple-darwin12.2.0/libstdc++-v3/src/.libs -B/sw/lib/gcc4.8/x86_64-apple-darwin12.2.0/bin/ -B/sw/lib/gcc4.8/x86_64-apple-darwin12.2.0/lib/ -isystem /sw/lib/gcc4.8/x86_64-apple-darwin12.2.0/include -isystem /sw/lib/gcc4.8/x86_64-apple-darwin12.2.0/sys-include    -D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS  -I. -I../../../../gcc-4.8-20121112/libsanitizer/interception  -I ../../../../gcc-4.8-20121112/libsanitizer/include   -Wall -W -Wno-unused-parameter -Wwrite-strings -pedantic -Wno-long-long  -fPIC -fno-builtin -fno-exceptions -fomit-frame-pointer -funwind-tables -fvisibility=hidden -Wno-variadic-macros -Wno-c99-extensions  -g -O2 -MT interception_mac.lo -MD -MP -MF .deps/interception_mac.Tpo -c -o interception_mac.lo ../../../../gcc-4.8-20121112/libsanitizer/interception/interception_mac.cc
> libtool: compile:  /sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/./gcc/g++ -B/sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/./gcc/ -nostdinc++ -nostdinc++ -I/sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/x86_64-apple-darwin12.2.0/libstdc++-v3/include/x86_64-apple-darwin12.2.0 -I/sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/x86_64-apple-darwin12.2.0/libstdc++-v3/include -I/sw/src/fink.build/gcc48-4.8.0-1000/gcc-4.8-20121112/libstdc++-v3/libsupc++ -I/sw/src/fink.build/gcc48-4.8.0-1000/gcc-4.8-20121112/libstdc++-v3/include/backward -I/sw/src/fink.build/gcc48-4.8.0-1000/gcc-4.8-20121112/libstdc++-v3/testsuite/util -L/sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/x86_64-apple-darwin12.2.0/libstdc++-v3/src -L/sw/src/fink.build/gcc48-4.8.0-1000/darwin_objdir/x86_64-apple-darwin12.2.0/libstdc++-v3/src/.libs -B/sw/lib/gcc4.8/x86_64-apple-darwin12.2.0/bin/ -B/sw/lib/gcc4.8/x86_64-apple-darwin12.2.0/lib/ -isystem /sw/lib/gcc4.8/x86_64-apple-darwin12.2.0/include -isystem /sw/lib/gcc4.8/x86_64-apple-darwin12.2.0/sys-include -D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I. -I../../../../gcc-4.8-20121112/libsanitizer/interception -I ../../../../gcc-4.8-20121112/libsanitizer/include -Wall -W -Wno-unused-parameter -Wwrite-strings -pedantic -Wno-long-long -fPIC -fno-builtin -fno-exceptions -fomit-frame-pointer -funwind-tables -fvisibility=hidden -Wno-variadic-macros -Wno-c99-extensions -g -O2 -MT interception_mac.lo -MD -MP -MF .deps/interception_mac.Tpo -c ../../../../gcc-4.8-20121112/libsanitizer/interception/interception_mac.cc  -fno-common -DPIC -o .libs/interception_mac.o
> ../../../../gcc-4.8-20121112/libsanitizer/interception/interception_mac.cc:16:41: fatal error: mach_override/mach_override.h: No such file or directory
>  #include "mach_override/mach_override.h"
>                                          ^
> compilation terminated.
> make[3]: *** [interception_mac.lo] Error 1
> make[2]: *** [all-recursive] Error 1
> make[1]: *** [all-target-libsanitizer] Error 2
> make: *** [all] Error 2
> 
> Is this just from a missing file in the merge or do we need to open a PR for this?
>          Jack

Dodji,
   Copying over the lib/interception/mach_override directory from llvm.org's
compiler-rt 3.2 branch into libsanitizer/interception allows the build of
libsanitizer to proceed. I noticed that the mach_override subdirectory has a
license file which shows...

Copyright (c) 2003-2009 Jonathan 'Wolf' Rentzsch: <http://rentzsch.com>
Some rights reserved: <http://opensource.org/licenses/mit-license.php>

Hopefully this subdirectory wasn't omitted for licensing reasons, because
without it the bootstrap on darwin is broken.
        Jack

* Re: [PATCH 00/13] Request to merge Address Sanitizer in
  2012-11-12 17:34       ` Jack Howarth
@ 2012-11-12 17:37         ` Tobias Burnus
  2012-11-12 17:57           ` Jack Howarth
  2012-11-12 17:55         ` Dodji Seketeli
  1 sibling, 1 reply; 80+ messages in thread
From: Tobias Burnus @ 2012-11-12 17:37 UTC (permalink / raw)
  To: Jack Howarth
  Cc: Dodji Seketeli, gcc-patches, dnovillo, jakub, wmi, davidxl,
	konstantin.s.serebryany

Jack Howarth wrote:
> Copying over the lib/interception/mach_override directory from 
> llvm.org's compiler-rt 3.2 branch into libsanitizer/interception 
> allows the build of libsanitizer to proceed. I noticed that the 
> mach_override subdirectory has a license file which shows...

I was about to point to:

https://github.com/llvm-mirror/compiler-rt/tree/master/lib/interception/mach_override

which contains the required files.

Tobias

* Re: [PATCH 00/13] Request to merge Address Sanitizer in
  2012-11-12 17:34       ` Jack Howarth
  2012-11-12 17:37         ` Tobias Burnus
@ 2012-11-12 17:55         ` Dodji Seketeli
  2012-11-12 18:40           ` Jack Howarth
  1 sibling, 1 reply; 80+ messages in thread
From: Dodji Seketeli @ 2012-11-12 17:55 UTC (permalink / raw)
  To: Jack Howarth
  Cc: gcc-patches, dnovillo, jakub, wmi, davidxl,
	konstantin.s.serebryany, Tobias Burnus, Wei Mi

Hello Jack,

Jack Howarth <howarth@bromo.med.uc.edu> writes:
>> 
>> Dodji,
>>     I am finding that at r193442 bootstrapping on
>>     x86_64-apple-darwin12 fails with...

>> ../../../../gcc-4.8-20121112/libsanitizer/interception/interception_mac.cc:16:41: fatal error: mach_override/mach_override.h: No such file or directory
>>  #include "mach_override/mach_override.h"
>>                                          ^
>> compilation terminated.

Thank you for pointing this out.

>    Copying over the lib/interception/mach_override directory from llvm.org's compiler-rt 3.2 branch into libsanitizer/interception allows
> the build of libsanitizer to proceed. I noticed that the mach_override
> subdirectory has a license file which shows...

Interesting.

>
> Copyright (c) 2003-2009 Jonathan 'Wolf' Rentzsch: <http://rentzsch.com>
> Some rights reserved: <http://opensource.org/licenses/mit-license.php>
>
> Hopefully this subdirectory wasn't omitted for licensing reasons because without it the bootstrap on darwin
> is broken.

Yeah, hopefully.

Wei, is there any reason why the mach_override directory was left out of
the libsanitizer commit?  Maybe I missed something during my patch
slicing?

Cheers.

-- 
		Dodji

* Re: [PATCH 00/13] Request to merge Address Sanitizer in
  2012-11-12 17:37         ` Tobias Burnus
@ 2012-11-12 17:57           ` Jack Howarth
  0 siblings, 0 replies; 80+ messages in thread
From: Jack Howarth @ 2012-11-12 17:57 UTC (permalink / raw)
  To: Tobias Burnus
  Cc: Dodji Seketeli, gcc-patches, dnovillo, jakub, wmi, davidxl,
	konstantin.s.serebryany

On Mon, Nov 12, 2012 at 06:37:06PM +0100, Tobias Burnus wrote:
> Jack Howarth wrote:
>> Copying over the lib/interception/mach_override directory from  
>> llvm.org's compiler-rt 3.2 branch into libsanitizer/interception  
>> allows the build of libsanitizer to proceed. I noticed that the  
>> mach_override subdirectory has a license file which shows...
>
> I was about to point to:
>
> https://github.com/llvm-mirror/compiler-rt/tree/master/lib/interception/mach_override
>
> which contains the required files.
>
> Tobias

Tobias,
   This still leaves the question of why they weren't merged. Hopefully this
isn't due to some sort of licensing issue for that subdirectory, as we can't
have a bootstrap that requires files from outside of the source tree.
       Jack

* Re: [PATCH 00/13] Request to merge Address Sanitizer in
  2012-11-12 17:55         ` Dodji Seketeli
@ 2012-11-12 18:40           ` Jack Howarth
  0 siblings, 0 replies; 80+ messages in thread
From: Jack Howarth @ 2012-11-12 18:40 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: gcc-patches, dnovillo, jakub, wmi, davidxl,
	konstantin.s.serebryany, Tobias Burnus

On Mon, Nov 12, 2012 at 06:55:32PM +0100, Dodji Seketeli wrote:
> Hello Jack,
> 
> Jack Howarth <howarth@bromo.med.uc.edu> writes:
> >> 
> >> Dodji,
> >>     I am finding that at r193442 bootstrapping on
> >>     x86_64-apple-darwin12 fails with...
> 
> >> ../../../../gcc-4.8-20121112/libsanitizer/interception/interception_mac.cc:16:41: fatal error: mach_override/mach_override.h: No such file or directory
> >>  #include "mach_override/mach_override.h"
> >>                                          ^
> >> compilation terminated.
> 
> Thank you for pointing this out.
> 
> >    Copying over the lib/interception/mach_override directory from llvm.org's compiler-rt 3.2 branch into libsanitizer/interception allows
> > the build of libsanitizer to proceed. I noticed that the mach_override
> > subdirectory has a license file which shows...
> 
> Interesting.
> 
> >
> > Copyright (c) 2003-2009 Jonathan 'Wolf' Rentzsch: <http://rentzsch.com>
> > Some rights reserved: <http://opensource.org/licenses/mit-license.php>
> >
> > Hopefully this subdirectory wasn't omitted for licensing reasons because without it the bootstrap on darwin
> > is broken.
> 
> Yeah, hopefully.
> 
> Wei, is there any reason why the mach_override directory was left out of
> the libsanitizer commit?  Maybe I missed something during my patch
> slicing?

Dodji,
   I don't see the mach_override directory in the asan gcc branch either, so
it appears never to have been ported from llvm's compiler-rt.
            Jack

> 
> Cheers.
> 
> -- 
> 		Dodji

* Re: [PATCH 10/10] Import the asan runtime library into GCC tree
  2012-11-12 11:47       ` Dodji Seketeli
@ 2012-11-12 18:59         ` H.J. Lu
  2012-11-14 11:11           ` H.J. Lu
  0 siblings, 1 reply; 80+ messages in thread
From: H.J. Lu @ 2012-11-12 18:59 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: Diego Novillo, gcc-patches, jakub, wmi, davidxl,
	konstantin.s.serebryany, Tobias Burnus

On Mon, Nov 12, 2012 at 3:47 AM, Dodji Seketeli <dodji@seketeli.org> wrote:
> Diego Novillo <dnovillo@google.com> writes:
>
>> On 2012-11-02 16:10 , Dodji Seketeli wrote:
>>
>>>          * configure.ac: Add libsanitizer to target_libraries.
>>>      * Makefile.def: Ditto.
>>>      * configure: Regenerate.
>>>      * Makefile.in: Regenerate.
>>>      * libsanitizer: New directory for asan runtime.  Contains an empty
>>>      tsan directory.
>>>
>>> gcc:
>>>      * gcc.c (LINK_COMMAND_SPEC): Add -lasan to link command
>>>      if -faddress-sanitizer is on.
>>
>> OK with Jakub's comments addressed.
>>
>> References to -fasan in diagnostics should be replaced.  But there's
>> been another flag name change upstream, so let's do it together with
>> the new flag names.
>
> Done.   This also addresses the comment later made by Tobias below:
>
> Tobias Burnus <burnus@net-b.de> writes:
>
>> Other issues:
>
>> * Probably fixed on the branch: gcc/gcc.c still has "fasan" instead of
>> "faddress-sanitizer" for the spec:
>> +    %{fasan:-lasan}
>
> Below is a link to the updated patch.
>
> This patch imports the runtime library in the GCC tree, ensures that
> -lasan is passed to the linker when -faddress-sanitizer is used and
> sets up the build system accordingly.
>
>      * configure.ac: Add libsanitizer to target_libraries.
>         * Makefile.def: Ditto.
>         * configure: Regenerate.
>         * Makefile.in: Regenerate.
>         * libsanitizer: New directory for asan runtime.  Contains an empty
>         tsan directory.
>
> gcc:
>         * gcc.c (LINK_COMMAND_SPEC): Add -laddress-sanitizer to link command
>         if -faddress-sanitizer is on.
>
> libsanitizer:
>
>         Initial checkin: migrate asan runtime from llvm.
>
> http://people.redhat.com/~dseketel/gcc/patches/0011-Import-the-asan-runtime-library-into-GCC-tree.patch
>
> --
>                 Dodji

I checked in this patch to add libsanitizer generated files.

-- 
H.J.
---
diff --git a/contrib/ChangeLog b/contrib/ChangeLog
index ef5d6f6..233870d 100644
--- a/contrib/ChangeLog
+++ b/contrib/ChangeLog
@@ -1,3 +1,7 @@
+2012-11-12  H.J. Lu  <hongjiu.lu@intel.com>
+
+	* gcc_update: Add libsanitizer generated files.
+
 2012-11-05  Lawrence Crowl  <crowl@google.com>

 	* compare_two_ftime_report_sets: New.
diff --git a/contrib/gcc_update b/contrib/gcc_update
index 02897ab..d9c2dfb 100755
--- a/contrib/gcc_update
+++ b/contrib/gcc_update
@@ -149,6 +149,9 @@ libatomic/Makefile.in: libatomic/Makefile.am libatomic/aclocal.m4
 libatomic/testsuite/Makefile.in: libatomic/testsuite/Makefile.am libatomic/aclocal.m4
 libatomic/configure: libatomic/configure.ac libatomic/aclocal.m4
 libatomic/auto-config.h.in: libatomic/configure.ac libatomic/aclocal.m4
+libsanitizer/aclocal.m4: libsanitizer/configure.ac
+libsanitizer/Makefile.in: libsanitizer/Makefile.am libsanitizer/aclocal.m4
+libsanitizer/configure: libsanitizer/configure.ac libsanitizer/aclocal.m4
 # Top level
 Makefile.in: Makefile.tpl Makefile.def
 configure: configure.ac config/acx.m4

* Re: [PATCH 00/13] Request to merge Address Sanitizer in
  2012-11-01 19:53 [PATCH 00/13] Request to merge Address Sanitizer in dodji
                   ` (12 preceding siblings ...)
  2012-11-02 22:53 ` [PATCH 00/13] Request to merge Address Sanitizer in Dodji Seketeli
@ 2012-11-12 20:39 ` H.J. Lu
  2012-11-12 22:15   ` Ian Lance Taylor
  2012-11-15 19:42 ` Jack Howarth
  14 siblings, 1 reply; 80+ messages in thread
From: H.J. Lu @ 2012-11-12 20:39 UTC (permalink / raw)
  To: dodji
  Cc: gcc-patches, dnovillo, jakub, wmi, davidxl,
	konstantin.s.serebryany, Dodji Seketeli

On Thu, Nov 1, 2012 at 12:52 PM,  <dodji@redhat.com> wrote:
> From: Dodji Seketeli <dodji@seketeli.org>
>
> Hello,
>
> The set of patches following this message represents the work that
> happened on the asan branch to build up the Address Sanitizer work
> started in the Google branch.
>
> Address Sanitizer (aka asan) is a memory error detector.  It finds
> use-after-free and {heap,stack,global}-buffer overflow bugs in C/C++
> programs.
>
> One can learn about the way it works by reading the pdf slides at [1],
> or by reading the documentation on the wiki page of the project at [2].
>
> To make a long story short, it works by associating each memory region
> of eight consecutive bytes with a shadow byte that tells whether if
> each byte of the memory region is addressable or not.  So,
> conceptually, there is a function 'MemToShadow' which, for each set of
> contiguous eight bytes of memory returns a shadow byte that tells
> whether if each byte is accessible or not.
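
For illustration, the 'MemToShadow' mapping described above can be sketched in
C roughly as follows. This is only a sketch under stated assumptions: the shift
of 3 follows from "eight consecutive bytes" per shadow byte, the name
mem_to_shadow is hypothetical, and the offset value is an assumed example of
what a target's asan_shadow_offset hook might return, not a value taken from
this message.

```c
#include <stdint.h>

/* One shadow byte covers 2^3 = 8 application bytes.  */
#define SHADOW_SHIFT 3

/* Assumed, illustrative offset; the real value is target-specific.  */
#define SHADOW_OFFSET ((uint64_t) 1 << 44)

/* Hypothetical helper: address of the shadow byte for ADDR.  */
static uint64_t
mem_to_shadow (uint64_t addr)
{
  return (addr >> SHADOW_SHIFT) + SHADOW_OFFSET;
}
```

With this mapping, the eight addresses 0x1000 through 0x1007 all share one
shadow byte, and 0x1008 maps to the next one.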
>
> Then, each memory access is instrumented by the asan pass to retrieve
> the shadow byte of the accessed memory; if the access is to a memory
> address that is deemed non-accessible, a call to an asan runtime
> library function is issued to report a meaningful error to the user,
> and the access is performed, letting the user program proceed despite
> the error.
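
The check that the instrumentation performs before a memory access might look
something like the sketch below. It is a toy model, not the code the asan pass
emits: it uses a small local shadow array instead of the process-wide mapping,
all names are hypothetical, and the shadow encoding (0 means all eight bytes
are addressable, k in 1..7 means only the first k bytes are, a negative value
means none are) is assumed from the AddressSanitizer algorithm description
rather than stated in this message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: 64 bytes of "application memory", one shadow byte per 8.  */
enum { GRANULE = 8, MEM_SIZE = 64 };
static int8_t shadow[MEM_SIZE / GRANULE];

/* Return true if an access of SIZE bytes (1..8) at OFFSET touches a byte
   the shadow marks as non-addressable.  OFFSET must be below MEM_SIZE and
   the access must not cross an 8-byte granule boundary.  */
static bool
access_is_bad (size_t offset, size_t size)
{
  int8_t s = shadow[offset / GRANULE];

  if (s == 0)
    return false;		/* Whole granule is addressable.  */

  /* Index, within the granule, of the last byte the access touches.  */
  int last = (int) (offset % GRANULE) + (int) size - 1;

  /* Bad if that byte lies at or beyond the addressable prefix; a negative
     shadow value makes every access bad.  */
  return last >= s;
}
```

In real instrumented code, a true result would lead to the call into the
runtime library that reports the error.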
>
> The advantage of this approach, compared to say, Valgrind[4] is the
> lower time and space overhead.  Eventually, when this tool becomes
> more solid, it'll become complementary to Valgrind.
>
> Apart from the compiler components, asan needs a runtime library to
> function.  We share that library with the LLVM implementation of asan
> that is described at [3].  The last patch of the set imports this
> library in its pristine form into our tree.  The plan is to regularly
> synchronize it with its LLVM upstream repository.
>
> On behalf of the GCC asan developers listed below, I am thus proposing
> these patches for inclusion into trunk.  I chose to follow the
> chronological commits that happened on the [asan] branch, to ease the
> authorship propagation.  Except for some few exceptions, each of these
> commits are reasonably logically atomic, so they hopefully shouldn't
> be too hard to review.
>
> The first patch is the initial import of the asan state from the
> Google branch into the [asan] branch.  Subsequent patches clean the
> code up, add features like protection of stack and global variables,
> instrumentation of memory access through built-in functions, and, last
> but not least, the import of the runtime library.
>
> Please note that the ChangeLog.asan is meant to disappear at commit
> time, as its content will be updated (for the dates) and prepended to
> the normal ChangeLog file.
>
> One noticeable shortcoming that we have at the moment is the lack of a
> DejaGNU test harness for this.  This is planned to be addressed as
> soon as possible.
>

Don't we need a bugzilla component for Sanitizer?

-- 
H.J.

* Re: [PATCH 00/13] Request to merge Address Sanitizer in
  2012-11-12 20:39 ` H.J. Lu
@ 2012-11-12 22:15   ` Ian Lance Taylor
  0 siblings, 0 replies; 80+ messages in thread
From: Ian Lance Taylor @ 2012-11-12 22:15 UTC (permalink / raw)
  To: H.J. Lu
  Cc: dodji, gcc-patches, dnovillo, jakub, wmi, davidxl,
	konstantin.s.serebryany, Dodji Seketeli

On Mon, Nov 12, 2012 at 12:39 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>
> Don't we need a bugzilla component for Sanitizer?

"other"?

Ian

* Re: [PATCH 10/10] Import the asan runtime library into GCC tree
  2012-11-12 18:59         ` H.J. Lu
@ 2012-11-14 11:11           ` H.J. Lu
  2012-11-14 11:42             ` H.J. Lu
  0 siblings, 1 reply; 80+ messages in thread
From: H.J. Lu @ 2012-11-14 11:11 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: Diego Novillo, gcc-patches, jakub, wmi, davidxl,
	konstantin.s.serebryany, Tobias Burnus

On Mon, Nov 12, 2012 at 10:59 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Mon, Nov 12, 2012 at 3:47 AM, Dodji Seketeli <dodji@seketeli.org> wrote:
>> Diego Novillo <dnovillo@google.com> writes:
>>
>>> On 2012-11-02 16:10 , Dodji Seketeli wrote:
>>>
>>>>          * configure.ac: Add libsanitizer to target_libraries.
>>>>      * Makefile.def: Ditto.
>>>>      * configure: Regenerate.
>>>>      * Makefile.in: Regenerate.
>>>>      * libsanitizer: New directory for asan runtime.  Contains an empty
>>>>      tsan directory.
>>>>
>>>> gcc:
>>>>      * gcc.c (LINK_COMMAND_SPEC): Add -lasan to link command
>>>>      if -faddress-sanitizer is on.
>>>
>>> OK with Jakub's comments addressed.
>>>
>>> References to -fasan in diagnostics should be replaced.  But there's
>>> been another flag name change upstream, so let's do it together with
>>> the new flag names.
>>
>> Done.   This also addresses the comment later made by Tobias below:
>>
>> Tobias Burnus <burnus@net-b.de> writes:
>>
>>> Other issues:
>>
>>> * Probably fixed on the branch: gcc/gcc.c still has "fasan" instead of
>>> "faddress-sanitizer" for the spec:
>>> +    %{fasan:-lasan}
>>
>> Below is a link to the updated patch.
>>
>> This patch imports the runtime library in the GCC tree, ensures that
>> -lasan is passed to the linker when -faddress-sanitizer is used and
>> sets up the build system accordingly.
>>
>>      * configure.ac: Add libsanitizer to target_libraries.
>>         * Makefile.def: Ditto.
>>         * configure: Regenerate.
>>         * Makefile.in: Regenerate.
>>         * libsanitizer: New directory for asan runtime.  Contains an empty
>>         tsan directory.
>>
>> gcc:
>>         * gcc.c (LINK_COMMAND_SPEC): Add -laddress-sanitizer to link command
>>         if -faddress-sanitizer is on.
>>
>> libsanitizer:
>>
>>         Initial checkin: migrate asan runtime from llvm.
>>
>> http://people.redhat.com/~dseketel/gcc/patches/0011-Import-the-asan-runtime-library-into-GCC-tree.patch
>>
>> --
>>                 Dodji
>
> I checked in this patch to add libsanitizer generated files.
>
> --
> H.J.
> ---
> diff --git a/contrib/ChangeLog b/contrib/ChangeLog
> index ef5d6f6..233870d 100644
> --- a/contrib/ChangeLog
> +++ b/contrib/ChangeLog
> @@ -1,3 +1,7 @@
> +2012-11-12  H.J. Lu  <hongjiu.lu@intel.com>
> +
> +       * gcc_update: Add libsanitizer generated files.
> +
>  2012-11-05  Lawrence Crowl  <crowl@google.com>
>
>         * compare_two_ftime_report_sets: New.
> diff --git a/contrib/gcc_update b/contrib/gcc_update
> index 02897ab..d9c2dfb 100755
> --- a/contrib/gcc_update
> +++ b/contrib/gcc_update
> @@ -149,6 +149,9 @@ libatomic/Makefile.in: libatomic/Makefile.am
> libatomic/aclocal.m4
>  libatomic/testsuite/Makefile.in: libatomic/testsuite/Makefile.am
> libatomic/aclocal.m4
>  libatomic/configure: libatomic/configure.ac libatomic/aclocal.m4
>  libatomic/auto-config.h.in: libatomic/configure.ac libatomic/aclocal.m4
> +libsanitizer/aclocal.m4: libsanitizer/configure.ac
> +libsanitizer/Makefile.in: libsanitizer/Makefile.am libsanitizer/aclocal.m4
> +libsanitizer/configure: libsanitizer/configure.ac libsanitizer/aclocal.m4
>  # Top level
>  Makefile.in: Makefile.tpl Makefile.def
>  configure: configure.ac config/acx.m4

I checked in this to update libsanitizer generated files.

-- 
H.J.
--
Index: ChangeLog
===================================================================
--- ChangeLog	(revision 193496)
+++ ChangeLog	(working copy)
@@ -1,3 +1,7 @@
+2012-11-14  H.J. Lu  <hongjiu.lu@intel.com>
+
+	* gcc_update: Update libsanitizer generated files.
+
 2012-11-12  Tobias Burnus  <burnus@net-b.de>

 	* gcc_update: Add libquadmath generated files.
Index: gcc_update
===================================================================
--- gcc_update	(revision 193496)
+++ gcc_update	(working copy)
@@ -152,9 +152,12 @@
 libatomic/testsuite/Makefile.in: libatomic/testsuite/Makefile.am libatomic/aclocal.m4
 libatomic/configure: libatomic/configure.ac libatomic/aclocal.m4
 libatomic/auto-config.h.in: libatomic/configure.ac libatomic/aclocal.m4
-libsanitizer/aclocal.m4: libsanitizer/configure.ac
+libsanitizer/aclocal.m4: libsanitizer/configure.ac libsanitizer/acinclude.m4
 libsanitizer/Makefile.in: libsanitizer/Makefile.am libsanitizer/aclocal.m4
 libsanitizer/configure: libsanitizer/configure.ac libsanitizer/aclocal.m4
+libsanitizer/asan/Makefile.in: libsanitizer/asan/Makefile.am libsanitizer/aclocal.m4
+libsanitizer/interception/Makefile.in: libsanitizer/interception/Makefile.am libsanitizer/aclocal.m4
+libsanitizer/sanitizer_common/Makefile.in: libsanitizer/sanitizer_common/Makefile.am libsanitizer/aclocal.m4
 # Top level
 Makefile.in: Makefile.tpl Makefile.def
 configure: configure.ac config/acx.m4

* Re: [PATCH 10/10] Import the asan runtime library into GCC tree
  2012-11-14 11:11           ` H.J. Lu
@ 2012-11-14 11:42             ` H.J. Lu
  0 siblings, 0 replies; 80+ messages in thread
From: H.J. Lu @ 2012-11-14 11:42 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: Diego Novillo, gcc-patches, jakub, wmi, davidxl,
	konstantin.s.serebryany, Tobias Burnus

On Wed, Nov 14, 2012 at 3:11 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Mon, Nov 12, 2012 at 10:59 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Mon, Nov 12, 2012 at 3:47 AM, Dodji Seketeli <dodji@seketeli.org> wrote:
>>> Diego Novillo <dnovillo@google.com> writes:
>>>
>>>> On 2012-11-02 16:10 , Dodji Seketeli wrote:
>>>>
>>>>>          * configure.ac: Add libsanitizer to target_libraries.
>>>>>      * Makefile.def: Ditto.
>>>>>      * configure: Regenerate.
>>>>>      * Makefile.in: Regenerate.
>>>>>      * libsanitizer: New directory for asan runtime.  Contains an empty
>>>>>      tsan directory.
>>>>>
>>>>> gcc:
>>>>>      * gcc.c (LINK_COMMAND_SPEC): Add -lasan to link command
>>>>>      if -faddress-sanitizer is on.
>>>>
>>>> OK with Jakub's comments addressed.
>>>>
>>>> References to -fasan in diagnostics should be replaced.  But there's
>>>> been another flag name change upstream, so let's do it together with
>>>> the new flag names.
>>>
>>> Done.   This also addresses the comment later made by Tobias below:
>>>
>>> Tobias Burnus <burnus@net-b.de> writes:
>>>
>>>> Other issues:
>>>
>>>> * Probably fixed on the branch: gcc/gcc.c still has "fasan" instead of
>>>> "faddress-sanitizer" for the spec:
>>>> +    %{fasan:-lasan}
>>>
>>> Below is a link to the updated patch.
>>>
>>> This patch imports the runtime library in the GCC tree, ensures that
>>> -lasan is passed to the linker when -faddress-sanitizer is used and
>>> sets up the build system accordingly.
>>>
>>>      * configure.ac: Add libsanitizer to target_libraries.
>>>         * Makefile.def: Ditto.
>>>         * configure: Regenerate.
>>>         * Makefile.in: Regenerate.
>>>         * libsanitizer: New directory for asan runtime.  Contains an empty
>>>         tsan directory.
>>>
>>> gcc:
>>>         * gcc.c (LINK_COMMAND_SPEC): Add -laddress-sanitizer to link command
>>>         if -faddress-sanitizer is on.
>>>
>>> libsanitizer:
>>>
>>>         Initial checkin: migrate asan runtime from llvm.
>>>
>>> http://people.redhat.com/~dseketel/gcc/patches/0011-Import-the-asan-runtime-library-into-GCC-tree.patch
>>>

I renamed ChangeLog.asan to ChangeLog.

-- 
H.J.

* Re: [PATCH 00/13] Request to merge Address Sanitizer in
  2012-11-01 19:53 [PATCH 00/13] Request to merge Address Sanitizer in dodji
                   ` (13 preceding siblings ...)
  2012-11-12 20:39 ` H.J. Lu
@ 2012-11-15 19:42 ` Jack Howarth
  2012-11-15 23:42   ` Konstantin Serebryany
  14 siblings, 1 reply; 80+ messages in thread
From: Jack Howarth @ 2012-11-15 19:42 UTC (permalink / raw)
  To: dodji
  Cc: gcc-patches, dnovillo, jakub, wmi, davidxl,
	konstantin.s.serebryany, Dodji Seketeli

On Thu, Nov 01, 2012 at 08:52:33PM +0100, dodji@redhat.com wrote:
> From: Dodji Seketeli <dodji@seketeli.org>
> 
> Hello,
> 
> The set of patches following this message represents the work that
> happened on the asan branch to build up the Address Sanitizer work
> started in the Google branch.
> 
> Address Sanitizer (aka asan) is a memory error detector.  It finds
> use-after-free and {heap,stack,global}-buffer overflow bugs in C/C++
> programs.
> 
> One can learn about the way it works by reading the pdf slides at [1],
> or by reading the documentation on the wiki page of the project at [2].
> 
> To make a long story short, it works by associating each memory region
> of eight consecutive bytes with a shadow byte that tells whether if
> each byte of the memory region is addressable or not.  So,
> conceptually, there is a function 'MemToShadow' which, for each set of
> contiguous eight bytes of memory returns a shadow byte that tells
> whether if each byte is accessible or not.
> 
> Then, each memory access is instrumented by the asan pass to retrieve
> the shadow byte of the accessed memory; if the access is to a memory
> address that is deemed non-accessible, a call to an asan runtime
> library function is issued to report a meaningful error to the user,
> and the access is performed, letting the user program proceed despite
> the error.
> 
> The advantage of this approach, compared to say, Valgrind[4] is the
> lower time and space overhead.  Eventually, when this tool becomes
> more solid, it'll become complementary to Valgrind.
> 
> Apart from the compiler components, asan needs a runtime library to
> function.  We share that library with the LLVM implementation of asan
> that is described at [3].  The last patch of the set imports this
> library in its pristine form into our tree.  The plan is to regularly
> synchronize it with its LLVM upstream repository.
> 
> On behalf of the GCC asan developers listed below, I am thus proposing
> these patches for inclusion into trunk.  I chose to follow the
> chronological commits that happened on the [asan] branch, to ease the
> authorship propagation.  Except for some few exceptions, each of these
> commits are reasonably logically atomic, so they hopefully shouldn't
> be too hard to review.
> 
> The first patch is the initial import of the asan state from the
> Google branch into the [asan] branch.  Subsequent patches clean the
> code up, add features like protection of stack and global variables,
> instrumentation of memory access through built-in functions, and, last
> but not least, the import of the runtime library.
> 
> Please note that the ChangeLog.asan is meant to disappear at commit
> time, as its content will be updated (for the dates) and prepended to
> the normal ChangeLog file.
> 
> One noticeable shortcoming that we have at the moment is the lack of a
> DejaGNU test harness for this.  This is planned to be addressed as
> soon as possible.
> 
> Please find below a summary of the patches of the set.
> 
> Thanks.
> 
> [1]: http://gcc.gnu.org/wiki/cauldron2012?action=AttachFile&do=get&target=kcc.pdf
> [2]: http://code.google.com/p/address-sanitizer/wiki/AddressSanitizerAlgorithm
> [3]: http://code.google.com/p/address-sanitizer/w/list
> [4]: http://www.valgrind.org
> 
> Diego Novillo (2):
>   Initial import of asan from the Google branch
>   Rename tree-asan.[ch] to asan.[ch]
> 
> Dodji Seketeli (3):
>   Make build_check_stmt accept an SSA_NAME for its base
>   Factorize condition insertion code out of build_check_stmt
>   Instrument built-in memory access function calls
> 
> Jakub Jelinek (6):
>   Initial asan cleanups
>   Emit GIMPLE directly instead of gimplifying GENERIC.
>   Allow asan at -O0
>   Implement protection of stack variables
>   Implement protection of global variables
>   Fix a couple of ICEs.
> 
> Wei Mi (2):
>   Don't forget to protect 32 bytes aligned global variables.
>   Import the asan runtime library into GCC tree
> 
>  ChangeLog.asan                                     |     7 +
>  Makefile.def                                       |     2 +
>  Makefile.in                                        |   487 +-
>  configure                                          |     1 +
>  configure.ac                                       |     1 +
>  gcc/ChangeLog.asan                                 |   175 +
>  gcc/Makefile.in                                    |    10 +-
>  gcc/asan.c                                         |  1495 ++
>  gcc/asan.h                                         |    70 +
>  gcc/cfgexpand.c                                    |   165 +-
>  gcc/common.opt                                     |     4 +
>  gcc/config/i386/i386.c                             |    11 +
>  gcc/doc/invoke.texi                                |     8 +-
>  gcc/doc/tm.texi                                    |     6 +
>  gcc/doc/tm.texi.in                                 |     2 +
>  gcc/gcc.c                                          |     1 +
>  gcc/passes.c                                       |     2 +
>  gcc/target.def                                     |    11 +
>  gcc/toplev.c                                       |    14 +
>  gcc/tree-pass.h                                    |     2 +
>  gcc/varasm.c                                       |    22 +
>  libasan/ChangeLog.asan                             |     3 +
>  libasan/LICENSE.TXT                                |    97 +
>  libasan/Makefile.am                                |    98 +
>  libasan/Makefile.in                                |   992 ++
>  libasan/README.gcc                                 |     4 +
>  libasan/aclocal.m4                                 |  9645 ++++++++++
>  libasan/asan_allocator.cc                          |  1045 ++
>  libasan/asan_allocator.h                           |   177 +
>  libasan/asan_flags.h                               |   103 +
>  libasan/asan_globals.cc                            |   206 +
>  libasan/asan_intercepted_functions.h               |   217 +
>  libasan/asan_interceptors.cc                       |   704 +
>  libasan/asan_interceptors.h                        |    39 +
>  libasan/asan_internal.h                            |   169 +
>  libasan/asan_linux.cc                              |   150 +
>  libasan/asan_lock.h                                |    40 +
>  libasan/asan_mac.cc                                |   526 +
>  libasan/asan_mac.h                                 |    54 +
>  libasan/asan_malloc_linux.cc                       |   142 +
>  libasan/asan_malloc_mac.cc                         |   427 +
>  libasan/asan_malloc_win.cc                         |   140 +
>  libasan/asan_mapping.h                             |   120 +
>  libasan/asan_new_delete.cc                         |    54 +
>  libasan/asan_poisoning.cc                          |   151 +
>  libasan/asan_posix.cc                              |   118 +
>  libasan/asan_report.cc                             |   492 +
>  libasan/asan_report.h                              |    51 +
>  libasan/asan_rtl.cc                                |   404 +
>  libasan/asan_stack.cc                              |    35 +
>  libasan/asan_stack.h                               |    52 +
>  libasan/asan_stats.cc                              |    86 +
>  libasan/asan_stats.h                               |    65 +
>  libasan/asan_thread.cc                             |   153 +
>  libasan/asan_thread.h                              |   103 +
>  libasan/asan_thread_registry.cc                    |   188 +
>  libasan/asan_thread_registry.h                     |    83 +
>  libasan/asan_win.cc                                |   190 +
>  libasan/config.guess                               |  1530 ++
>  libasan/config.sub                                 |  1773 ++
>  libasan/configure                                  | 17515 +++++++++++++++++++
>  libasan/configure.ac                               |    67 +
>  libasan/depcomp                                    |   630 +
>  libasan/include/sanitizer/asan_interface.h         |   197 +
>  libasan/include/sanitizer/common_interface_defs.h  |    66 +
>  libasan/install-sh                                 |   527 +
>  libasan/interception/interception.h                |   195 +
>  libasan/interception/interception_linux.cc         |    28 +
>  libasan/interception/interception_linux.h          |    35 +
>  libasan/interception/interception_mac.cc           |    29 +
>  libasan/interception/interception_mac.h            |    47 +
>  libasan/interception/interception_win.cc           |   149 +
>  libasan/interception/interception_win.h            |    43 +
>  libasan/libtool-version                            |     6 +
>  libasan/ltmain.sh                                  |  9661 ++++++++++
>  libasan/missing                                    |   376 +
>  libasan/sanitizer_common/sanitizer_allocator.cc    |    83 +
>  libasan/sanitizer_common/sanitizer_allocator64.h   |   573 +
>  libasan/sanitizer_common/sanitizer_atomic.h        |    63 +
>  libasan/sanitizer_common/sanitizer_atomic_clang.h  |   120 +
>  libasan/sanitizer_common/sanitizer_atomic_msvc.h   |   134 +
>  libasan/sanitizer_common/sanitizer_common.cc       |   151 +
>  libasan/sanitizer_common/sanitizer_common.h        |   181 +
>  libasan/sanitizer_common/sanitizer_flags.cc        |    95 +
>  libasan/sanitizer_common/sanitizer_flags.h         |    25 +
>  libasan/sanitizer_common/sanitizer_internal_defs.h |   186 +
>  libasan/sanitizer_common/sanitizer_libc.cc         |   189 +
>  libasan/sanitizer_common/sanitizer_libc.h          |    69 +
>  libasan/sanitizer_common/sanitizer_linux.cc        |   296 +
>  libasan/sanitizer_common/sanitizer_list.h          |   118 +
>  libasan/sanitizer_common/sanitizer_mac.cc          |   249 +
>  libasan/sanitizer_common/sanitizer_mutex.h         |   106 +
>  libasan/sanitizer_common/sanitizer_placement_new.h |    31 +
>  libasan/sanitizer_common/sanitizer_posix.cc        |   187 +
>  libasan/sanitizer_common/sanitizer_printf.cc       |   196 +
>  libasan/sanitizer_common/sanitizer_procmaps.h      |    95 +
>  libasan/sanitizer_common/sanitizer_stackdepot.cc   |   194 +
>  libasan/sanitizer_common/sanitizer_stackdepot.h    |    27 +
>  libasan/sanitizer_common/sanitizer_stacktrace.cc   |   245 +
>  libasan/sanitizer_common/sanitizer_stacktrace.h    |    73 +
>  libasan/sanitizer_common/sanitizer_symbolizer.cc   |   311 +
>  libasan/sanitizer_common/sanitizer_symbolizer.h    |    97 +
>  .../sanitizer_common/sanitizer_symbolizer_linux.cc |   162 +
>  .../sanitizer_common/sanitizer_symbolizer_mac.cc   |    31 +
>  .../sanitizer_common/sanitizer_symbolizer_win.cc   |    33 +
>  libasan/sanitizer_common/sanitizer_win.cc          |   205 +
>  106 files changed, 57193 insertions(+), 25 deletions(-)
>  create mode 100644 ChangeLog.asan
>  create mode 100644 gcc/ChangeLog.asan
>  create mode 100644 gcc/asan.c
>  create mode 100644 gcc/asan.h
>  create mode 100644 libasan/ChangeLog.asan
>  create mode 100644 libasan/LICENSE.TXT
>  create mode 100644 libasan/Makefile.am
>  create mode 100644 libasan/Makefile.in
>  create mode 100644 libasan/README.gcc
>  create mode 100644 libasan/aclocal.m4
>  create mode 100644 libasan/asan_allocator.cc
>  create mode 100644 libasan/asan_allocator.h
>  create mode 100644 libasan/asan_flags.h
>  create mode 100644 libasan/asan_globals.cc
>  create mode 100644 libasan/asan_intercepted_functions.h
>  create mode 100644 libasan/asan_interceptors.cc
>  create mode 100644 libasan/asan_interceptors.h
>  create mode 100644 libasan/asan_internal.h
>  create mode 100644 libasan/asan_linux.cc
>  create mode 100644 libasan/asan_lock.h
>  create mode 100644 libasan/asan_mac.cc
>  create mode 100644 libasan/asan_mac.h
>  create mode 100644 libasan/asan_malloc_linux.cc
>  create mode 100644 libasan/asan_malloc_mac.cc
>  create mode 100644 libasan/asan_malloc_win.cc
>  create mode 100644 libasan/asan_mapping.h
>  create mode 100644 libasan/asan_new_delete.cc
>  create mode 100644 libasan/asan_poisoning.cc
>  create mode 100644 libasan/asan_posix.cc
>  create mode 100644 libasan/asan_report.cc
>  create mode 100644 libasan/asan_report.h
>  create mode 100644 libasan/asan_rtl.cc
>  create mode 100644 libasan/asan_stack.cc
>  create mode 100644 libasan/asan_stack.h
>  create mode 100644 libasan/asan_stats.cc
>  create mode 100644 libasan/asan_stats.h
>  create mode 100644 libasan/asan_thread.cc
>  create mode 100644 libasan/asan_thread.h
>  create mode 100644 libasan/asan_thread_registry.cc
>  create mode 100644 libasan/asan_thread_registry.h
>  create mode 100644 libasan/asan_win.cc
>  create mode 100644 libasan/config.guess
>  create mode 100644 libasan/config.sub
>  create mode 100644 libasan/configure
>  create mode 100644 libasan/configure.ac
>  create mode 100644 libasan/depcomp
>  create mode 100644 libasan/include/sanitizer/asan_interface.h
>  create mode 100644 libasan/include/sanitizer/common_interface_defs.h
>  create mode 100644 libasan/install-sh
>  create mode 100644 libasan/interception/interception.h
>  create mode 100644 libasan/interception/interception_linux.cc
>  create mode 100644 libasan/interception/interception_linux.h
>  create mode 100644 libasan/interception/interception_mac.cc
>  create mode 100644 libasan/interception/interception_mac.h
>  create mode 100644 libasan/interception/interception_win.cc
>  create mode 100644 libasan/interception/interception_win.h
>  create mode 100644 libasan/libtool-version
>  create mode 100644 libasan/ltmain.sh
>  create mode 100644 libasan/missing
>  create mode 100644 libasan/sanitizer_common/sanitizer_allocator.cc
>  create mode 100644 libasan/sanitizer_common/sanitizer_allocator64.h
>  create mode 100644 libasan/sanitizer_common/sanitizer_atomic.h
>  create mode 100644 libasan/sanitizer_common/sanitizer_atomic_clang.h
>  create mode 100644 libasan/sanitizer_common/sanitizer_atomic_msvc.h
>  create mode 100644 libasan/sanitizer_common/sanitizer_common.cc
>  create mode 100644 libasan/sanitizer_common/sanitizer_common.h
>  create mode 100644 libasan/sanitizer_common/sanitizer_flags.cc
>  create mode 100644 libasan/sanitizer_common/sanitizer_flags.h
>  create mode 100644 libasan/sanitizer_common/sanitizer_internal_defs.h
>  create mode 100644 libasan/sanitizer_common/sanitizer_libc.cc
>  create mode 100644 libasan/sanitizer_common/sanitizer_libc.h
>  create mode 100644 libasan/sanitizer_common/sanitizer_linux.cc
>  create mode 100644 libasan/sanitizer_common/sanitizer_list.h
>  create mode 100644 libasan/sanitizer_common/sanitizer_mac.cc
>  create mode 100644 libasan/sanitizer_common/sanitizer_mutex.h
>  create mode 100644 libasan/sanitizer_common/sanitizer_placement_new.h
>  create mode 100644 libasan/sanitizer_common/sanitizer_posix.cc
>  create mode 100644 libasan/sanitizer_common/sanitizer_printf.cc
>  create mode 100644 libasan/sanitizer_common/sanitizer_procmaps.h
>  create mode 100644 libasan/sanitizer_common/sanitizer_stackdepot.cc
>  create mode 100644 libasan/sanitizer_common/sanitizer_stackdepot.h
>  create mode 100644 libasan/sanitizer_common/sanitizer_stacktrace.cc
>  create mode 100644 libasan/sanitizer_common/sanitizer_stacktrace.h
>  create mode 100644 libasan/sanitizer_common/sanitizer_symbolizer.cc
>  create mode 100644 libasan/sanitizer_common/sanitizer_symbolizer.h
>  create mode 100644 libasan/sanitizer_common/sanitizer_symbolizer_linux.cc
>  create mode 100644 libasan/sanitizer_common/sanitizer_symbolizer_mac.cc
>  create mode 100644 libasan/sanitizer_common/sanitizer_symbolizer_win.cc
>  create mode 100644 libasan/sanitizer_common/sanitizer_win.cc
>

Dodji,
    The Google branch is missing the required interception/mach_override/mach_override.h
and interception/mach_override/mach_override.c files from compiler-rt svn for darwin. I have
posted what I believe to be the final patch which enables libsanitizer on darwin...

http://gcc.gnu.org/ml/gcc-patches/2012-11/msg01285.html

which has been tested with the existing asan testsuite, the use-after-free.c testcase as 
well as the Polyhedron 2005 benchmarks for -O1 -g -fno-omit-frame-pointer -faddress-sanitizer
and -O3 -funroll-loops -ffast-math -g -fno-omit-frame-pointer -faddress-sanitizer
to prove that the current mach_override from upstream is sufficient for darwin to use.
Due to the large number of maintainers for libsanitizer, it is unclear who is the person
responsible for upstream merges to lobby for these files to be ported into gcc trunk.
With Alexander Potapenko's commit of the bug fix to mach_override/mach_override.c
required for FSF gcc...

http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121112/155989.html

...there really is no reason to continue to delay (as the interpose code simply won't
be completed in time for gcc 4.8.0). Can we please get some movement on importing
these missing files from upstream? Thanks.
             Jack

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 00/13] Request to merge Address Sanitizer in
  2012-11-15 19:42 ` Jack Howarth
@ 2012-11-15 23:42   ` Konstantin Serebryany
  2012-11-16  8:27     ` Dodji Seketeli
  0 siblings, 1 reply; 80+ messages in thread
From: Konstantin Serebryany @ 2012-11-15 23:42 UTC (permalink / raw)
  To: Jack Howarth
  Cc: dodji, gcc-patches, dnovillo, jakub, wmi, davidxl,
	Dodji Seketeli, Alexander Potapenko

I see no problems with committing mach_override to gcc.
The code should be a verbatim copy of
llvm/projects/compiler-rt/lib/interception/mach_override
Note that this code comes with an MIT license and was not developed by
Google (we did add quite a few patches).

Sorry for the delay in replying; I am behind on email.
Also, Alexander Potapenko is the best person to ask about asan-darwin.
Maybe we can add him to the list of sanitizer maintainers?

--kcc




* Re: [PATCH 00/13] Request to merge Address Sanitizer in
  2012-11-15 23:42   ` Konstantin Serebryany
@ 2012-11-16  8:27     ` Dodji Seketeli
  2012-11-16 14:03       ` Jack Howarth
                         ` (2 more replies)
  0 siblings, 3 replies; 80+ messages in thread
From: Dodji Seketeli @ 2012-11-16  8:27 UTC (permalink / raw)
  To: Konstantin Serebryany
  Cc: Jack Howarth, gcc-patches, dnovillo, jakub, wmi, davidxl,
	Alexander Potapenko, mikestump

Jack Howarth <howarth@bromo.med.uc.edu> writes:

>     The Google branch is missing the required
> interception/mach_override/mach_override.h and
> interception/mach_override/mach_override.c files from compiler-rt svn
> for darwin. I have posted what I believe to be the final patch which
> enables libsanitizer on darwin...
>
> http://gcc.gnu.org/ml/gcc-patches/2012-11/msg01285.html

I see in that thread that Mike Stump has approved the patch if no
asan{-darwin} people disagree.  I'll abide by that principle, FWIW.  :-)

> which has been tested with the existing asan testsuite, the
> use-after-free.c testcase as well as the Polyhedron 2005 benchmarks
> for -O1 -g -fno-omit-frame-pointer -faddress-sanitizer and -O3
> -funroll-loops -ffast-math -g -fno-omit-frame-pointer
> -faddress-sanitizer to prove that the current mach_override from
> upstream is sufficient for darwin to use.

I see.   Thanks.

> Due to the large number of maintainers for libsanitizer, it is unclear
> who is the person responsible for upstream merges to lobby for these
> files to be ported into gcc trunk.  With Alexander Potapenko's commit
> of the bug fix to mach_override/mach_override.c required for FSF
> gcc...
>
> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121112/155989.html
>
> ...there really is no reason to continue to delay (as the interpose code simply won't
> be completed in time for gcc 4.8.0).

It makes sense to me.

> Can we please get some movement on importing these missing files from
> upstream?

Well, given that ....

Konstantin Serebryany <konstantin.s.serebryany@gmail.com> writes:

> I see no problems with committing mach_override to gcc.
> The code should be verbatim copy from
> llvm/projects/compiler-rt/lib/interception/mach_override
> Note that this code comes with an MIT license and was not developed by
> Google (we did add quite a few patches).

... Konstantin, who is one of the libsanitizer maintainers, agrees, I see
no reason to delay this either.

So, Jack, as you are on top of this topic and have the platform to test
at hand, I guess you could just import the missing files from the llvm
repository and commit them to GCC, unless a GCC maintainer disagrees,
of course.

Thus, you could maybe just send the patch of the file you are about to
commit as a reply to this thread, so that Konstantin and Alexander can
officially ACK it?  I am mentioning Alexander because of what Konstantin
is saying ...

> Also, Alexander Potapenko is the best person to ask about asan-darwin.

.... here.

> Maybe we can add him to the list of sanitizer maintainers?

Seconded.  At least for libsanitizer/Darwin.

Cheers.

-- 
		Dodji


* Re: [PATCH 00/13] Request to merge Address Sanitizer in
  2012-11-16  8:27     ` Dodji Seketeli
@ 2012-11-16 14:03       ` Jack Howarth
  2012-11-16 15:57       ` Jack Howarth
  2012-11-16 16:56       ` Alexander Potapenko
  2 siblings, 0 replies; 80+ messages in thread
From: Jack Howarth @ 2012-11-16 14:03 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: Konstantin Serebryany, gcc-patches, dnovillo, jakub, wmi,
	davidxl, Alexander Potapenko, mikestump

[-- Attachment #1: Type: text/plain, Size: 5023 bytes --]

On Fri, Nov 16, 2012 at 09:27:26AM +0100, Dodji Seketeli wrote:
> Jack Howarth <howarth@bromo.med.uc.edu> writes:
> 
> >     The Google branch is missing the required
> > interception/mach_override/mach_override.h and
> > interception/mach_override/mach_override.c files from compiler-rt svn
> > for darwin. I have posted what I believe to be the final patch which
> > enables libsanitizer on darwin...
> >
> > http://gcc.gnu.org/ml/gcc-patches/2012-11/msg01285.html
> 
> I see in that thread that Mike Stump has approved the patch if no
> asan{-darwin} people disagree.  I'll abide by that principle, FWIW.  :-)
> 
> > which has been tested with the existing asan testsuite, the
> > use-after-free.c testcase as well as the Polyhedron 2005 benchmarks
> > for -O1 -g -fno-omit-frame-pointer -faddress-sanitizer and -O3
> > -funroll-loops -ffast-math -g -fno-omit-frame-pointer
> > -faddress-sanitizer to prove that the current mach_override from
> > upstream is sufficient for darwin to use.
> 
> I see.   Thanks.
> 
> > Due to the large number of maintainers for libsanitizer, it is unclear
> > who is the person responsible for upstream merges to lobby for these
> > files to be ported into gcc trunk.  With Alexander Potapenko's commit
> > of the bug fix to mach_override/mach_override.c required for FSF
> > gcc...
> >
> > http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121112/155989.html
> >
> > ...there really is no reason to continue to delay (as the interpose code simply won't
> > be completed in time for gcc 4.8.0).
> 
> It makes sense to me.
> 
> > Can we please get some movement on importing these missing files from
> > upstream?
> 
> Well, given that ....
> 
> Konstantin Serebryany <konstantin.s.serebryany@gmail.com> writes:
> 
> > I see no problems with committing mach_override to gcc.
> > The code should be verbatim copy from
> > llvm/projects/compiler-rt/lib/interception/mach_override
> > Note that this code comes with an MIT license and was not developed by
> > Google (we did add quite a few patches).
> 
> ... Konstantin, who is one of the libsanitizer maintainers, agrees, I see
> no reason to delay this either.
> 
> So, Jack, as you are on top of this topic and have the platform to test
> at hand, I guess you could just import the missing files from the llvm
> repository and commit them to GCC, unless a GCC maintainer disagrees,
> of course.

Can one of the libsanitizer maintainers handle the import? The only
requirement is that they use the mach_override/mach_override.c and
mach_override/mach_override.h from llvm r168032 or later...

Author: glider
Date: Thu Nov 15 02:32:16 2012
New Revision: 168032

URL: http://llvm.org/viewvc/llvm-project?rev=168032&view=rev
Log:
[ASan] Add the "lea $imm(%rip),%rax" instruction to mach_override.c
The need for this has been reported by Jack Howarth (howarth at bromo.med.uc.edu) who's porting ASan-Darwin to GCC

Modified:
    compiler-rt/trunk/lib/interception/mach_override/mach_override.c

Modified: compiler-rt/trunk/lib/interception/mach_override/mach_override.c
URL: http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/interception/mach_override/mach_override.c?rev=168032&r1=168031&r2=168032&view=diff
==============================================================================
--- compiler-rt/trunk/lib/interception/mach_override/mach_override.c (original)
+++ compiler-rt/trunk/lib/interception/mach_override/mach_override.c Thu Nov 15 02:32:16 2012
@@ -725,6 +725,8 @@
         { 0x2, {0xFF, 0x00}, {0x89, 0x00} },                               // mov r/m32,r32 or r/m16,r16
         { 0x3, {0xFF, 0xFF, 0xFF}, {0x49, 0x89, 0xF8} },                   // mov %rdi,%r8
         { 0x4, {0xFF, 0xFF, 0xFF, 0xFF}, {0x40, 0x0F, 0xBE, 0xCE} },       // movsbl %sil,%ecx
+        { 0x7, {0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00},
+               {0x48, 0x8D, 0x05, 0x00, 0x00, 0x00, 0x00} },  // lea $imm(%rip),%rax
         { 0x3, {0xFF, 0xFF, 0xFF}, {0x0F, 0xBE, 0xCE} },  // movsbl, %dh, %ecx
         { 0x3, {0xFF, 0xFF, 0x00}, {0xFF, 0x77, 0x00} },  // pushq $imm(%rdi)
         { 0x2, {0xFF, 0xFF}, {0xDB, 0xE3} }, // fninit


and place these two files in libsanitizer/interception/mach_override. I am unclear on what adjustments
you are making to the licensing comments in these files. The imported files could just be tacked onto
my posted patch (reattached to this message) and done as a single commit. Thanks in advance.
        Jack
ps I assume that these changes should also be committed to the gcc asan branch.
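
For anyone reading along who has not looked at the table that r168032 touches: each row
appears to be a {length, mask, expected bytes} triple, and a candidate instruction matches
when every byte, ANDed with its mask byte, equals the expected byte. The 0x00 mask bytes in
the new row let the 32-bit displacement of the lea vary freely. Below is a minimal sketch of
that idea, not the code from mach_override.c; the names insn_pattern, insn_matches and
lea_rip_rax are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_INSN_LEN 15  /* longest legal x86 instruction, in bytes */

/* Hypothetical mirror of one table row: length, mask, expected bytes. */
typedef struct {
    size_t  length;
    uint8_t mask[MAX_INSN_LEN];
    uint8_t pattern[MAX_INSN_LEN];
} insn_pattern;

/* Returns 1 when the first p->length bytes of code match the row:
   each byte is ANDed with its mask and compared to the expected byte,
   so a 0x00 mask byte accepts any value at that position. */
static int insn_matches(const uint8_t *code, const insn_pattern *p)
{
    assert(p->length <= MAX_INSN_LEN);
    for (size_t i = 0; i < p->length; i++) {
        if ((code[i] & p->mask[i]) != p->pattern[i])
            return 0;
    }
    return 1;
}

/* Values copied from the row added in r168032: lea $imm(%rip),%rax.
   The opcode bytes 48 8D 05 are fixed; the 4 displacement bytes are masked out. */
static const insn_pattern lea_rip_rax = {
    7,
    {0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00},
    {0x48, 0x8D, 0x05, 0x00, 0x00, 0x00, 0x00}
};
```

With these definitions, the byte sequence 48 8D 05 78 56 34 12 (lea 0x12345678(%rip),%rax)
matches the row, while a sequence starting with 49 89 F8 (mov %rdi,%r8) does not.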


> 
> Thus, you could maybe just send the patch of the file you are about to
> commit as a reply to this thread, so that Konstantin and Alexander can
> officially ACK it?  I am mentioning Alexander because of what Konstantin
> is saying ...
> 
> > Also, Alexander Potapenko is the best person to ask about asan-darwin.
> 
> .... here.
> 
> > Maybe we can add him to the list of sanitizer maintainers?
> 
> Seconded.  At least for libsanitizer/Darwin.
> 
> Cheers.
> 
> -- 
> 		Dodji

[-- Attachment #2: asan_v5.diff --]
[-- Type: text/plain, Size: 4186 bytes --]

   The attached patch assumes that the current mach_override/mach_override.h and
mach_override/mach_override.c files have been imported by the libsanitizer maintainers
from llvm compiler-rt svn for use by darwin. The patch adds darwin to the supported
target list in configure.tgt and defines USING_MACH_OVERRIDE for darwin in 
configure.ac. The definition of USING_MACH_OVERRIDE is used in Makefile.am as
the test for appending mach_override/mach_override.c to libinterception_la_SOURCES.
LINK_COMMAND_SPEC_A in gcc/config/darwin.h is modified to add an entry to handle
faddress-sanitizer so that the required linkages are used for libasan. The static
linkage of libasan.a in LINK_COMMAND_SPEC_A is handled separately for -static-libstdc++
(which requires libstdc++.a) and the -static, -static-gcc and -static-gfortran cases.
Tested on x86_64-apple-darwin12 against the mach_override/mach_override.h and
mach_override/mach_override.c from llvm compiler-rt svn for both -m32 and -m64 with
both the use-after-free.c testcase and...

 make -k check RUNTESTFLAGS="asan.exp --target_board=unix'{-m32,-m64}'"

without regressions.
              Jack
ps Note that this patch assumes that both mach_override.h and mach_override.c
reside in a mach_override subdirectory in interception as is the case in the
llvm's compiler-rt.

gcc/

2012-11-15  Jack Howarth <howarth@bromo.med.uc.edu>

	* config/darwin.h (LINK_COMMAND_SPEC_A): Deal with -faddress-sanitizer.

libsanitizer/

2012-11-15  Jack Howarth <howarth@bromo.med.uc.edu>

	* configure.tgt: Add darwin to supported targets.
	* configure.ac: Define USING_MACH_OVERRIDE when on darwin.
	* interception/Makefile.am: Compile mach_override.c when
	USING_MACH_OVERRIDE defined.
	* configure: Regenerated.
	* interception/Makefile.in: Likewise.


Index: libsanitizer/interception/Makefile.am
===================================================================
--- libsanitizer/interception/Makefile.am	(revision 193537)
+++ libsanitizer/interception/Makefile.am	(working copy)
@@ -14,7 +14,11 @@ interception_files = \
         interception_mac.cc \
         interception_win.cc
 
-libinterception_la_SOURCES = $(interception_files) 
+if USING_MACH_OVERRIDE
+libinterception_la_SOURCES = $(interception_files) mach_override/mach_override.c
+else
+libinterception_la_SOURCES = $(interception_files)
+endif
 
 # Work around what appears to be a GNU make bug handling MAKEFLAGS
 # values defined in terms of make variables, as is the case for CC and
Index: libsanitizer/configure.ac
===================================================================
--- libsanitizer/configure.ac	(revision 193537)
+++ libsanitizer/configure.ac	(working copy)
@@ -22,6 +22,12 @@ AC_CANONICAL_SYSTEM
 target_alias=${target_alias-$host_alias}
 AC_SUBST(target_alias)
 
+case "$host" in
+  *-*-darwin*) MACH_OVERRIDE=true ;;
+  *) MACH_OVERRIDE=false ;;
+esac
+AM_CONDITIONAL(USING_MACH_OVERRIDE, $MACH_OVERRIDE)
+
 AM_INIT_AUTOMAKE(foreign)
 AM_ENABLE_MULTILIB(, ..)
 
Index: libsanitizer/configure.tgt
===================================================================
--- libsanitizer/configure.tgt	(revision 193537)
+++ libsanitizer/configure.tgt	(working copy)
@@ -22,6 +22,8 @@
 case "${target}" in
   x86_64-*-linux* | i?86-*-linux*)
 	;;
+  x86_64-*-darwin* | i?86-*-darwin*)
+	;;
   *)
 	UNSUPPORTED=1
 	;;
Index: gcc/config/darwin.h
===================================================================
--- gcc/config/darwin.h	(revision 193537)
+++ gcc/config/darwin.h	(working copy)
@@ -180,6 +180,9 @@ extern GTY(()) int darwin_ms_struct;
     %{L*} %(link_libgcc) %o %{fprofile-arcs|fprofile-generate*|coverage:-lgcov} \
     %{fopenmp|ftree-parallelize-loops=*: \
       %{static|static-libgcc|static-libstdc++|static-libgfortran: libgomp.a%s; : -lgomp } } \
+    %{faddress-sanitizer: \
+      %{static|static-libgcc|static-libgfortran: -framework CoreFoundation -lstdc++ libasan.a%s; \
+      static-libstdc++: -framework CoreFoundation libstdc++.a%s libasan.a%s; : -framework CoreFoundation -lasan } } \
     %{fgnu-tm: \
       %{static|static-libgcc|static-libstdc++|static-libgfortran: libitm.a%s; : -litm } } \
     %{!nostdlib:%{!nodefaultlibs:\


* Re: [PATCH 00/13] Request to merge Address Sanitizer in
  2012-11-16  8:27     ` Dodji Seketeli
  2012-11-16 14:03       ` Jack Howarth
@ 2012-11-16 15:57       ` Jack Howarth
  2012-11-16 16:02         ` Jakub Jelinek
  2012-11-16 16:56       ` Alexander Potapenko
  2 siblings, 1 reply; 80+ messages in thread
From: Jack Howarth @ 2012-11-16 15:57 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: Konstantin Serebryany, gcc-patches, dnovillo, jakub, wmi,
	davidxl, Alexander Potapenko, mikestump

[-- Attachment #1: Type: text/plain, Size: 3908 bytes --]

On Fri, Nov 16, 2012 at 09:27:26AM +0100, Dodji Seketeli wrote:
> Jack Howarth <howarth@bromo.med.uc.edu> writes:
> 
> >     The Google branch is missing the required
> > interception/mach_override/mach_override.h and
> > interception/mach_override/mach_override.c files from compiler-rt svn
> > for darwin. I have posted what I believe to be the final patch which
> > enables libsanitizer on darwin...
> >
> > http://gcc.gnu.org/ml/gcc-patches/2012-11/msg01285.html
> 
> I see in that thread that Mike Stump has approved the patch if no
> asan{-darwin} people disagree.  I'll abide by that principle, FWIW.  :-)
> 
> > which has been tested with the existing asan testsuite, the
> > use-after-free.c testcase as well as the Polyhedron 2005 benchmarks
> > for -O1 -g -fno-omit-frame-pointer -faddress-sanitizer and -O3
> > -funroll-loops -ffast-math -g -fno-omit-frame-pointer
> > -faddress-sanitizer to prove that the current mach_override from
> > upstream is sufficient for darwin to use.
> 
> I see.   Thanks.
> 
> > Due to the large number of maintainers for libsanitizer, it is unclear
> > who is the person responsible for upstream merges to lobby for these
> > files to be ported into gcc trunk.  With Alexander Potapenko's commit
> > of the bug fix to mach_override/mach_override.c required for FSF
> > gcc...
> >
> > http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121112/155989.html
> >
> > ...there really is no reason to continue to delay (as the interpose code simply won't
> > be completed in time for gcc 4.8.0).
> 
> It makes sense to me.
> 
> > Can we please get some movement on importing these missing files from
> > upstream?
> 
> Well, given that ....
> 
> Konstantin Serebryany <konstantin.s.serebryany@gmail.com> writes:
> 
> > I see no problems with committing mach_override to gcc.
> > The code should be verbatim copy from
> > llvm/projects/compiler-rt/lib/interception/mach_override
> > Note that this code comes with an MIT license and was not developed by
> > Google (we did add quite a few patches).
> 
> ... Konstantin, who is one of the libsanitizer maintainers, agrees, I see
> no reason to delay this either.
> 
> So, Jack, as you are on top of this topic and have the platform to test
> at hand, I guess you could just import the missing files from the llvm
> repository and commit them to GCC, unless a GCC maintainer disagrees,
> of course.

Dodji,
    Attached is a new version of the patch adjusted for the bit-rot from
the maintainer-mode checkins of today and with the missing files for
libsanitizer/interception/mach_override/mach_override.c and 
libsanitizer/interception/mach_override/mach_override.h from llvm
compiler-rt svn at...

------------------------------------------------------------------------
r168032 | glider | 2012-11-15 03:32:16 -0500 (Thu, 15 Nov 2012) | 3 lines

[ASan] Add the "lea $imm(%rip),%rax" instruction to mach_override.c
The need for this has been reported by Jack Howarth (howarth@bromo.med.uc.edu) who's porting ASan-Darwin to GCC

added to the patch so that you can make any required licensing adjustments to
the comments at the top of those files. Hopefully we can get this in soon as
the build parts keep bit-rotting.
        Jack
ps I added you to the ChangeLog for the llvm file imports since I believe that should be
done by the libsanitizer maintainers. Thanks in advance for handling the commit in
both gcc trunk and gcc asan branch.

> 
> Thus, you could maybe just send the patch of the file you are about to
> commit as a reply to this thread, so that Konstantin and Alexander can
> officially ACK it?  I am mentioning Alexander because of what Konstantin
> is saying ...
> 
> > Also, Alexander Potapenko is the best person to ask about asan-darwin.
> 
> .... here.
> 
> > Maybe we can add him to the list of sanitizer maintainers?
> 
> Seconded.  At least for libsanitizer/Darwin.
> 
> Cheers.
> 
> -- 
> 		Dodji

[-- Attachment #2: asan_v6.diff --]
[-- Type: text/plain, Size: 42773 bytes --]

gcc/

2012-11-16  Jack Howarth <howarth@bromo.med.uc.edu>

	* config/darwin.h (LINK_COMMAND_SPEC_A): Deal with -faddress-sanitizer.

libsanitizer/

2012-11-16  Dodji Seketeli <dodji@redhat.com>
	    Jack Howarth <howarth@bromo.med.uc.edu>

	* interception/mach_override/mach_override.c: Migrate from llvm.
	* interception/mach_override/mach_override.h: Likewise.
	* configure.tgt: Add darwin to supported targets.
	* configure.ac: Define USING_MACH_OVERRIDE when on darwin.
	* interception/Makefile.am: Compile mach_override.c when
	USING_MACH_OVERRIDE defined.
	* configure: Regenerated.
	* interception/Makefile.in: Likewise.

--- /dev/null	2012-11-16 10:24:58.000000000 -0500
+++ libsanitizer/interception/mach_override/mach_override.c	2012-11-16 10:26:42.000000000 -0500
@@ -0,0 +1,970 @@
+/*******************************************************************************
+	mach_override.c
+		Copyright (c) 2003-2009 Jonathan 'Wolf' Rentzsch: <http://rentzsch.com>
+		Some rights reserved: <http://opensource.org/licenses/mit-license.php>
+
+	***************************************************************************/
+#ifdef __APPLE__
+
+#include "mach_override.h"
+
+#include <mach-o/dyld.h>
+#include <mach/mach_host.h>
+#include <mach/mach_init.h>
+#include <mach/vm_map.h>
+#include <sys/mman.h>
+
+#include <CoreServices/CoreServices.h>
+
+//#define DEBUG_DISASM 1
+#undef DEBUG_DISASM
+
+/**************************
+*	
+*	Constants
+*	
+**************************/
+#pragma mark	-
+#pragma mark	(Constants)
+
+#if defined(__ppc__) || defined(__POWERPC__)
+
+static
+long kIslandTemplate[] = {
+	0x9001FFFC,	//	stw		r0,-4(SP)
+	0x3C00DEAD,	//	lis		r0,0xDEAD
+	0x6000BEEF,	//	ori		r0,r0,0xBEEF
+	0x7C0903A6,	//	mtctr	r0
+	0x8001FFFC,	//	lwz		r0,-4(SP)
+	0x60000000,	//	nop		; optionally replaced
+	0x4E800420 	//	bctr
+};
+
+#define kAddressHi			3
+#define kAddressLo			5
+#define kInstructionHi		10
+#define kInstructionLo		11
+
+#elif defined(__i386__) 
+
+#define kOriginalInstructionsSize 16
+
+static
+unsigned char kIslandTemplate[] = {
+	// kOriginalInstructionsSize nop instructions so that we 
+	// should have enough space to host original instructions 
+	0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 
+	0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90,
+	// Now the real jump instruction
+	0xE9, 0xEF, 0xBE, 0xAD, 0xDE
+};
+
+#define kInstructions	0
+#define kJumpAddress    kInstructions + kOriginalInstructionsSize + 1
+#elif defined(__x86_64__)
+
+#define kOriginalInstructionsSize 32
+
+#define kJumpAddress    kOriginalInstructionsSize + 6
+
+static
+unsigned char kIslandTemplate[] = {
+	// kOriginalInstructionsSize nop instructions so that we 
+	// should have enough space to host original instructions 
+	0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 
+	0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90,
+	0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 
+	0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90,
+	// Now the real jump instruction
+	0xFF, 0x25, 0x00, 0x00, 0x00, 0x00,
+        0x00, 0x00, 0x00, 0x00,
+        0x00, 0x00, 0x00, 0x00
+};
+
+#endif
+
+#define	kAllocateHigh		1
+#define	kAllocateNormal		0
+
+/**************************
+*	
+*	Data Types
+*	
+**************************/
+#pragma mark	-
+#pragma mark	(Data Types)
+
+typedef	struct	{
+	char	instructions[sizeof(kIslandTemplate)];
+	int		allocatedHigh;
+}	BranchIsland;
+
+/**************************
+*	
+*	Funky Protos
+*	
+**************************/
+#pragma mark	-
+#pragma mark	(Funky Protos)
+
+
+	static mach_error_t
+allocateBranchIsland(
+		BranchIsland	**island,
+		int				allocateHigh,
+		void *originalFunctionAddress);
+
+	static mach_error_t
+freeBranchIsland(
+		BranchIsland	*island );
+
+	static mach_error_t
+defaultIslandMalloc(
+	  void **ptr, size_t unused_size, void *hint);
+
+	static mach_error_t
+defaultIslandFree(
+   	void *ptr);
+
+#if defined(__ppc__) || defined(__POWERPC__)
+	static mach_error_t
+setBranchIslandTarget(
+		BranchIsland	*island,
+		const void		*branchTo,
+		long			instruction );
+#endif 
+
+#if defined(__i386__) || defined(__x86_64__)
+static mach_error_t
+setBranchIslandTarget_i386(
+						   BranchIsland	*island,
+						   const void		*branchTo,
+						   char*			instructions );
+// Can't be made static because there's no C implementation for atomic_mov64
+// on i386.
+void 
+atomic_mov64(
+		uint64_t *targetAddress,
+		uint64_t value ) __attribute__((visibility("hidden")));
+
+	static Boolean 
+eatKnownInstructions( 
+	unsigned char	*code, 
+	uint64_t		*newInstruction,
+	int				*howManyEaten, 
+	char			*originalInstructions,
+	int				*originalInstructionCount, 
+	uint8_t			*originalInstructionSizes );
+
+	static void
+fixupInstructions(
+    void		*originalFunction,
+    void		*escapeIsland,
+    void		*instructionsToFix,
+	int			instructionCount,
+	uint8_t		*instructionSizes );
+
+#ifdef DEBUG_DISASM
+	static void
+dump16Bytes(
+	void	*ptr);
+#endif  // DEBUG_DISASM
+#endif
+
+/*******************************************************************************
+*	
+*	Interface
+*	
+*******************************************************************************/
+#pragma mark	-
+#pragma mark	(Interface)
+
+#if defined(__i386__) || defined(__x86_64__)
+static mach_error_t makeIslandExecutable(void *address) {
+	mach_error_t err = err_none;
+    vm_size_t pageSize;
+    host_page_size( mach_host_self(), &pageSize );
+    uintptr_t page = (uintptr_t)address & ~(uintptr_t)(pageSize-1);
+    int e = err_none;
+    e |= mprotect((void *)page, pageSize, PROT_EXEC | PROT_READ | PROT_WRITE);
+    e |= msync((void *)page, pageSize, MS_INVALIDATE );
+    if (e) {
+        err = err_cannot_override;
+    }
+    return err;
+}
+#endif
+
+		static mach_error_t
+defaultIslandMalloc(
+	void **ptr, size_t unused_size, void *hint) {
+  return allocateBranchIsland( (BranchIsland**)ptr, kAllocateHigh, hint );
+}
+		static mach_error_t
+defaultIslandFree(
+	void *ptr) {
+	return freeBranchIsland(ptr);
+}
+
+    mach_error_t
+__asan_mach_override_ptr(
+	void *originalFunctionAddress,
+    const void *overrideFunctionAddress,
+    void **originalFunctionReentryIsland )
+{
+  return __asan_mach_override_ptr_custom(originalFunctionAddress,
+		overrideFunctionAddress,
+		originalFunctionReentryIsland,
+		defaultIslandMalloc,
+		defaultIslandFree);
+}
+
+    mach_error_t
+__asan_mach_override_ptr_custom(
+	void *originalFunctionAddress,
+    const void *overrideFunctionAddress,
+    void **originalFunctionReentryIsland,
+		island_malloc *alloc,
+		island_free *dealloc)
+{
+	assert( originalFunctionAddress );
+	assert( overrideFunctionAddress );
+	
+	// this addresses overriding such functions as AudioOutputUnitStart()
+	// test with modified DefaultOutputUnit project
+#if defined(__x86_64__)
+    for(;;){
+        if(*(uint16_t*)originalFunctionAddress==0x25FF)    // jmp qword near [rip+0x????????]
+            originalFunctionAddress=*(void**)((char*)originalFunctionAddress+6+*(int32_t *)((uint16_t*)originalFunctionAddress+1));
+        else break;
+    }
+#elif defined(__i386__)
+    for(;;){
+        if(*(uint16_t*)originalFunctionAddress==0x25FF)    // jmp *0x????????
+            originalFunctionAddress=**(void***)((uint16_t*)originalFunctionAddress+1);
+        else break;
+    }
+#endif
+#ifdef DEBUG_DISASM
+  {
+    fprintf(stderr, "Replacing function at %p\n", originalFunctionAddress);
+    fprintf(stderr, "First 16 bytes of the function: ");
+    unsigned char *orig = (unsigned char *)originalFunctionAddress;
+    int i;
+    for (i = 0; i < 16; i++) {
+       fprintf(stderr, "%x ", (unsigned int) orig[i]);
+    }
+    fprintf(stderr, "\n");
+    fprintf(stderr, 
+            "To disassemble, save the following function as disas.c"
+            " and run:\n  gcc -c disas.c && gobjdump -d disas.o\n"
+            "The first 16 bytes of the original function will start"
+            " after four nop instructions.\n");
+    fprintf(stderr, "\nvoid foo() {\n  asm volatile(\"nop;nop;nop;nop;\");\n");
+    int j = 0;
+    for (j = 0; j < 2; j++) {
+      fprintf(stderr, "  asm volatile(\".byte ");
+      for (i = 8 * j; i < 8 * (j+1) - 1; i++) {
+        fprintf(stderr, "0x%x, ", (unsigned int) orig[i]);
+      }
+      fprintf(stderr, "0x%x;\");\n", (unsigned int) orig[8 * (j+1) - 1]);
+    }
+    fprintf(stderr, "}\n\n");
+  }
+#endif
+
+	long	*originalFunctionPtr = (long*) originalFunctionAddress;
+	mach_error_t	err = err_none;
+	
+#if defined(__ppc__) || defined(__POWERPC__)
+	//	Ensure first instruction isn't 'mfctr'.
+	#define	kMFCTRMask			0xfc1fffff
+	#define	kMFCTRInstruction	0x7c0903a6
+	
+	long	originalInstruction = *originalFunctionPtr;
+	if( !err && ((originalInstruction & kMFCTRMask) == kMFCTRInstruction) )
+		err = err_cannot_override;
+#elif defined(__i386__) || defined(__x86_64__)
+	int eatenCount = 0;
+	int originalInstructionCount = 0;
+	char originalInstructions[kOriginalInstructionsSize];
+	uint8_t originalInstructionSizes[kOriginalInstructionsSize];
+	uint64_t jumpRelativeInstruction = 0; // JMP
+
+	Boolean overridePossible = eatKnownInstructions ((unsigned char *)originalFunctionPtr, 
+										&jumpRelativeInstruction, &eatenCount, 
+										originalInstructions, &originalInstructionCount, 
+										originalInstructionSizes );
+#ifdef DEBUG_DISASM
+  if (!overridePossible) fprintf(stderr, "overridePossible = false @%d\n", __LINE__);
+#endif
+	if (eatenCount > kOriginalInstructionsSize) {
+#ifdef DEBUG_DISASM
+		fprintf(stderr, "Too many instructions eaten\n");
+#endif    
+		overridePossible = false;
+	}
+	if (!overridePossible) err = err_cannot_override;
+	if (err) fprintf(stderr, "err = %x %s:%d\n", err, __FILE__, __LINE__);
+#endif
+	
+	//	Make the original function implementation writable.
+	if( !err ) {
+		err = vm_protect( mach_task_self(),
+				(vm_address_t) originalFunctionPtr, 8, false,
+				(VM_PROT_ALL | VM_PROT_COPY) );
+		if( err )
+			err = vm_protect( mach_task_self(),
+					(vm_address_t) originalFunctionPtr, 8, false,
+					(VM_PROT_DEFAULT | VM_PROT_COPY) );
+	}
+	if (err) fprintf(stderr, "err = %x %s:%d\n", err, __FILE__, __LINE__);
+	
+	//	Allocate and target the escape island to the overriding function.
+	BranchIsland	*escapeIsland = NULL;
+	if( !err )
+		err = alloc( (void**)&escapeIsland, sizeof(BranchIsland), originalFunctionAddress );
+	if ( err ) fprintf(stderr, "err = %x %s:%d\n", err, __FILE__, __LINE__);
+	
+#if defined(__ppc__) || defined(__POWERPC__)
+	if( !err )
+		err = setBranchIslandTarget( escapeIsland, overrideFunctionAddress, 0 );
+	
+	//	Build the branch absolute instruction to the escape island.
+	long	branchAbsoluteInstruction = 0; // Set to 0 just to silence warning.
+	if( !err ) {
+		long escapeIslandAddress = ((long) escapeIsland) & 0x3FFFFFF;
+		branchAbsoluteInstruction = 0x48000002 | escapeIslandAddress;
+	}
+#elif defined(__i386__) || defined(__x86_64__)
+        if (err) fprintf(stderr, "err = %x %s:%d\n", err, __FILE__, __LINE__);
+
+	if( !err )
+		err = setBranchIslandTarget_i386( escapeIsland, overrideFunctionAddress, 0 );
+ 
+	if (err) fprintf(stderr, "err = %x %s:%d\n", err, __FILE__, __LINE__);
+	// Build the jump relative instruction to the escape island
+#endif
+
+
+#if defined(__i386__) || defined(__x86_64__)
+	if (!err) {
+		uint32_t addressOffset = ((char*)escapeIsland - (char*)originalFunctionPtr - 5);
+		addressOffset = OSSwapInt32(addressOffset);
+		
+		jumpRelativeInstruction |= 0xE900000000000000LL; 
+		jumpRelativeInstruction |= ((uint64_t)addressOffset & 0xffffffff) << 24;
+		jumpRelativeInstruction = OSSwapInt64(jumpRelativeInstruction);		
+	}
+#endif
+	
+	//	Optionally allocate & return the reentry island. This may contain relocated
+	//  jmp instructions and so has all the same addressing reachability requirements
+	//  the escape island has to the original function, except the escape island is
+	//  technically our original function.
+	BranchIsland	*reentryIsland = NULL;
+	if( !err && originalFunctionReentryIsland ) {
+		err = alloc( (void**)&reentryIsland, sizeof(BranchIsland), escapeIsland);
+		if( !err )
+			*originalFunctionReentryIsland = reentryIsland;
+	}
+	
+#if defined(__ppc__) || defined(__POWERPC__)	
+	//	Atomically:
+	//	o If the reentry island was allocated:
+	//		o Insert the original instruction into the reentry island.
+	//		o Target the reentry island at the 2nd instruction of the
+	//		  original function.
+	//	o Replace the original instruction with the branch absolute.
+	if( !err ) {
+		int escapeIslandEngaged = false;
+		do {
+			if( reentryIsland )
+				err = setBranchIslandTarget( reentryIsland,
+						(void*) (originalFunctionPtr+1), originalInstruction );
+			if( !err ) {
+				escapeIslandEngaged = CompareAndSwap( originalInstruction,
+										branchAbsoluteInstruction,
+										(UInt32*)originalFunctionPtr );
+				if( !escapeIslandEngaged ) {
+					//	Someone replaced the instruction out from under us,
+					//	re-read the instruction, make sure it's still not
+					//	'mfctr' and try again.
+					originalInstruction = *originalFunctionPtr;
+					if( (originalInstruction & kMFCTRMask) == kMFCTRInstruction)
+						err = err_cannot_override;
+				}
+			}
+		} while( !err && !escapeIslandEngaged );
+	}
+#elif defined(__i386__) || defined(__x86_64__)
+	// Atomically:
+	//	o If the reentry island was allocated:
+	//		o Insert the original instructions into the reentry island.
+	//		o Target the reentry island at the first non-replaced 
+	//        instruction of the original function.
+	//	o Replace the original first instructions with the jump relative.
+	//
+	// Note that on i386, we do not support someone else changing the code under our feet
+	if ( !err ) {
+		fixupInstructions(originalFunctionPtr, reentryIsland, originalInstructions,
+					originalInstructionCount, originalInstructionSizes );
+	
+		if( reentryIsland )
+			err = setBranchIslandTarget_i386( reentryIsland,
+										 (void*) ((char *)originalFunctionPtr+eatenCount), originalInstructions );
+		// try making islands executable before planting the jmp
+#if defined(__x86_64__) || defined(__i386__)
+        if( !err )
+            err = makeIslandExecutable(escapeIsland);
+        if( !err && reentryIsland )
+            err = makeIslandExecutable(reentryIsland);
+#endif
+		if ( !err )
+			atomic_mov64((uint64_t *)originalFunctionPtr, jumpRelativeInstruction);
+	}
+#endif
+	
+	//	Clean up on error.
+	if( err ) {
+		if( reentryIsland )
+			dealloc( reentryIsland );
+		if( escapeIsland )
+			dealloc( escapeIsland );
+	}
+
+#ifdef DEBUG_DISASM
+  {
+    fprintf(stderr, "First 16 bytes of the function after slicing: ");
+    unsigned char *orig = (unsigned char *)originalFunctionAddress;
+    int i;
+    for (i = 0; i < 16; i++) {
+       fprintf(stderr, "%x ", (unsigned int) orig[i]);
+    }
+    fprintf(stderr, "\n");
+  }
+#endif
+	return err;
+}
+
+/*******************************************************************************
+*	
+*	Implementation
+*	
+*******************************************************************************/
+#pragma mark	-
+#pragma mark	(Implementation)
+
+/***************************************************************************//**
+	Implementation: Allocates memory for a branch island.
+	
+	@param	island			<-	The allocated island.
+	@param	allocateHigh	->	Whether to allocate the island at the end of the
+								address space (for use with the branch absolute
+								instruction).
+	@result					<-	mach_error_t
+
+	***************************************************************************/
+
+	static mach_error_t
+allocateBranchIsland(
+		BranchIsland	**island,
+		int				allocateHigh,
+		void *originalFunctionAddress)
+{
+	assert( island );
+	
+	mach_error_t	err = err_none;
+	
+	if( allocateHigh ) {
+		vm_size_t pageSize;
+		err = host_page_size( mach_host_self(), &pageSize );
+		if( !err ) {
+			assert( sizeof( BranchIsland ) <= pageSize );
+#if defined(__ppc__) || defined(__POWERPC__)
+			vm_address_t first = 0xfeffffff;
+			vm_address_t last = 0xfe000000 + pageSize;
+#elif defined(__x86_64__)
+			vm_address_t first = ((uint64_t)originalFunctionAddress & ~(uint64_t)(((uint64_t)1 << 31) - 1)) | ((uint64_t)1 << 31); // start in the middle of the page?
+			vm_address_t last = 0x0;
+#else
+			vm_address_t first = 0xffc00000;
+			vm_address_t last = 0xfffe0000;
+#endif
+
+			vm_address_t page = first;
+			int allocated = 0;
+			vm_map_t task_self = mach_task_self();
+			
+			while( !err && !allocated && page != last ) {
+
+				err = vm_allocate( task_self, &page, pageSize, 0 );
+				if( err == err_none )
+					allocated = 1;
+				else if( err == KERN_NO_SPACE ) {
+#if defined(__x86_64__)
+					page -= pageSize;
+#else
+					page += pageSize;
+#endif
+					err = err_none;
+				}
+			}
+			if( allocated )
+				*island = (BranchIsland*) page;
+			else if( !allocated && !err )
+				err = KERN_NO_SPACE;
+		}
+	} else {
+		void *block = malloc( sizeof( BranchIsland ) );
+		if( block )
+			*island = block;
+		else
+			err = KERN_NO_SPACE;
+	}
+	if( !err )
+		(**island).allocatedHigh = allocateHigh;
+	
+	return err;
+}
+
+/***************************************************************************//**
+	Implementation: Deallocates memory for a branch island.
+	
+	@param	island	->	The island to deallocate.
+	@result			<-	mach_error_t
+
+	***************************************************************************/
+
+	static mach_error_t
+freeBranchIsland(
+		BranchIsland	*island )
+{
+	assert( island );
+	assert( (*(long*)&island->instructions[0]) == kIslandTemplate[0] );
+	assert( island->allocatedHigh );
+	
+	mach_error_t	err = err_none;
+	
+	if( island->allocatedHigh ) {
+		vm_size_t pageSize;
+		err = host_page_size( mach_host_self(), &pageSize );
+		if( !err ) {
+			assert( sizeof( BranchIsland ) <= pageSize );
+			err = vm_deallocate(
+					mach_task_self(),
+					(vm_address_t) island, pageSize );
+		}
+	} else {
+		free( island );
+	}
+	
+	return err;
+}
+
+/***************************************************************************//**
+	Implementation: Sets the branch island's target, with an optional
+	instruction.
+	
+	@param	island		->	The branch island to insert target into.
+	@param	branchTo	->	The address of the target.
+	@param	instruction	->	Optional instruction to execute prior to branch. Set
+							to zero for nop.
+	@result				<-	mach_error_t
+
+	***************************************************************************/
+#if defined(__ppc__) || defined(__POWERPC__)
+	static mach_error_t
+setBranchIslandTarget(
+		BranchIsland	*island,
+		const void		*branchTo,
+		long			instruction )
+{
+	//	Copy over the template code.
+    bcopy( kIslandTemplate, island->instructions, sizeof( kIslandTemplate ) );
+    
+    //	Fill in the address.
+    ((short*)island->instructions)[kAddressLo] = ((long) branchTo) & 0x0000FFFF;
+    ((short*)island->instructions)[kAddressHi]
+    	= (((long) branchTo) >> 16) & 0x0000FFFF;
+    
+    //	Fill in the (optional) instruction.
+    if( instruction != 0 ) {
+        ((short*)island->instructions)[kInstructionLo]
+        	= instruction & 0x0000FFFF;
+        ((short*)island->instructions)[kInstructionHi]
+        	= (instruction >> 16) & 0x0000FFFF;
+    }
+    
+    //MakeDataExecutable( island->instructions, sizeof( kIslandTemplate ) );
+	msync( island->instructions, sizeof( kIslandTemplate ), MS_INVALIDATE );
+    
+    return err_none;
+}
+#endif 
+
+#if defined(__i386__)
+	static mach_error_t
+setBranchIslandTarget_i386(
+	BranchIsland	*island,
+	const void		*branchTo,
+	char*			instructions )
+{
+
+	//	Copy over the template code.
+    bcopy( kIslandTemplate, island->instructions, sizeof( kIslandTemplate ) );
+
+	// copy original instructions
+	if (instructions) {
+		bcopy (instructions, island->instructions + kInstructions, kOriginalInstructionsSize);
+	}
+	
+    // Fill in the address.
+    int32_t addressOffset = (char *)branchTo - (island->instructions + kJumpAddress + 4);
+    *((int32_t *)(island->instructions + kJumpAddress)) = addressOffset; 
+
+    msync( island->instructions, sizeof( kIslandTemplate ), MS_INVALIDATE );
+    return err_none;
+}
+
+#elif defined(__x86_64__)
+static mach_error_t
+setBranchIslandTarget_i386(
+        BranchIsland	*island,
+        const void		*branchTo,
+        char*			instructions )
+{
+    // Copy over the template code.
+    bcopy( kIslandTemplate, island->instructions, sizeof( kIslandTemplate ) );
+
+    // Copy original instructions.
+    if (instructions) {
+        bcopy (instructions, island->instructions, kOriginalInstructionsSize);
+    }
+
+    //	Fill in the address.
+    *((uint64_t *)(island->instructions + kJumpAddress)) = (uint64_t)branchTo; 
+    msync( island->instructions, sizeof( kIslandTemplate ), MS_INVALIDATE );
+
+    return err_none;
+}
+#endif
+
+
+#if defined(__i386__) || defined(__x86_64__)
+// simplistic instruction matching
+typedef struct {
+	unsigned int length; // max 15
+	unsigned char mask[15]; // sequence of bytes in memory order
+	unsigned char constraint[15]; // sequence of bytes in memory order
+}	AsmInstructionMatch;
+
+#if defined(__i386__)
+static AsmInstructionMatch possibleInstructions[] = {
+	{ 0x5, {0xFF, 0x00, 0x00, 0x00, 0x00}, {0xE9, 0x00, 0x00, 0x00, 0x00} },	// jmp 0x????????
+	{ 0x5, {0xFF, 0xFF, 0xFF, 0xFF, 0xFF}, {0x55, 0x89, 0xe5, 0xc9, 0xc3} },	// push %ebp; mov %esp,%ebp; leave; ret
+	{ 0x1, {0xFF}, {0x90} },							// nop
+	{ 0x1, {0xF8}, {0x50} },							// push %reg
+	{ 0x2, {0xFF, 0xFF}, {0x89, 0xE5} },				                // mov %esp,%ebp
+	{ 0x3, {0xFF, 0xFF, 0xFF}, {0x89, 0x1C, 0x24} },				                // mov %ebx,(%esp)
+	{ 0x3, {0xFF, 0xFF, 0x00}, {0x83, 0xEC, 0x00} },	                        // sub 0x??, %esp
+	{ 0x6, {0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00}, {0x81, 0xEC, 0x00, 0x00, 0x00, 0x00} },	// sub 0x??, %esp with 32bit immediate
+	{ 0x2, {0xFF, 0xFF}, {0x31, 0xC0} },						// xor %eax, %eax
+	{ 0x3, {0xFF, 0x4F, 0x00}, {0x8B, 0x45, 0x00} },  // mov $imm(%ebp), %reg
+	{ 0x3, {0xFF, 0x4C, 0x00}, {0x8B, 0x40, 0x00} },  // mov $imm(%eax-%edx), %reg
+	{ 0x3, {0xFF, 0xCF, 0x00}, {0x8B, 0x4D, 0x00} },  // mov $imm(%ebp), %reg
+	{ 0x3, {0xFF, 0x4F, 0x00}, {0x8A, 0x4D, 0x00} },  // mov $imm(%ebp), %cl
+	{ 0x4, {0xFF, 0xFF, 0xFF, 0x00}, {0x8B, 0x4C, 0x24, 0x00} },  			// mov $imm(%esp), %ecx
+	{ 0x4, {0xFF, 0x00, 0x00, 0x00}, {0x8B, 0x00, 0x00, 0x00} },  			// mov r16,r/m16 or r32,r/m32
+	{ 0x5, {0xFF, 0x00, 0x00, 0x00, 0x00}, {0xB9, 0x00, 0x00, 0x00, 0x00} }, 	// mov $imm, %ecx
+	{ 0x5, {0xFF, 0x00, 0x00, 0x00, 0x00}, {0xB8, 0x00, 0x00, 0x00, 0x00} }, 	// mov $imm, %eax
+	{ 0x4, {0xFF, 0xFF, 0xFF, 0x00}, {0x66, 0x0F, 0xEF, 0x00} },             	// pxor xmm2/128, xmm1
+	{ 0x2, {0xFF, 0xFF}, {0xDB, 0xE3} }, 						// fninit
+	{ 0x5, {0xFF, 0x00, 0x00, 0x00, 0x00}, {0xE8, 0x00, 0x00, 0x00, 0x00} },	// call $imm
+	{ 0x4, {0xFF, 0xFF, 0xFF, 0x00}, {0x0F, 0xBE, 0x55, 0x00} },                    // movsbl $imm(%ebp), %edx
+	{ 0x0, {0x00}, {0x00} }
+};
+#elif defined(__x86_64__)
+// TODO(glider): disassembling the "0x48, 0x89" sequences is trickier than it's done below.
+// If it stops working, refer to http://ref.x86asm.net/geek.html#modrm_byte_32_64 to do it
+// more accurately.
+// Note: 0x48 is in fact the REX.W prefix, but it might be wrong to treat it as a separate
+// instruction.
+static AsmInstructionMatch possibleInstructions[] = {
+	{ 0x5, {0xFF, 0x00, 0x00, 0x00, 0x00}, {0xE9, 0x00, 0x00, 0x00, 0x00} },	// jmp 0x????????
+	{ 0x1, {0xFF}, {0x90} },							// nop
+	{ 0x1, {0xF8}, {0x50} },							// push %rX
+	{ 0x1, {0xFF}, {0x65} },							// GS prefix
+	{ 0x3, {0xFF, 0xFF, 0xFF}, {0x48, 0x89, 0xE5} },				// mov %rsp,%rbp
+	{ 0x4, {0xFF, 0xFF, 0xFF, 0x00}, {0x48, 0x83, 0xEC, 0x00} },	                // sub 0x??, %rsp
+	{ 0x4, {0xFB, 0xFF, 0x07, 0x00}, {0x48, 0x89, 0x05, 0x00} },	                // move onto rbp
+	{ 0x3, {0xFB, 0xFF, 0x00}, {0x48, 0x89, 0x00} },	                            // mov %reg, %reg
+	{ 0x3, {0xFB, 0xFF, 0x00}, {0x49, 0x89, 0x00} },	                            // mov %reg, %reg (REX.WB)
+	{ 0x2, {0xFF, 0x00}, {0x41, 0x00} },						// push %rXX
+	{ 0x2, {0xFF, 0x00}, {0x84, 0x00} },						// test %rX8,%rX8
+	{ 0x2, {0xFF, 0x00}, {0x85, 0x00} },						// test %rX,%rX
+	{ 0x2, {0xFF, 0x00}, {0x77, 0x00} },						// ja $i8
+	{ 0x2, {0xFF, 0x00}, {0x74, 0x00} },						// je $i8
+	{ 0x5, {0xF8, 0x00, 0x00, 0x00, 0x00}, {0xB8, 0x00, 0x00, 0x00, 0x00} },	// mov $imm, %reg
+	{ 0x3, {0xFF, 0xFF, 0x00}, {0xFF, 0x77, 0x00} },				// pushq $imm(%rdi)
+	{ 0x2, {0xFF, 0xFF}, {0x31, 0xC0} },						// xor %eax, %eax
+	{ 0x5, {0xFF, 0x00, 0x00, 0x00, 0x00}, {0x25, 0x00, 0x00, 0x00, 0x00} },	// and $imm, %eax
+	{ 0x3, {0xFF, 0xFF, 0xFF}, {0x80, 0x3F, 0x00} },				// cmpb $imm, (%rdi)
+
+  { 0x8, {0xFF, 0xFF, 0xCF, 0xFF, 0x00, 0x00, 0x00, 0x00},
+         {0x48, 0x8B, 0x04, 0x25, 0x00, 0x00, 0x00, 0x00}, },                     // mov $imm, %{rax,rdx,rsp,rsi}
+  { 0x4, {0xFF, 0xFF, 0xFF, 0x00}, {0x48, 0x83, 0xFA, 0x00}, },   // cmp $i8, %rdx
+	{ 0x4, {0xFF, 0xFF, 0x00, 0x00}, {0x83, 0x7f, 0x00, 0x00}, },			// cmpl $imm, $imm(%rdi)
+	{ 0xa, {0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
+               {0x48, 0xB8, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00} },    // mov $imm, %rax
+        { 0x6, {0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00},
+               {0x81, 0xE6, 0x00, 0x00, 0x00, 0x00} },                            // and $imm, %esi
+        { 0x6, {0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00},
+               {0xFF, 0x25, 0x00, 0x00, 0x00, 0x00} },                            // jmpq *(%rip)
+        { 0x4, {0xFF, 0xFF, 0xFF, 0x00}, {0x66, 0x0F, 0xEF, 0x00} },              // pxor xmm2/128, xmm1
+        { 0x2, {0xFF, 0x00}, {0x89, 0x00} },                               // mov r/m32,r32 or r/m16,r16
+        { 0x3, {0xFF, 0xFF, 0xFF}, {0x49, 0x89, 0xF8} },                   // mov %rdi,%r8
+        { 0x4, {0xFF, 0xFF, 0xFF, 0xFF}, {0x40, 0x0F, 0xBE, 0xCE} },       // movsbl %sil,%ecx
+        { 0x7, {0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00},
+               {0x48, 0x8D, 0x05, 0x00, 0x00, 0x00, 0x00} },  // lea $imm(%rip),%rax
+        { 0x3, {0xFF, 0xFF, 0xFF}, {0x0F, 0xBE, 0xCE} },  // movsbl %dh,%ecx
+        { 0x3, {0xFF, 0xFF, 0x00}, {0xFF, 0x77, 0x00} },  // pushq $imm(%rdi)
+        { 0x2, {0xFF, 0xFF}, {0xDB, 0xE3} }, // fninit
+        { 0x3, {0xFF, 0xFF, 0xFF}, {0x48, 0x85, 0xD2} },  // test %rdx,%rdx
+	{ 0x0, {0x00}, {0x00} }
+};
+#endif
+
+static Boolean codeMatchesInstruction(unsigned char *code, AsmInstructionMatch* instruction) 
+{
+	Boolean match = true;
+	
+	size_t i;
+  assert(instruction);
+#ifdef DEBUG_DISASM
+	fprintf(stderr, "Matching: ");
+#endif  
+	for (i=0; i<instruction->length; i++) {
+		unsigned char mask = instruction->mask[i];
+		unsigned char constraint = instruction->constraint[i];
+		unsigned char codeValue = code[i];
+#ifdef DEBUG_DISASM
+		fprintf(stderr, "%x ", (unsigned)codeValue);
+#endif    
+		match = ((codeValue & mask) == constraint);
+		if (!match) break;
+	}
+#ifdef DEBUG_DISASM
+	if (match) {
+		fprintf(stderr, " OK\n");
+	} else {
+		fprintf(stderr, " FAIL\n");
+	}
+#endif  
+	return match;
+}
+
+#if defined(__i386__) || defined(__x86_64__)
+	static Boolean 
+eatKnownInstructions( 
+	unsigned char	*code, 
+	uint64_t		*newInstruction,
+	int				*howManyEaten, 
+	char			*originalInstructions,
+	int				*originalInstructionCount, 
+	uint8_t			*originalInstructionSizes )
+{
+	Boolean allInstructionsKnown = true;
+	int totalEaten = 0;
+	unsigned char* ptr = code;
+	int remainsToEat = 5; // a JMP instruction takes 5 bytes
+	int instructionIndex = 0;
+	
+	if (howManyEaten) *howManyEaten = 0;
+	if (originalInstructionCount) *originalInstructionCount = 0;
+	while (remainsToEat > 0) {
+		Boolean curInstructionKnown = false;
+		
+		// See if instruction matches one we know
+		AsmInstructionMatch* curInstr = possibleInstructions;
+		do { 
+			if ((curInstructionKnown = codeMatchesInstruction(ptr, curInstr))) break;
+			curInstr++;
+		} while (curInstr->length > 0);
+		
+		// If no pattern matched, the current instruction is unknown; stop here.
+		if (!curInstructionKnown) { 
+			allInstructionsKnown = false;
+			fprintf(stderr, "mach_override: some instructions unknown! Need to update mach_override.c\n");
+			break;
+		}
+		
+		// At this point, we've matched curInstr
+		int eaten = curInstr->length;
+		ptr += eaten;
+		remainsToEat -= eaten;
+		totalEaten += eaten;
+		
+		if (originalInstructionSizes) originalInstructionSizes[instructionIndex] = eaten;
+		instructionIndex += 1;
+		if (originalInstructionCount) *originalInstructionCount = instructionIndex;
+	}
+
+
+	if (howManyEaten) *howManyEaten = totalEaten;
+
+	if (originalInstructions) {
+		Boolean enoughSpaceForOriginalInstructions = (totalEaten < kOriginalInstructionsSize);
+		
+		if (enoughSpaceForOriginalInstructions) {
+			memset(originalInstructions, 0x90 /* NOP */, kOriginalInstructionsSize); // fill instructions with NOP
+			bcopy(code, originalInstructions, totalEaten);
+		} else {
+#ifdef DEBUG_DISASM
+			fprintf(stderr, "Not enough space in island to store original instructions. Adapt the island definition and kOriginalInstructionsSize\n");
+#endif      
+			return false;
+		}
+	}
+	
+	if (allInstructionsKnown) {
+		// save last 3 bytes of first 64bits of code we'll replace
+		uint64_t currentFirst64BitsOfCode = *((uint64_t *)code);
+		currentFirst64BitsOfCode = OSSwapInt64(currentFirst64BitsOfCode); // back to memory representation
+		currentFirst64BitsOfCode &= 0x0000000000FFFFFFLL; 
+		
+		// keep only last 3 instruction bytes, first 5 will be replaced by JMP instr
+		*newInstruction &= 0xFFFFFFFFFF000000LL; // clear last 3 bytes
+		*newInstruction |= (currentFirst64BitsOfCode & 0x0000000000FFFFFFLL); // set last 3 bytes
+	}
+
+	return allInstructionsKnown;
+}
+
+	static void
+fixupInstructions(
+    void		*originalFunction,
+    void		*escapeIsland,
+    void		*instructionsToFix,
+	int			instructionCount,
+	uint8_t		*instructionSizes )
+{
+	void *initialOriginalFunction = originalFunction;
+	int	index, fixed_size, code_size = 0;
+	for (index = 0;index < instructionCount;index += 1)
+		code_size += instructionSizes[index];
+
+#ifdef DEBUG_DISASM
+	void *initialInstructionsToFix = instructionsToFix;
+	fprintf(stderr, "BEFORE FIXING:\n");
+	dump16Bytes(initialOriginalFunction);
+	dump16Bytes(initialInstructionsToFix);
+#endif  // DEBUG_DISASM
+
+	for (index = 0;index < instructionCount;index += 1)
+	{
+                fixed_size = instructionSizes[index];
+		if ((*(uint8_t*)instructionsToFix == 0xE9) || // 32-bit jump relative
+		    (*(uint8_t*)instructionsToFix == 0xE8))   // 32-bit call relative
+		{
+			uint32_t offset = (uintptr_t)originalFunction - (uintptr_t)escapeIsland;
+			uint32_t *jumpOffsetPtr = (uint32_t*)((uintptr_t)instructionsToFix + 1);
+			*jumpOffsetPtr += offset;
+		}
+		if ((*(uint8_t*)instructionsToFix == 0x74) ||  // Near jump if equal (je), 2 bytes.
+		    (*(uint8_t*)instructionsToFix == 0x77))    // Near jump if above (ja), 2 bytes.
+		{
+			// We replace a near je/ja instruction, "7P JJ", with a 32-bit je/ja, "0F 8P WW XX YY ZZ".
+			// This is critical, otherwise a near jump will likely fall outside the original function.
+			uint32_t offset = (uintptr_t)initialOriginalFunction - (uintptr_t)escapeIsland;
+			uint32_t jumpOffset = *(uint8_t*)((uintptr_t)instructionsToFix + 1);
+			*((uint8_t*)instructionsToFix + 1) = *(uint8_t*)instructionsToFix + 0x10;
+			*(uint8_t*)instructionsToFix = 0x0F;
+			uint32_t *jumpOffsetPtr = (uint32_t*)((uintptr_t)instructionsToFix + 2 );
+			*jumpOffsetPtr = offset + jumpOffset;
+			fixed_size = 6;
+                }
+		
+		originalFunction = (void*)((uintptr_t)originalFunction + instructionSizes[index]);
+		escapeIsland = (void*)((uintptr_t)escapeIsland + instructionSizes[index]);
+		instructionsToFix = (void*)((uintptr_t)instructionsToFix + fixed_size);
+
+		// Expanding short instructions into longer ones may overwrite the next instructions,
+		// so we must restore them.
+		code_size -= fixed_size;
+		if ((code_size > 0) && (fixed_size != instructionSizes[index])) {
+			bcopy(originalFunction, instructionsToFix, code_size);
+		}
+	}
+#ifdef DEBUG_DISASM
+	fprintf(stderr, "AFTER_FIXING:\n");
+	dump16Bytes(initialOriginalFunction);
+	dump16Bytes(initialInstructionsToFix);
+#endif  // DEBUG_DISASM
+}
+
+#ifdef DEBUG_DISASM
+#define HEX_DIGIT(x) ((((x) % 16) < 10) ? ('0' + ((x) % 16)) : ('A' + ((x) % 16 - 10)))
+
+	static void
+dump16Bytes(
+	void 	*ptr) {
+	int i;
+	char buf[3];
+	uint8_t *bytes = (uint8_t*)ptr;
+	for (i = 0; i < 16; i++) {
+		buf[0] = HEX_DIGIT(bytes[i] / 16);
+		buf[1] = HEX_DIGIT(bytes[i] % 16);
+		buf[2] = ' ';
+		write(2, buf, 3);
+	}
+	write(2, "\n", 1);
+}
+#endif  // DEBUG_DISASM
+#endif
+
+#if defined(__i386__)
+__asm(
+			".text;"
+			".align 2, 0x90;"
+			"_atomic_mov64:;"
+			"	pushl %ebp;"
+			"	movl %esp, %ebp;"
+			"	pushl %esi;"
+			"	pushl %ebx;"
+			"	pushl %ecx;"
+			"	pushl %eax;"
+			"	pushl %edx;"
+	
+			// atomic push of value to an address
+			// we use cmpxchg8b, which compares content of an address with 
+			// edx:eax. If they are equal, it atomically puts 64bit value 
+			// ecx:ebx in address. 
+			// We thus put contents of address in edx:eax to force ecx:ebx
+			// in address
+			"	mov		8(%ebp), %esi;"  // esi contains target address
+			"	mov		12(%ebp), %ebx;"
+			"	mov		16(%ebp), %ecx;" // ecx:ebx now contains value to put in target address
+			"	mov		(%esi), %eax;"
+			"	mov		4(%esi), %edx;"  // edx:eax now contains value currently contained in target address
+			"	lock; cmpxchg8b	(%esi);" // atomic move.
+			
+			// restore registers
+			"	popl %edx;"
+			"	popl %eax;"
+			"	popl %ecx;"
+			"	popl %ebx;"
+			"	popl %esi;"
+			"	popl %ebp;"
+			"	ret"
+);
+#elif defined(__x86_64__)
+void atomic_mov64(
+		uint64_t *targetAddress,
+		uint64_t value )
+{
+    *targetAddress = value;
+}
+#endif
+#endif
+#endif  // __APPLE__
--- /dev/null	2012-11-16 10:24:58.000000000 -0500
+++ libsanitizer/interception/mach_override/mach_override.h	2012-11-16 10:26:52.000000000 -0500
@@ -0,0 +1,140 @@
+/*******************************************************************************
+	mach_override.h
+		Copyright (c) 2003-2009 Jonathan 'Wolf' Rentzsch: <http://rentzsch.com>
+		Some rights reserved: <http://opensource.org/licenses/mit-license.php>
+
+	***************************************************************************/
+
+/***************************************************************************//**
+	@mainpage	mach_override
+	@author		Jonathan 'Wolf' Rentzsch: <http://rentzsch.com>
+	
+	This package, coded in C to the Mach API, allows you to override ("patch")
+	program- and system-supplied functions at runtime. You can fully replace
+	functions with your implementations, or merely head- or tail-patch the
+	original implementations.
+	
+	Use it by #include'ing mach_override.h from your .c, .m or .mm file(s).
+	
+	@todo	Discontinue use of Carbon's MakeDataExecutable() and
+			CompareAndSwap() calls and start using the Mach equivalents, if they
+			exist. If they don't, write them and roll them in. That way, this
+			code will be pure Mach, which will make it easier to use everywhere.
+			Update: MakeDataExecutable() has been replaced by
+			msync(MS_INVALIDATE). There is an OSCompareAndSwap in libkern, but
+			I'm currently unsure if I can link against it. May have to roll in
+			my own version...
+	@todo	Stop using an entire 4K high-allocated VM page per 28-byte escape
+			branch island. Done right, this will dramatically speed up escape
+			island allocations when they number over 250. Then again, if you're
+			overriding more than 250 functions, maybe speed isn't your main
+			concern...
+	@todo	Add detection of: b, bl, bla, bc, bcl, bcla, bcctrl, bclrl
+			first-instructions. Initially, we should refuse to override
+			functions beginning with these instructions. Eventually, we should
+			dynamically rewrite them to make them position-independent.
+	@todo	Write mach_unoverride(), which would remove an override placed on a
+			function. Must be multiple-override aware, which means an almost
+			complete rewrite under the covers, because the target address can't
+			be spread across two load instructions like it is now since it will
+			need to be atomically updatable.
+	@todo	Add non-reentry variants of overrides to test_mach_override.
+
+	***************************************************************************/
+
+#ifdef __APPLE__
+
+#ifndef		_mach_override_
+#define		_mach_override_
+
+#include <sys/types.h>
+#include <mach/error.h>
+
+#ifdef	__cplusplus
+	extern	"C"	{
+#endif
+
+/**
+	Returned if the function to be overridden begins with a 'mfctr' instruction.
+*/
+#define	err_cannot_override	(err_local|1)
+
+/************************************************************************************//**
+	Dynamically overrides the function implementation referenced by
+	originalFunctionAddress with the implementation pointed to by overrideFunctionAddress.
+	Optionally returns a pointer to a "reentry island" which, if jumped to, will resume
+	the original implementation.
+	
+	@param	originalFunctionAddress			->	Required address of the function to
+												override (with overrideFunctionAddress).
+	@param	overrideFunctionAddress			->	Required address to the overriding
+												function.
+	@param	originalFunctionReentryIsland	<-	Optional pointer to pointer to the
+												reentry island. Can be NULL.
+	@result									<-	err_cannot_override if the original
+												function's implementation begins with
+												the 'mfctr' instruction.
+
+	************************************************************************************/
+
+// We're prefixing mach_override_ptr() with "__asan_" to avoid name conflicts with other
+// mach_override_ptr() implementations that may appear in the client program.
+    mach_error_t
+__asan_mach_override_ptr(
+	void *originalFunctionAddress,
+    const void *overrideFunctionAddress,
+    void **originalFunctionReentryIsland );
+
+// Allow custom allocation and deallocation routines to be used with mach_override_ptr().
+// This should help speed things up on x86_64.
+typedef mach_error_t island_malloc( void **ptr, size_t size, void *hint );
+typedef mach_error_t island_free( void *ptr );
+
+    mach_error_t
+__asan_mach_override_ptr_custom(
+	void *originalFunctionAddress,
+    const void *overrideFunctionAddress,
+    void **originalFunctionReentryIsland,
+    island_malloc *alloc,
+    island_free *dealloc );
+
+/************************************************************************************//**
+	
+
+	************************************************************************************/
+ 
+#ifdef	__cplusplus
+
+#define MACH_OVERRIDE( ORIGINAL_FUNCTION_RETURN_TYPE, ORIGINAL_FUNCTION_NAME, ORIGINAL_FUNCTION_ARGS, ERR )			\
+	{																												\
+		static ORIGINAL_FUNCTION_RETURN_TYPE (*ORIGINAL_FUNCTION_NAME##_reenter)ORIGINAL_FUNCTION_ARGS;				\
+		static bool ORIGINAL_FUNCTION_NAME##_overriden = false;														\
+		class mach_override_class__##ORIGINAL_FUNCTION_NAME {														\
+		public:																										\
+			static kern_return_t override(void *originalFunctionPtr) {												\
+				kern_return_t result = err_none;																	\
+				if (!ORIGINAL_FUNCTION_NAME##_overriden) {															\
+					ORIGINAL_FUNCTION_NAME##_overriden = true;														\
+					result = mach_override_ptr( (void*)originalFunctionPtr,											\
+												(void*)mach_override_class__##ORIGINAL_FUNCTION_NAME::replacement,	\
+												(void**)&ORIGINAL_FUNCTION_NAME##_reenter );						\
+				}																									\
+				return result;																						\
+			}																										\
+			static ORIGINAL_FUNCTION_RETURN_TYPE replacement ORIGINAL_FUNCTION_ARGS {
+
+#define END_MACH_OVERRIDE( ORIGINAL_FUNCTION_NAME )																	\
+			}																										\
+		};																											\
+																													\
+		err = mach_override_class__##ORIGINAL_FUNCTION_NAME::override((void*)ORIGINAL_FUNCTION_NAME);				\
+	}
+ 
+#endif
+
+#ifdef	__cplusplus
+	}
+#endif
+#endif	//	_mach_override_
+
+#endif  // __APPLE__
Index: libsanitizer/configure.ac
===================================================================
--- libsanitizer/configure.ac	(revision 193562)
+++ libsanitizer/configure.ac	(working copy)
@@ -22,6 +22,12 @@ AC_CANONICAL_SYSTEM
 target_alias=${target_alias-$host_alias}
 AC_SUBST(target_alias)
 
+case "$host" in
+  *-*-darwin*) MACH_OVERRIDE=true ;;
+  *) MACH_OVERRIDE=false ;;
+esac
+AM_CONDITIONAL(USING_MACH_OVERRIDE, $MACH_OVERRIDE)
+
 AM_INIT_AUTOMAKE(foreign)
 AM_ENABLE_MULTILIB(, ..)
 AM_MAINTAINER_MODE
Index: libsanitizer/interception/Makefile.am
===================================================================
--- libsanitizer/interception/Makefile.am	(revision 193562)
+++ libsanitizer/interception/Makefile.am	(working copy)
@@ -14,7 +14,11 @@ interception_files = \
         interception_mac.cc \
         interception_win.cc
 
-libinterception_la_SOURCES = $(interception_files) 
+if USING_MACH_OVERRIDE
+libinterception_la_SOURCES = $(interception_files) mach_override/mach_override.c
+else
+libinterception_la_SOURCES = $(interception_files)
+endif
 
 # Work around what appears to be a GNU make bug handling MAKEFLAGS
 # values defined in terms of make variables, as is the case for CC and
Index: libsanitizer/configure.tgt
===================================================================
--- libsanitizer/configure.tgt	(revision 193562)
+++ libsanitizer/configure.tgt	(working copy)
@@ -22,6 +22,8 @@
 case "${target}" in
   x86_64-*-linux* | i?86-*-linux* | sparc*-*-linux*)
 	;;
+  x86_64-*-darwin* | i?86-*-darwin*)
+	;;
   *)
 	UNSUPPORTED=1
 	;;
Index: gcc/config/darwin.h
===================================================================
--- gcc/config/darwin.h	(revision 193562)
+++ gcc/config/darwin.h	(working copy)
@@ -180,6 +180,9 @@ extern GTY(()) int darwin_ms_struct;
     %{L*} %(link_libgcc) %o %{fprofile-arcs|fprofile-generate*|coverage:-lgcov} \
     %{fopenmp|ftree-parallelize-loops=*: \
       %{static|static-libgcc|static-libstdc++|static-libgfortran: libgomp.a%s; : -lgomp } } \
+    %{faddress-sanitizer: \
+      %{static|static-libgcc|static-libgfortran: -framework CoreFoundation -lstdc++ libasan.a%s; \
+      static-libstdc++: -framework CoreFoundation libstdc++.a%s libasan.a%s; : -framework CoreFoundation -lasan } } \
     %{fgnu-tm: \
       %{static|static-libgcc|static-libstdc++|static-libgfortran: libitm.a%s; : -litm } } \
     %{!nostdlib:%{!nodefaultlibs:\

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 00/13] Request to merge Address Sanitizer in
  2012-11-16 15:57       ` Jack Howarth
@ 2012-11-16 16:02         ` Jakub Jelinek
  2012-11-16 16:47           ` Jack Howarth
  0 siblings, 1 reply; 80+ messages in thread
From: Jakub Jelinek @ 2012-11-16 16:02 UTC (permalink / raw)
  To: Jack Howarth
  Cc: Dodji Seketeli, Konstantin Serebryany, gcc-patches, dnovillo,
	wmi, davidxl, Alexander Potapenko, mikestump

On Fri, Nov 16, 2012 at 10:57:04AM -0500, Jack Howarth wrote:
> +case "$host" in
> +  *-*-darwin*) MACH_OVERRIDE=true ;;
> +  *) MACH_OVERRIDE=false ;;
> +esac
> +AM_CONDITIONAL(USING_MACH_OVERRIDE, $MACH_OVERRIDE)
> +

Shouldn't AM_CONDITIONAL follow AM_INIT_AUTOMAKE?  I'd say move it
before AC_CONFIG_FILES or so.

	Jakub

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH 00/13] Request to merge Address Sanitizer in
  2012-11-16 16:02         ` Jakub Jelinek
@ 2012-11-16 16:47           ` Jack Howarth
  0 siblings, 0 replies; 80+ messages in thread
From: Jack Howarth @ 2012-11-16 16:47 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Dodji Seketeli, Konstantin Serebryany, gcc-patches, dnovillo,
	wmi, davidxl, Alexander Potapenko, mikestump

[-- Attachment #1: Type: text/plain, Size: 657 bytes --]

On Fri, Nov 16, 2012 at 05:00:22PM +0100, Jakub Jelinek wrote:
> On Fri, Nov 16, 2012 at 10:57:04AM -0500, Jack Howarth wrote:
> > +case "$host" in
> > +  *-*-darwin*) MACH_OVERRIDE=true ;;
> > +  *) MACH_OVERRIDE=false ;;
> > +esac
> > +AM_CONDITIONAL(USING_MACH_OVERRIDE, $MACH_OVERRIDE)
> > +
> 
> Shouldn't AM_CONDITIONAL follow AM_INIT_AUTOMAKE?  I'd say move it
> before AC_CONFIG_FILES or so.
> 
> 	Jakub

Jakub,
   Done. Any chance you can commit this one (with yourself in the ChangeLog
for the file imports from llvm) before this one bit-rots? Thanks in advance.
         Jack
ps The configure.ac readjustment was tested on x86_64-apple-darwin12.

[-- Attachment #2: asan_v7.diff --]
[-- Type: text/plain, Size: 42762 bytes --]

gcc/

2012-11-16  Jack Howarth <howarth@bromo.med.uc.edu>

	* config/darwin.h (LINK_COMMAND_SPEC_A): Deal with -faddress-sanitizer.

libsanitizer/

2012-11-16  Dodji Seketeli <dodji@redhat.com>
	    Jack Howarth <howarth@bromo.med.uc.edu>

	* interception/mach_override/mach_override.c: Migrate from llvm.
	* interception/mach_override/mach_override.h: Likewise.
	* configure.tgt: Add darwin to supported targets.
	* configure.ac: Define USING_MACH_OVERRIDE when on darwin.
	* interception/Makefile.am: Compile mach_override.c when
	USING_MACH_OVERRIDE defined.
	* configure: Regenerated.
	* interception/Makefile.in: Likewise.

--- /dev/null	2012-11-16 10:24:58.000000000 -0500
+++ libsanitizer/interception/mach_override/mach_override.c	2012-11-16 10:26:42.000000000 -0500
@@ -0,0 +1,970 @@
+/*******************************************************************************
+	mach_override.c
+		Copyright (c) 2003-2009 Jonathan 'Wolf' Rentzsch: <http://rentzsch.com>
+		Some rights reserved: <http://opensource.org/licenses/mit-license.php>
+
+	***************************************************************************/
+#ifdef __APPLE__
+
+#include "mach_override.h"
+
+#include <mach-o/dyld.h>
+#include <mach/mach_host.h>
+#include <mach/mach_init.h>
+#include <mach/vm_map.h>
+#include <sys/mman.h>
+
+#include <CoreServices/CoreServices.h>
+
+//#define DEBUG_DISASM 1
+#undef DEBUG_DISASM
+
+/**************************
+*	
+*	Constants
+*	
+**************************/
+#pragma mark	-
+#pragma mark	(Constants)
+
+#if defined(__ppc__) || defined(__POWERPC__)
+
+static
+long kIslandTemplate[] = {
+	0x9001FFFC,	//	stw		r0,-4(SP)
+	0x3C00DEAD,	//	lis		r0,0xDEAD
+	0x6000BEEF,	//	ori		r0,r0,0xBEEF
+	0x7C0903A6,	//	mtctr	r0
+	0x8001FFFC,	//	lwz		r0,-4(SP)
+	0x60000000,	//	nop		; optionally replaced
+	0x4E800420 	//	bctr
+};
+
+#define kAddressHi			3
+#define kAddressLo			5
+#define kInstructionHi		10
+#define kInstructionLo		11
+
+#elif defined(__i386__) 
+
+#define kOriginalInstructionsSize 16
+
+static
+unsigned char kIslandTemplate[] = {
+	// kOriginalInstructionsSize nop instructions so that we 
+	// should have enough space to host original instructions 
+	0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 
+	0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90,
+	// Now the real jump instruction
+	0xE9, 0xEF, 0xBE, 0xAD, 0xDE
+};
+
+#define kInstructions	0
+#define kJumpAddress    kInstructions + kOriginalInstructionsSize + 1
+#elif defined(__x86_64__)
+
+#define kOriginalInstructionsSize 32
+
+#define kJumpAddress    kOriginalInstructionsSize + 6
+
+static
+unsigned char kIslandTemplate[] = {
+	// kOriginalInstructionsSize nop instructions so that we 
+	// should have enough space to host original instructions 
+	0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 
+	0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90,
+	0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 
+	0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90,
+	// Now the real jump instruction
+	0xFF, 0x25, 0x00, 0x00, 0x00, 0x00,
+        0x00, 0x00, 0x00, 0x00,
+        0x00, 0x00, 0x00, 0x00
+};
+
+#endif
+
+#define	kAllocateHigh		1
+#define	kAllocateNormal		0
+
+/**************************
+*	
+*	Data Types
+*	
+**************************/
+#pragma mark	-
+#pragma mark	(Data Types)
+
+typedef	struct	{
+	char	instructions[sizeof(kIslandTemplate)];
+	int		allocatedHigh;
+}	BranchIsland;
+
+/**************************
+*	
+*	Funky Protos
+*	
+**************************/
+#pragma mark	-
+#pragma mark	(Funky Protos)
+
+
+	static mach_error_t
+allocateBranchIsland(
+		BranchIsland	**island,
+		int				allocateHigh,
+		void *originalFunctionAddress);
+
+	static mach_error_t
+freeBranchIsland(
+		BranchIsland	*island );
+
+	static mach_error_t
+defaultIslandMalloc(
+	  void **ptr, size_t unused_size, void *hint);
+
+	static mach_error_t
+defaultIslandFree(
+   	void *ptr);
+
+#if defined(__ppc__) || defined(__POWERPC__)
+	static mach_error_t
+setBranchIslandTarget(
+		BranchIsland	*island,
+		const void		*branchTo,
+		long			instruction );
+#endif 
+
+#if defined(__i386__) || defined(__x86_64__)
+static mach_error_t
+setBranchIslandTarget_i386(
+						   BranchIsland	*island,
+						   const void		*branchTo,
+						   char*			instructions );
+// Can't be made static because there's no C implementation for atomic_mov64
+// on i386.
+void 
+atomic_mov64(
+		uint64_t *targetAddress,
+		uint64_t value ) __attribute__((visibility("hidden")));
+
+	static Boolean 
+eatKnownInstructions( 
+	unsigned char	*code, 
+	uint64_t		*newInstruction,
+	int				*howManyEaten, 
+	char			*originalInstructions,
+	int				*originalInstructionCount, 
+	uint8_t			*originalInstructionSizes );
+
+	static void
+fixupInstructions(
+    void		*originalFunction,
+    void		*escapeIsland,
+    void		*instructionsToFix,
+	int			instructionCount,
+	uint8_t		*instructionSizes );
+
+#ifdef DEBUG_DISASM
+	static void
+dump16Bytes(
+	void	*ptr);
+#endif  // DEBUG_DISASM
+#endif
+
+/*******************************************************************************
+*	
+*	Interface
+*	
+*******************************************************************************/
+#pragma mark	-
+#pragma mark	(Interface)
+
+#if defined(__i386__) || defined(__x86_64__)
+static mach_error_t makeIslandExecutable(void *address) {
+	mach_error_t err = err_none;
+    vm_size_t pageSize;
+    host_page_size( mach_host_self(), &pageSize );
+    uintptr_t page = (uintptr_t)address & ~(uintptr_t)(pageSize-1);
+    int e = err_none;
+    e |= mprotect((void *)page, pageSize, PROT_EXEC | PROT_READ | PROT_WRITE);
+    e |= msync((void *)page, pageSize, MS_INVALIDATE );
+    if (e) {
+        err = err_cannot_override;
+    }
+    return err;
+}
+#endif
+
+		static mach_error_t
+defaultIslandMalloc(
+	void **ptr, size_t unused_size, void *hint) {
+  return allocateBranchIsland( (BranchIsland**)ptr, kAllocateHigh, hint );
+}
+		static mach_error_t
+defaultIslandFree(
+	void *ptr) {
+	return freeBranchIsland(ptr);
+}
+
+    mach_error_t
+__asan_mach_override_ptr(
+	void *originalFunctionAddress,
+    const void *overrideFunctionAddress,
+    void **originalFunctionReentryIsland )
+{
+  return __asan_mach_override_ptr_custom(originalFunctionAddress,
+		overrideFunctionAddress,
+		originalFunctionReentryIsland,
+		defaultIslandMalloc,
+		defaultIslandFree);
+}
+
+    mach_error_t
+__asan_mach_override_ptr_custom(
+	void *originalFunctionAddress,
+    const void *overrideFunctionAddress,
+    void **originalFunctionReentryIsland,
+		island_malloc *alloc,
+		island_free *dealloc)
+{
+	assert( originalFunctionAddress );
+	assert( overrideFunctionAddress );
+	
+	// this addresses overriding such functions as AudioOutputUnitStart()
+	// test with modified DefaultOutputUnit project
+#if defined(__x86_64__)
+    for(;;){
+        if(*(uint16_t*)originalFunctionAddress==0x25FF)    // jmp qword near [rip+0x????????]
+            originalFunctionAddress=*(void**)((char*)originalFunctionAddress+6+*(int32_t *)((uint16_t*)originalFunctionAddress+1));
+        else break;
+    }
+#elif defined(__i386__)
+    for(;;){
+        if(*(uint16_t*)originalFunctionAddress==0x25FF)    // jmp *0x????????
+            originalFunctionAddress=**(void***)((uint16_t*)originalFunctionAddress+1);
+        else break;
+    }
+#endif
+#ifdef DEBUG_DISASM
+  {
+    fprintf(stderr, "Replacing function at %p\n", originalFunctionAddress);
+    fprintf(stderr, "First 16 bytes of the function: ");
+    unsigned char *orig = (unsigned char *)originalFunctionAddress;
+    int i;
+    for (i = 0; i < 16; i++) {
+       fprintf(stderr, "%x ", (unsigned int) orig[i]);
+    }
+    fprintf(stderr, "\n");
+    fprintf(stderr, 
+            "To disassemble, save the following function as disas.c"
+            " and run:\n  gcc -c disas.c && gobjdump -d disas.o\n"
+            "The first 16 bytes of the original function will start"
+            " after four nop instructions.\n");
+    fprintf(stderr, "\nvoid foo() {\n  asm volatile(\"nop;nop;nop;nop;\");\n");
+    int j = 0;
+    for (j = 0; j < 2; j++) {
+      fprintf(stderr, "  asm volatile(\".byte ");
+      for (i = 8 * j; i < 8 * (j+1) - 1; i++) {
+        fprintf(stderr, "0x%x, ", (unsigned int) orig[i]);
+      }
+      fprintf(stderr, "0x%x;\");\n", (unsigned int) orig[8 * (j+1) - 1]);
+    }
+    fprintf(stderr, "}\n\n");
+  }
+#endif
+
+	long	*originalFunctionPtr = (long*) originalFunctionAddress;
+	mach_error_t	err = err_none;
+	
+#if defined(__ppc__) || defined(__POWERPC__)
+	//	Ensure first instruction isn't 'mfctr'.
+	#define	kMFCTRMask			0xfc1fffff
+	#define	kMFCTRInstruction	0x7c0903a6
+	
+	long	originalInstruction = *originalFunctionPtr;
+	if( !err && ((originalInstruction & kMFCTRMask) == kMFCTRInstruction) )
+		err = err_cannot_override;
+#elif defined(__i386__) || defined(__x86_64__)
+	int eatenCount = 0;
+	int originalInstructionCount = 0;
+	char originalInstructions[kOriginalInstructionsSize];
+	uint8_t originalInstructionSizes[kOriginalInstructionsSize];
+	uint64_t jumpRelativeInstruction = 0; // JMP
+
+	Boolean overridePossible = eatKnownInstructions ((unsigned char *)originalFunctionPtr, 
+										&jumpRelativeInstruction, &eatenCount, 
+										originalInstructions, &originalInstructionCount, 
+										originalInstructionSizes );
+#ifdef DEBUG_DISASM
+  if (!overridePossible) fprintf(stderr, "overridePossible = false @%d\n", __LINE__);
+#endif
+	if (eatenCount > kOriginalInstructionsSize) {
+#ifdef DEBUG_DISASM
+		fprintf(stderr, "Too many instructions eaten\n");
+#endif    
+		overridePossible = false;
+	}
+	if (!overridePossible) err = err_cannot_override;
+	if (err) fprintf(stderr, "err = %x %s:%d\n", err, __FILE__, __LINE__);
+#endif
+	
+	//	Make the original function implementation writable.
+	if( !err ) {
+		err = vm_protect( mach_task_self(),
+				(vm_address_t) originalFunctionPtr, 8, false,
+				(VM_PROT_ALL | VM_PROT_COPY) );
+		if( err )
+			err = vm_protect( mach_task_self(),
+					(vm_address_t) originalFunctionPtr, 8, false,
+					(VM_PROT_DEFAULT | VM_PROT_COPY) );
+	}
+	if (err) fprintf(stderr, "err = %x %s:%d\n", err, __FILE__, __LINE__);
+	
+	//	Allocate and target the escape island to the overriding function.
+	BranchIsland	*escapeIsland = NULL;
+	if( !err )
+		err = alloc( (void**)&escapeIsland, sizeof(BranchIsland), originalFunctionAddress );
+	if ( err ) fprintf(stderr, "err = %x %s:%d\n", err, __FILE__, __LINE__);
+	
+#if defined(__ppc__) || defined(__POWERPC__)
+	if( !err )
+		err = setBranchIslandTarget( escapeIsland, overrideFunctionAddress, 0 );
+	
+	//	Build the branch absolute instruction to the escape island.
+	long	branchAbsoluteInstruction = 0; // Set to 0 just to silence warning.
+	if( !err ) {
+		long escapeIslandAddress = ((long) escapeIsland) & 0x3FFFFFF;
+		branchAbsoluteInstruction = 0x48000002 | escapeIslandAddress;
+	}
+#elif defined(__i386__) || defined(__x86_64__)
+        if (err) fprintf(stderr, "err = %x %s:%d\n", err, __FILE__, __LINE__);
+
+	if( !err )
+		err = setBranchIslandTarget_i386( escapeIsland, overrideFunctionAddress, 0 );
+ 
+	if (err) fprintf(stderr, "err = %x %s:%d\n", err, __FILE__, __LINE__);
+	// Build the jump relative instruction to the escape island
+#endif
+
+
+#if defined(__i386__) || defined(__x86_64__)
+	if (!err) {
+		uint32_t addressOffset = ((char*)escapeIsland - (char*)originalFunctionPtr - 5);
+		addressOffset = OSSwapInt32(addressOffset);
+		
+		jumpRelativeInstruction |= 0xE900000000000000LL; 
+		jumpRelativeInstruction |= ((uint64_t)addressOffset & 0xffffffff) << 24;
+		jumpRelativeInstruction = OSSwapInt64(jumpRelativeInstruction);		
+	}
+#endif
+	
+	//	Optionally allocate & return the reentry island. This may contain relocated
+	//  jmp instructions and so has all the same addressing reachability requirements
+	//  the escape island has to the original function, except the escape island is
+	//  technically our original function.
+	BranchIsland	*reentryIsland = NULL;
+	if( !err && originalFunctionReentryIsland ) {
+		err = alloc( (void**)&reentryIsland, sizeof(BranchIsland), escapeIsland);
+		if( !err )
+			*originalFunctionReentryIsland = reentryIsland;
+	}
+	
+#if defined(__ppc__) || defined(__POWERPC__)	
+	//	Atomically:
+	//	o If the reentry island was allocated:
+	//		o Insert the original instruction into the reentry island.
+	//		o Target the reentry island at the 2nd instruction of the
+	//		  original function.
+	//	o Replace the original instruction with the branch absolute.
+	if( !err ) {
+		int escapeIslandEngaged = false;
+		do {
+			if( reentryIsland )
+				err = setBranchIslandTarget( reentryIsland,
+						(void*) (originalFunctionPtr+1), originalInstruction );
+			if( !err ) {
+				escapeIslandEngaged = CompareAndSwap( originalInstruction,
+										branchAbsoluteInstruction,
+										(UInt32*)originalFunctionPtr );
+				if( !escapeIslandEngaged ) {
+					//	Someone replaced the instruction out from under us,
+					//	re-read the instruction, make sure it's still not
+					//	'mfctr' and try again.
+					originalInstruction = *originalFunctionPtr;
+					if( (originalInstruction & kMFCTRMask) == kMFCTRInstruction)
+						err = err_cannot_override;
+				}
+			}
+		} while( !err && !escapeIslandEngaged );
+	}
+#elif defined(__i386__) || defined(__x86_64__)
+	// Atomically:
+	//	o If the reentry island was allocated:
+	//		o Insert the original instructions into the reentry island.
+	//		o Target the reentry island at the first non-replaced 
+	//        instruction of the original function.
+	//	o Replace the original first instructions with the jump relative.
+	//
+	// Note that on i386, we do not support someone else changing the code under our feet
+	if ( !err ) {
+		fixupInstructions(originalFunctionPtr, reentryIsland, originalInstructions,
+					originalInstructionCount, originalInstructionSizes );
+	
+		if( reentryIsland )
+			err = setBranchIslandTarget_i386( reentryIsland,
+										 (void*) ((char *)originalFunctionPtr+eatenCount), originalInstructions );
+		// try making islands executable before planting the jmp
+#if defined(__x86_64__) || defined(__i386__)
+        if( !err )
+            err = makeIslandExecutable(escapeIsland);
+        if( !err && reentryIsland )
+            err = makeIslandExecutable(reentryIsland);
+#endif
+		if ( !err )
+			atomic_mov64((uint64_t *)originalFunctionPtr, jumpRelativeInstruction);
+	}
+#endif
+	
+	//	Clean up on error.
+	if( err ) {
+		if( reentryIsland )
+			dealloc( reentryIsland );
+		if( escapeIsland )
+			dealloc( escapeIsland );
+	}
+
+#ifdef DEBUG_DISASM
+  {
+    fprintf(stderr, "First 16 bytes of the function after slicing: ");
+    unsigned char *orig = (unsigned char *)originalFunctionAddress;
+    int i;
+    for (i = 0; i < 16; i++) {
+       fprintf(stderr, "%x ", (unsigned int) orig[i]);
+    }
+    fprintf(stderr, "\n");
+  }
+#endif
+	return err;
+}
+
+/*******************************************************************************
+*	
+*	Implementation
+*	
+*******************************************************************************/
+#pragma mark	-
+#pragma mark	(Implementation)
+
+/***************************************************************************//**
+	Implementation: Allocates memory for a branch island.
+	
+	@param	island			<-	The allocated island.
+	@param	allocateHigh	->	Whether to allocate the island at the end of the
+								address space (for use with the branch absolute
+								instruction).
+	@result					<-	mach_error_t
+
+	***************************************************************************/
+
+	static mach_error_t
+allocateBranchIsland(
+		BranchIsland	**island,
+		int				allocateHigh,
+		void *originalFunctionAddress)
+{
+	assert( island );
+	
+	mach_error_t	err = err_none;
+	
+	if( allocateHigh ) {
+		vm_size_t pageSize;
+		err = host_page_size( mach_host_self(), &pageSize );
+		if( !err ) {
+			assert( sizeof( BranchIsland ) <= pageSize );
+#if defined(__ppc__) || defined(__POWERPC__)
+			vm_address_t first = 0xfeffffff;
+			vm_address_t last = 0xfe000000 + pageSize;
+#elif defined(__x86_64__)
+			vm_address_t first = ((uint64_t)originalFunctionAddress & ~(uint64_t)(((uint64_t)1 << 31) - 1)) | ((uint64_t)1 << 31); // start in the middle of the page?
+			vm_address_t last = 0x0;
+#else
+			vm_address_t first = 0xffc00000;
+			vm_address_t last = 0xfffe0000;
+#endif
+
+			vm_address_t page = first;
+			int allocated = 0;
+			vm_map_t task_self = mach_task_self();
+			
+			while( !err && !allocated && page != last ) {
+
+				err = vm_allocate( task_self, &page, pageSize, 0 );
+				if( err == err_none )
+					allocated = 1;
+				else if( err == KERN_NO_SPACE ) {
+#if defined(__x86_64__)
+					page -= pageSize;
+#else
+					page += pageSize;
+#endif
+					err = err_none;
+				}
+			}
+			if( allocated )
+				*island = (BranchIsland*) page;
+			else if( !allocated && !err )
+				err = KERN_NO_SPACE;
+		}
+	} else {
+		void *block = malloc( sizeof( BranchIsland ) );
+		if( block )
+			*island = block;
+		else
+			err = KERN_NO_SPACE;
+	}
+	if( !err )
+		(**island).allocatedHigh = allocateHigh;
+	
+	return err;
+}
+
+/***************************************************************************//**
+	Implementation: Deallocates memory for a branch island.
+	
+	@param	island	->	The island to deallocate.
+	@result			<-	mach_error_t
+
+	***************************************************************************/
+
+	static mach_error_t
+freeBranchIsland(
+		BranchIsland	*island )
+{
+	assert( island );
+	assert( (*(long*)&island->instructions[0]) == kIslandTemplate[0] );
+	assert( island->allocatedHigh );
+	
+	mach_error_t	err = err_none;
+	
+	if( island->allocatedHigh ) {
+		vm_size_t pageSize;
+		err = host_page_size( mach_host_self(), &pageSize );
+		if( !err ) {
+			assert( sizeof( BranchIsland ) <= pageSize );
+			err = vm_deallocate(
+					mach_task_self(),
+					(vm_address_t) island, pageSize );
+		}
+	} else {
+		free( island );
+	}
+	
+	return err;
+}
+
+/***************************************************************************//**
+	Implementation: Sets the branch island's target, with an optional
+	instruction.
+	
+	@param	island		->	The branch island to insert target into.
+	@param	branchTo	->	The address of the target.
+	@param	instruction	->	Optional instruction to execute prior to branch. Set
+							to zero for nop.
+	@result				<-	mach_error_t
+
+	***************************************************************************/
+#if defined(__ppc__) || defined(__POWERPC__)
+	static mach_error_t
+setBranchIslandTarget(
+		BranchIsland	*island,
+		const void		*branchTo,
+		long			instruction )
+{
+	//	Copy over the template code.
+    bcopy( kIslandTemplate, island->instructions, sizeof( kIslandTemplate ) );
+    
+    //	Fill in the address.
+    ((short*)island->instructions)[kAddressLo] = ((long) branchTo) & 0x0000FFFF;
+    ((short*)island->instructions)[kAddressHi]
+    	= (((long) branchTo) >> 16) & 0x0000FFFF;
+    
+    //	Fill in the (optional) instruction.
+    if( instruction != 0 ) {
+        ((short*)island->instructions)[kInstructionLo]
+        	= instruction & 0x0000FFFF;
+        ((short*)island->instructions)[kInstructionHi]
+        	= (instruction >> 16) & 0x0000FFFF;
+    }
+    
+    //MakeDataExecutable( island->instructions, sizeof( kIslandTemplate ) );
+	msync( island->instructions, sizeof( kIslandTemplate ), MS_INVALIDATE );
+    
+    return err_none;
+}
+#endif 
+
+#if defined(__i386__)
+	static mach_error_t
+setBranchIslandTarget_i386(
+	BranchIsland	*island,
+	const void		*branchTo,
+	char*			instructions )
+{
+
+	//	Copy over the template code.
+    bcopy( kIslandTemplate, island->instructions, sizeof( kIslandTemplate ) );
+
+	// copy original instructions
+	if (instructions) {
+		bcopy (instructions, island->instructions + kInstructions, kOriginalInstructionsSize);
+	}
+	
+    // Fill in the address.
+    int32_t addressOffset = (char *)branchTo - (island->instructions + kJumpAddress + 4);
+    *((int32_t *)(island->instructions + kJumpAddress)) = addressOffset; 
+
+    msync( island->instructions, sizeof( kIslandTemplate ), MS_INVALIDATE );
+    return err_none;
+}
+
+#elif defined(__x86_64__)
+static mach_error_t
+setBranchIslandTarget_i386(
+        BranchIsland	*island,
+        const void		*branchTo,
+        char*			instructions )
+{
+    // Copy over the template code.
+    bcopy( kIslandTemplate, island->instructions, sizeof( kIslandTemplate ) );
+
+    // Copy original instructions.
+    if (instructions) {
+        bcopy (instructions, island->instructions, kOriginalInstructionsSize);
+    }
+
+    //	Fill in the address.
+    *((uint64_t *)(island->instructions + kJumpAddress)) = (uint64_t)branchTo; 
+    msync( island->instructions, sizeof( kIslandTemplate ), MS_INVALIDATE );
+
+    return err_none;
+}
+#endif
+
+
+#if defined(__i386__) || defined(__x86_64__)
+// simplistic instruction matching
+typedef struct {
+	unsigned int length; // max 15
+	unsigned char mask[15]; // sequence of bytes in memory order
+	unsigned char constraint[15]; // sequence of bytes in memory order
+}	AsmInstructionMatch;
+
+#if defined(__i386__)
+static AsmInstructionMatch possibleInstructions[] = {
+	{ 0x5, {0xFF, 0x00, 0x00, 0x00, 0x00}, {0xE9, 0x00, 0x00, 0x00, 0x00} },	// jmp 0x????????
+	{ 0x5, {0xFF, 0xFF, 0xFF, 0xFF, 0xFF}, {0x55, 0x89, 0xe5, 0xc9, 0xc3} },	// push %ebp; mov %esp,%ebp; leave; ret
+	{ 0x1, {0xFF}, {0x90} },							// nop
+	{ 0x1, {0xF8}, {0x50} },							// push %reg
+	{ 0x2, {0xFF, 0xFF}, {0x89, 0xE5} },				                // mov %esp,%ebp
+	{ 0x3, {0xFF, 0xFF, 0xFF}, {0x89, 0x1C, 0x24} },				                // mov %ebx,(%esp)
+	{ 0x3, {0xFF, 0xFF, 0x00}, {0x83, 0xEC, 0x00} },	                        // sub 0x??, %esp
+	{ 0x6, {0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00}, {0x81, 0xEC, 0x00, 0x00, 0x00, 0x00} },	// sub 0x??, %esp with 32bit immediate
+	{ 0x2, {0xFF, 0xFF}, {0x31, 0xC0} },						// xor %eax, %eax
+	{ 0x3, {0xFF, 0x4F, 0x00}, {0x8B, 0x45, 0x00} },  // mov $imm(%ebp), %reg
+	{ 0x3, {0xFF, 0x4C, 0x00}, {0x8B, 0x40, 0x00} },  // mov $imm(%eax-%edx), %reg
+	{ 0x3, {0xFF, 0xCF, 0x00}, {0x8B, 0x4D, 0x00} },  // mov $imm(%ebp), %reg
+	{ 0x3, {0xFF, 0x4F, 0x00}, {0x8A, 0x4D, 0x00} },  // mov $imm(%ebp), %cl
+	{ 0x4, {0xFF, 0xFF, 0xFF, 0x00}, {0x8B, 0x4C, 0x24, 0x00} },  			// mov $imm(%esp), %ecx
+	{ 0x4, {0xFF, 0x00, 0x00, 0x00}, {0x8B, 0x00, 0x00, 0x00} },  			// mov r16,r/m16 or r32,r/m32
+	{ 0x5, {0xFF, 0x00, 0x00, 0x00, 0x00}, {0xB9, 0x00, 0x00, 0x00, 0x00} }, 	// mov $imm, %ecx
+	{ 0x5, {0xFF, 0x00, 0x00, 0x00, 0x00}, {0xB8, 0x00, 0x00, 0x00, 0x00} }, 	// mov $imm, %eax
+	{ 0x4, {0xFF, 0xFF, 0xFF, 0x00}, {0x66, 0x0F, 0xEF, 0x00} },             	// pxor xmm2/128, xmm1
+	{ 0x2, {0xFF, 0xFF}, {0xDB, 0xE3} }, 						// fninit
+	{ 0x5, {0xFF, 0x00, 0x00, 0x00, 0x00}, {0xE8, 0x00, 0x00, 0x00, 0x00} },	// call $imm
+	{ 0x4, {0xFF, 0xFF, 0xFF, 0x00}, {0x0F, 0xBE, 0x55, 0x00} },                    // movsbl $imm(%ebp), %edx
+	{ 0x0, {0x00}, {0x00} }
+};
+#elif defined(__x86_64__)
+// TODO(glider): disassembling the "0x48, 0x89" sequences is trickier than it's done below.
+// If it stops working, refer to http://ref.x86asm.net/geek.html#modrm_byte_32_64 to do it
+// more accurately.
+// Note: 0x48 is in fact the REX.W prefix, but it might be wrong to treat it as a separate
+// instruction.
+static AsmInstructionMatch possibleInstructions[] = {
+	{ 0x5, {0xFF, 0x00, 0x00, 0x00, 0x00}, {0xE9, 0x00, 0x00, 0x00, 0x00} },	// jmp 0x????????
+	{ 0x1, {0xFF}, {0x90} },							// nop
+	{ 0x1, {0xF8}, {0x50} },							// push %rX
+	{ 0x1, {0xFF}, {0x65} },							// GS prefix
+	{ 0x3, {0xFF, 0xFF, 0xFF}, {0x48, 0x89, 0xE5} },				// mov %rsp,%rbp
+	{ 0x4, {0xFF, 0xFF, 0xFF, 0x00}, {0x48, 0x83, 0xEC, 0x00} },	                // sub 0x??, %rsp
+	{ 0x4, {0xFB, 0xFF, 0x07, 0x00}, {0x48, 0x89, 0x05, 0x00} },	                // move onto rbp
+	{ 0x3, {0xFB, 0xFF, 0x00}, {0x48, 0x89, 0x00} },	                            // mov %reg, %reg
+	{ 0x3, {0xFB, 0xFF, 0x00}, {0x49, 0x89, 0x00} },	                            // mov %reg, %reg (REX.WB)
+	{ 0x2, {0xFF, 0x00}, {0x41, 0x00} },						// push %rXX
+	{ 0x2, {0xFF, 0x00}, {0x84, 0x00} },						// test %rX8,%rX8
+	{ 0x2, {0xFF, 0x00}, {0x85, 0x00} },						// test %rX,%rX
+	{ 0x2, {0xFF, 0x00}, {0x77, 0x00} },						// ja $i8
+	{ 0x2, {0xFF, 0x00}, {0x74, 0x00} },						// je $i8
+	{ 0x5, {0xF8, 0x00, 0x00, 0x00, 0x00}, {0xB8, 0x00, 0x00, 0x00, 0x00} },	// mov $imm, %reg
+	{ 0x3, {0xFF, 0xFF, 0x00}, {0xFF, 0x77, 0x00} },				// pushq $imm(%rdi)
+	{ 0x2, {0xFF, 0xFF}, {0x31, 0xC0} },						// xor %eax, %eax
+	{ 0x5, {0xFF, 0x00, 0x00, 0x00, 0x00}, {0x25, 0x00, 0x00, 0x00, 0x00} },	// and $imm, %eax
+	{ 0x3, {0xFF, 0xFF, 0xFF}, {0x80, 0x3F, 0x00} },				// cmpb $imm, (%rdi)
+
+  { 0x8, {0xFF, 0xFF, 0xCF, 0xFF, 0x00, 0x00, 0x00, 0x00},
+         {0x48, 0x8B, 0x04, 0x25, 0x00, 0x00, 0x00, 0x00}, },                     // mov $imm, %{rax,rdx,rsp,rsi}
+  { 0x4, {0xFF, 0xFF, 0xFF, 0x00}, {0x48, 0x83, 0xFA, 0x00}, },   // cmp $i8, %rdx
+	{ 0x4, {0xFF, 0xFF, 0x00, 0x00}, {0x83, 0x7f, 0x00, 0x00}, },			// cmpl $imm, $imm(%rdi)
+	{ 0xa, {0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
+               {0x48, 0xB8, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00} },    // mov $imm, %rax
+        { 0x6, {0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00},
+               {0x81, 0xE6, 0x00, 0x00, 0x00, 0x00} },                            // and $imm, %esi
+        { 0x6, {0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00},
+               {0xFF, 0x25, 0x00, 0x00, 0x00, 0x00} },                            // jmpq *(%rip)
+        { 0x4, {0xFF, 0xFF, 0xFF, 0x00}, {0x66, 0x0F, 0xEF, 0x00} },              // pxor xmm2/128, xmm1
+        { 0x2, {0xFF, 0x00}, {0x89, 0x00} },                               // mov r/m32,r32 or r/m16,r16
+        { 0x3, {0xFF, 0xFF, 0xFF}, {0x49, 0x89, 0xF8} },                   // mov %rdi,%r8
+        { 0x4, {0xFF, 0xFF, 0xFF, 0xFF}, {0x40, 0x0F, 0xBE, 0xCE} },       // movsbl %sil,%ecx
+        { 0x7, {0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00},
+               {0x48, 0x8D, 0x05, 0x00, 0x00, 0x00, 0x00} },  // lea $imm(%rip),%rax
+        { 0x3, {0xFF, 0xFF, 0xFF}, {0x0F, 0xBE, 0xCE} },  // movsbl %dh, %ecx
+        { 0x3, {0xFF, 0xFF, 0x00}, {0xFF, 0x77, 0x00} },  // pushq $imm(%rdi)
+        { 0x2, {0xFF, 0xFF}, {0xDB, 0xE3} }, // fninit
+        { 0x3, {0xFF, 0xFF, 0xFF}, {0x48, 0x85, 0xD2} },  // test %rdx,%rdx
+	{ 0x0, {0x00}, {0x00} }
+};
+#endif
+
+static Boolean codeMatchesInstruction(unsigned char *code, AsmInstructionMatch* instruction) 
+{
+	Boolean match = true;
+	
+	size_t i;
+  assert(instruction);
+#ifdef DEBUG_DISASM
+	fprintf(stderr, "Matching: ");
+#endif  
+	for (i=0; i<instruction->length; i++) {
+		unsigned char mask = instruction->mask[i];
+		unsigned char constraint = instruction->constraint[i];
+		unsigned char codeValue = code[i];
+#ifdef DEBUG_DISASM
+		fprintf(stderr, "%x ", (unsigned)codeValue);
+#endif    
+		match = ((codeValue & mask) == constraint);
+		if (!match) break;
+	}
+#ifdef DEBUG_DISASM
+	if (match) {
+		fprintf(stderr, " OK\n");
+	} else {
+		fprintf(stderr, " FAIL\n");
+	}
+#endif  
+	return match;
+}
+
+#if defined(__i386__) || defined(__x86_64__)
+	static Boolean 
+eatKnownInstructions( 
+	unsigned char	*code, 
+	uint64_t		*newInstruction,
+	int				*howManyEaten, 
+	char			*originalInstructions,
+	int				*originalInstructionCount, 
+	uint8_t			*originalInstructionSizes )
+{
+	Boolean allInstructionsKnown = true;
+	int totalEaten = 0;
+	unsigned char* ptr = code;
+	int remainsToEat = 5; // a JMP instruction takes 5 bytes
+	int instructionIndex = 0;
+	
+	if (howManyEaten) *howManyEaten = 0;
+	if (originalInstructionCount) *originalInstructionCount = 0;
+	while (remainsToEat > 0) {
+		Boolean curInstructionKnown = false;
+		
+		// See if the instruction matches one we know
+		AsmInstructionMatch* curInstr = possibleInstructions;
+		do { 
+			if ((curInstructionKnown = codeMatchesInstruction(ptr, curInstr))) break;
+			curInstr++;
+		} while (curInstr->length > 0);
+		
+		// If every match failed, the current instruction is unknown; stop here
+		if (!curInstructionKnown) { 
+			allInstructionsKnown = false;
+			fprintf(stderr, "mach_override: some instructions unknown! Need to update mach_override.c\n");
+			break;
+		}
+		
+		// At this point, we've matched curInstr
+		int eaten = curInstr->length;
+		ptr += eaten;
+		remainsToEat -= eaten;
+		totalEaten += eaten;
+		
+		if (originalInstructionSizes) originalInstructionSizes[instructionIndex] = eaten;
+		instructionIndex += 1;
+		if (originalInstructionCount) *originalInstructionCount = instructionIndex;
+	}
+
+
+	if (howManyEaten) *howManyEaten = totalEaten;
+
+	if (originalInstructions) {
+		Boolean enoughSpaceForOriginalInstructions = (totalEaten < kOriginalInstructionsSize);
+		
+		if (enoughSpaceForOriginalInstructions) {
+			memset(originalInstructions, 0x90 /* NOP */, kOriginalInstructionsSize); // fill instructions with NOP
+			bcopy(code, originalInstructions, totalEaten);
+		} else {
+#ifdef DEBUG_DISASM
+			fprintf(stderr, "Not enough space in island to store original instructions. Adapt the island definition and kOriginalInstructionsSize\n");
+#endif      
+			return false;
+		}
+	}
+	
+	if (allInstructionsKnown) {
+		// save last 3 bytes of first 64bits of code we'll replace
+		uint64_t currentFirst64BitsOfCode = *((uint64_t *)code);
+		currentFirst64BitsOfCode = OSSwapInt64(currentFirst64BitsOfCode); // back to memory representation
+		currentFirst64BitsOfCode &= 0x0000000000FFFFFFLL; 
+		
+		// keep only last 3 instruction bytes, first 5 will be replaced by JMP instr
+		*newInstruction &= 0xFFFFFFFFFF000000LL; // clear last 3 bytes
+		*newInstruction |= (currentFirst64BitsOfCode & 0x0000000000FFFFFFLL); // set last 3 bytes
+	}
+
+	return allInstructionsKnown;
+}
+
+	static void
+fixupInstructions(
+    void		*originalFunction,
+    void		*escapeIsland,
+    void		*instructionsToFix,
+	int			instructionCount,
+	uint8_t		*instructionSizes )
+{
+	void *initialOriginalFunction = originalFunction;
+	int	index, fixed_size, code_size = 0;
+	for (index = 0;index < instructionCount;index += 1)
+		code_size += instructionSizes[index];
+
+#ifdef DEBUG_DISASM
+	void *initialInstructionsToFix = instructionsToFix;
+	fprintf(stderr, "BEFORE FIXING:\n");
+	dump16Bytes(initialOriginalFunction);
+	dump16Bytes(initialInstructionsToFix);
+#endif  // DEBUG_DISASM
+
+	for (index = 0;index < instructionCount;index += 1)
+	{
+                fixed_size = instructionSizes[index];
+		if ((*(uint8_t*)instructionsToFix == 0xE9) || // 32-bit jump relative
+		    (*(uint8_t*)instructionsToFix == 0xE8))   // 32-bit call relative
+		{
+			uint32_t offset = (uintptr_t)originalFunction - (uintptr_t)escapeIsland;
+			uint32_t *jumpOffsetPtr = (uint32_t*)((uintptr_t)instructionsToFix + 1);
+			*jumpOffsetPtr += offset;
+		}
+		if ((*(uint8_t*)instructionsToFix == 0x74) ||  // Short jump if equal (je), 2 bytes.
+		    (*(uint8_t*)instructionsToFix == 0x77))    // Short jump if above (ja), 2 bytes.
+		{
+			// We replace a short je/ja instruction, "7P JJ", with a 32-bit near je/ja, "0F 8P WW XX YY ZZ".
+			// This is critical, otherwise the short jump will likely fall outside the original function.
+			uint32_t offset = (uintptr_t)initialOriginalFunction - (uintptr_t)escapeIsland;
+			uint32_t jumpOffset = *(uint8_t*)((uintptr_t)instructionsToFix + 1);
+			*((uint8_t*)instructionsToFix + 1) = *(uint8_t*)instructionsToFix + 0x10;
+			*(uint8_t*)instructionsToFix = 0x0F;
+			uint32_t *jumpOffsetPtr = (uint32_t*)((uintptr_t)instructionsToFix + 2 );
+			*jumpOffsetPtr = offset + jumpOffset;
+			fixed_size = 6;
+                }
+		
+		originalFunction = (void*)((uintptr_t)originalFunction + instructionSizes[index]);
+		escapeIsland = (void*)((uintptr_t)escapeIsland + instructionSizes[index]);
+		instructionsToFix = (void*)((uintptr_t)instructionsToFix + fixed_size);
+
+		// Expanding short instructions into longer ones may overwrite the next instructions,
+		// so we must restore them.
+		code_size -= fixed_size;
+		if ((code_size > 0) && (fixed_size != instructionSizes[index])) {
+			bcopy(originalFunction, instructionsToFix, code_size);
+		}
+	}
+#ifdef DEBUG_DISASM
+	fprintf(stderr, "AFTER_FIXING:\n");
+	dump16Bytes(initialOriginalFunction);
+	dump16Bytes(initialInstructionsToFix);
+#endif  // DEBUG_DISASM
+}
+
+#ifdef DEBUG_DISASM
+#define HEX_DIGIT(x) ((((x) % 16) < 10) ? ('0' + ((x) % 16)) : ('A' + ((x) % 16 - 10)))
+
+	static void
+dump16Bytes(
+	void 	*ptr) {
+	int i;
+	char buf[3];
+	uint8_t *bytes = (uint8_t*)ptr;
+	for (i = 0; i < 16; i++) {
+		buf[0] = HEX_DIGIT(bytes[i] / 16);
+		buf[1] = HEX_DIGIT(bytes[i] % 16);
+		buf[2] = ' ';
+		write(2, buf, 3);
+	}
+	write(2, "\n", 1);
+}
+#endif  // DEBUG_DISASM
+#endif
+
+#if defined(__i386__)
+__asm(
+			".text;"
+			".align 2, 0x90;"
+			"_atomic_mov64:;"
+			"	pushl %ebp;"
+			"	movl %esp, %ebp;"
+			"	pushl %esi;"
+			"	pushl %ebx;"
+			"	pushl %ecx;"
+			"	pushl %eax;"
+			"	pushl %edx;"
+	
+			// atomic push of value to an address
+			// we use cmpxchg8b, which compares content of an address with 
+			// edx:eax. If they are equal, it atomically puts 64bit value 
+			// ecx:ebx in address. 
+			// We thus put contents of address in edx:eax to force ecx:ebx
+			// in address
+			"	mov		8(%ebp), %esi;"  // esi contains target address
+			"	mov		12(%ebp), %ebx;"
+			"	mov		16(%ebp), %ecx;" // ecx:ebx now contains value to put in target address
+			"	mov		(%esi), %eax;"
+			"	mov		4(%esi), %edx;"  // edx:eax now contains value currently contained in target address
+			"	lock; cmpxchg8b	(%esi);" // atomic move.
+			
+			// restore registers
+			"	popl %edx;"
+			"	popl %eax;"
+			"	popl %ecx;"
+			"	popl %ebx;"
+			"	popl %esi;"
+			"	popl %ebp;"
+			"	ret"
+);
+#elif defined(__x86_64__)
+void atomic_mov64(
+		uint64_t *targetAddress,
+		uint64_t value )
+{
+    *targetAddress = value;
+}
+#endif
+#endif
+#endif  // __APPLE__
--- /dev/null	2012-11-16 10:24:58.000000000 -0500
+++ libsanitizer/interception/mach_override/mach_override.h	2012-11-16 10:26:52.000000000 -0500
@@ -0,0 +1,140 @@
+/*******************************************************************************
+	mach_override.h
+		Copyright (c) 2003-2009 Jonathan 'Wolf' Rentzsch: <http://rentzsch.com>
+		Some rights reserved: <http://opensource.org/licenses/mit-license.php>
+
+	***************************************************************************/
+
+/***************************************************************************//**
+	@mainpage	mach_override
+	@author		Jonathan 'Wolf' Rentzsch: <http://rentzsch.com>
+	
+	This package, coded in C to the Mach API, allows you to override ("patch")
+	program- and system-supplied functions at runtime. You can fully replace
+	functions with your implementations, or merely head- or tail-patch the
+	original implementations.
+	
+	Use it by #include'ing mach_override.h from your .c, .m or .mm file(s).
+	
+	@todo	Discontinue use of Carbon's MakeDataExecutable() and
+			CompareAndSwap() calls and start using the Mach equivalents, if they
+			exist. If they don't, write them and roll them in. That way, this
+			code will be pure Mach, which will make it easier to use everywhere.
+			Update: MakeDataExecutable() has been replaced by
+			msync(MS_INVALIDATE). There is an OSCompareAndSwap in libkern, but
+			I'm currently unsure if I can link against it. May have to roll in
+			my own version...
+	@todo	Stop using an entire 4K high-allocated VM page per 28-byte escape
+			branch island. Done right, this will dramatically speed up escape
+			island allocations when they number over 250. Then again, if you're
+			overriding more than 250 functions, maybe speed isn't your main
+			concern...
+	@todo	Add detection of: b, bl, bla, bc, bcl, bcla, bcctrl, bclrl
+			first-instructions. Initially, we should refuse to override
+			functions beginning with these instructions. Eventually, we should
+			dynamically rewrite them to make them position-independent.
+	@todo	Write mach_unoverride(), which would remove an override placed on a
+			function. Must be multiple-override aware, which means an almost
+			complete rewrite under the covers, because the target address can't
+			be spread across two load instructions like it is now since it will
+			need to be atomically updatable.
+	@todo	Add non-reentry variants of overrides to test_mach_override.
+
+	***************************************************************************/
+
+#ifdef __APPLE__
+
+#ifndef		_mach_override_
+#define		_mach_override_
+
+#include <sys/types.h>
+#include <mach/error.h>
+
+#ifdef	__cplusplus
+	extern	"C"	{
+#endif
+
+/**
+	Returned if the function to be overridden begins with a 'mfctr' instruction.
+*/
+#define	err_cannot_override	(err_local|1)
+
+/************************************************************************************//**
+	Dynamically overrides the function implementation referenced by
+	originalFunctionAddress with the implementation pointed to by overrideFunctionAddress.
+	Optionally returns a pointer to a "reentry island" which, if jumped to, will resume
+	the original implementation.
+	
+	@param	originalFunctionAddress			->	Required address of the function to
+												override (with overrideFunctionAddress).
+	@param	overrideFunctionAddress			->	Required address to the overriding
+												function.
+	@param	originalFunctionReentryIsland	<-	Optional pointer to pointer to the
+												reentry island. Can be NULL.
+	@result									<-	err_cannot_override if the original
+												function's implementation begins with
+												the 'mfctr' instruction.
+
+	************************************************************************************/
+
+// We're prefixing mach_override_ptr() with "__asan_" to avoid name conflicts with other
+// mach_override_ptr() implementations that may appear in the client program.
+    mach_error_t
+__asan_mach_override_ptr(
+	void *originalFunctionAddress,
+    const void *overrideFunctionAddress,
+    void **originalFunctionReentryIsland );
+
+// Allow custom allocation and deallocation routines to be used with mach_override_ptr().
+// This should help speed things up on x86_64.
+typedef mach_error_t island_malloc( void **ptr, size_t size, void *hint );
+typedef mach_error_t island_free( void *ptr );
+
+    mach_error_t
+__asan_mach_override_ptr_custom(
+	void *originalFunctionAddress,
+    const void *overrideFunctionAddress,
+    void **originalFunctionReentryIsland,
+    island_malloc *alloc,
+    island_free *dealloc );
+
+/************************************************************************************//**
+	
+
+	************************************************************************************/
+ 
+#ifdef	__cplusplus
+
+#define MACH_OVERRIDE( ORIGINAL_FUNCTION_RETURN_TYPE, ORIGINAL_FUNCTION_NAME, ORIGINAL_FUNCTION_ARGS, ERR )			\
+	{																												\
+		static ORIGINAL_FUNCTION_RETURN_TYPE (*ORIGINAL_FUNCTION_NAME##_reenter)ORIGINAL_FUNCTION_ARGS;				\
+		static bool ORIGINAL_FUNCTION_NAME##_overriden = false;														\
+		class mach_override_class__##ORIGINAL_FUNCTION_NAME {														\
+		public:																										\
+			static kern_return_t override(void *originalFunctionPtr) {												\
+				kern_return_t result = err_none;																	\
+				if (!ORIGINAL_FUNCTION_NAME##_overriden) {															\
+					ORIGINAL_FUNCTION_NAME##_overriden = true;														\
+					result = __asan_mach_override_ptr( (void*)originalFunctionPtr,									\
+												(void*)mach_override_class__##ORIGINAL_FUNCTION_NAME::replacement,	\
+												(void**)&ORIGINAL_FUNCTION_NAME##_reenter );						\
+				}																									\
+				return result;																						\
+			}																										\
+			static ORIGINAL_FUNCTION_RETURN_TYPE replacement ORIGINAL_FUNCTION_ARGS {
+
+#define END_MACH_OVERRIDE( ORIGINAL_FUNCTION_NAME )																	\
+			}																										\
+		};																											\
+																													\
+		err = mach_override_class__##ORIGINAL_FUNCTION_NAME::override((void*)ORIGINAL_FUNCTION_NAME);				\
+	}
+ 
+#endif
+
+#ifdef	__cplusplus
+	}
+#endif
+#endif	//	_mach_override_
+
+#endif  // __APPLE__
Index: libsanitizer/configure.ac
===================================================================
--- libsanitizer/configure.ac	(revision 193563)
+++ libsanitizer/configure.ac	(working copy)
@@ -73,6 +73,12 @@ else
   multilib_arg=
 fi
 
+case "$host" in
+  *-*-darwin*) MACH_OVERRIDE=true ;;
+  *) MACH_OVERRIDE=false ;;
+esac
+AM_CONDITIONAL(USING_MACH_OVERRIDE, $MACH_OVERRIDE)
+
 AC_CONFIG_FILES([Makefile])
 
 AC_CONFIG_FILES(AC_FOREACH([DIR], [interception sanitizer_common asan], [DIR/Makefile ]),
Index: libsanitizer/interception/Makefile.am
===================================================================
--- libsanitizer/interception/Makefile.am	(revision 193563)
+++ libsanitizer/interception/Makefile.am	(working copy)
@@ -14,7 +14,11 @@ interception_files = \
         interception_mac.cc \
         interception_win.cc
 
-libinterception_la_SOURCES = $(interception_files) 
+if USING_MACH_OVERRIDE
+libinterception_la_SOURCES = $(interception_files) mach_override/mach_override.c
+else
+libinterception_la_SOURCES = $(interception_files)
+endif
 
 # Work around what appears to be a GNU make bug handling MAKEFLAGS
 # values defined in terms of make variables, as is the case for CC and
Index: libsanitizer/configure.tgt
===================================================================
--- libsanitizer/configure.tgt	(revision 193563)
+++ libsanitizer/configure.tgt	(working copy)
@@ -22,6 +22,8 @@
 case "${target}" in
   x86_64-*-linux* | i?86-*-linux* | sparc*-*-linux*)
 	;;
+  x86_64-*-darwin* | i?86-*-darwin*)
+	;;
   *)
 	UNSUPPORTED=1
 	;;
Index: gcc/config/darwin.h
===================================================================
--- gcc/config/darwin.h	(revision 193563)
+++ gcc/config/darwin.h	(working copy)
@@ -180,6 +180,9 @@ extern GTY(()) int darwin_ms_struct;
     %{L*} %(link_libgcc) %o %{fprofile-arcs|fprofile-generate*|coverage:-lgcov} \
     %{fopenmp|ftree-parallelize-loops=*: \
       %{static|static-libgcc|static-libstdc++|static-libgfortran: libgomp.a%s; : -lgomp } } \
+    %{faddress-sanitizer: \
+      %{static|static-libgcc|static-libgfortran: -framework CoreFoundation -lstdc++ libasan.a%s; \
+      static-libstdc++: -framework CoreFoundation libstdc++.a%s libasan.a%s; : -framework CoreFoundation -lasan } } \
     %{fgnu-tm: \
       %{static|static-libgcc|static-libstdc++|static-libgfortran: libitm.a%s; : -litm } } \
     %{!nostdlib:%{!nodefaultlibs:\

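For readers following the je/ja fix-up in mach_override.c above, here is a
minimal sketch, not part of the patch, of how a short conditional jump
("7P JJ") can be widened into its 32-bit form ("0F 8P WW XX YY ZZ") so that
it still reaches the same target after being copied elsewhere. The helper
name expand_short_jcc and its address parameters are hypothetical. The sketch
assumes the 8-bit displacement is signed and that the new displacement is
measured from the end of the 6-byte instruction, as the x86 encoding
specifies; it does not claim to reproduce the arithmetic used in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from mach_override): expand a short Jcc
 * ("7x rel8", 2 bytes) that lived at old_addr into a near Jcc
 * ("0F 8x rel32", 6 bytes) that will execute at new_addr and still
 * branch to the same absolute target.
 * Returns the number of bytes written to out, or 0 if insn is not a
 * short Jcc or the target is out of rel32 range. */
static size_t expand_short_jcc(const uint8_t *insn, uint64_t old_addr,
                               uint64_t new_addr, uint8_t out[6])
{
    if ((insn[0] & 0xF0) != 0x70)
        return 0;                          /* not a short conditional jump */

    /* rel8 is signed and relative to the end of the 2-byte instruction. */
    int8_t rel8 = (int8_t)insn[1];
    uint64_t target = old_addr + 2 + (uint64_t)(int64_t)rel8;

    /* rel32 is relative to the end of the 6-byte instruction. */
    int64_t diff = (int64_t)(target - (new_addr + 6));
    if (diff < INT32_MIN || diff > INT32_MAX)
        return 0;                          /* target not reachable */

    uint32_t rel32 = (uint32_t)(int32_t)diff;
    out[0] = 0x0F;
    out[1] = (uint8_t)(insn[0] + 0x10);    /* 0x74 -> 0x84, 0x77 -> 0x87 */
    out[2] = (uint8_t)(rel32 & 0xFF);      /* little-endian displacement */
    out[3] = (uint8_t)((rel32 >> 8) & 0xFF);
    out[4] = (uint8_t)((rel32 >> 16) & 0xFF);
    out[5] = (uint8_t)((rel32 >> 24) & 0xFF);
    return 6;
}
```

A caller would pass the two original instruction bytes, the address they
were copied from, and the address of the slot they are being copied to, and
would reserve six bytes in the destination instead of two.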

* Re: [PATCH 00/13] Request to merge Address Sanitizer in
  2012-11-16  8:27     ` Dodji Seketeli
  2012-11-16 14:03       ` Jack Howarth
  2012-11-16 15:57       ` Jack Howarth
@ 2012-11-16 16:56       ` Alexander Potapenko
  2012-11-16 17:06         ` Jack Howarth
  2 siblings, 1 reply; 80+ messages in thread
From: Alexander Potapenko @ 2012-11-16 16:56 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: Konstantin Serebryany, Jack Howarth, gcc-patches, Diego Novillo,
	jakub, Wei Mi, David Li, mikestump

>> Also, Alexander Potapenko is the best person to ask about asan-darwin.
>
> .... here.
>
>> Maybe we can add him to the list of sanitizer maintainers?
>
> Seconded.  At least for libsanitizer/Darwin.
>
> Cheers.

I can take this, but I'll be busy for the next several days
(until the middle of next week), so I won't be able to merge any
patches from LLVM into GCC.
Also, I'm not a GCC committer yet, so I guess someone will need to nominate me.


* Re: [PATCH 00/13] Request to merge Address Sanitizer in
  2012-11-16 16:56       ` Alexander Potapenko
@ 2012-11-16 17:06         ` Jack Howarth
  0 siblings, 0 replies; 80+ messages in thread
From: Jack Howarth @ 2012-11-16 17:06 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Dodji Seketeli, Konstantin Serebryany, gcc-patches,
	Diego Novillo, jakub, Wei Mi, David Li, mikestump

On Fri, Nov 16, 2012 at 08:55:52PM +0400, Alexander Potapenko wrote:
> >> Also, Alexander Potapenko is the best person to ask about asan-darwin.
> >
> > .... here.
> >
> >> Maybe we can add him to the list of sanitizer maintainers?
> >
> > Seconded.  At least for libsanitizer/Darwin.
> >
> > Cheers.
> 
> I can take this, but I'll be busy within the several upcoming days
> (till mid-next-week), so I won't be able to merge any patches from
> LLVM into GCC.
> Also, I'm not a GCC committer yet, and I guess someone will need to nominate me.

Hopefully either Jakub or Dodji can handle the initial import of the
mach_override files for now. I'd rather not have to keep updating the
patches in the meantime to fend off bit-rot.
        Jack


end of thread, other threads:[~2012-11-16 17:06 UTC | newest]

Thread overview: 80+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-11-01 19:53 [PATCH 00/13] Request to merge Address Sanitizer in dodji
2012-11-01 19:53 ` [PATCH 08/13] Fix a couple of ICEs dodji
2012-11-01 19:53 ` [PATCH 10/13] Make build_check_stmt accept an SSA_NAME for its base dodji
2012-11-01 19:53 ` [PATCH 09/13] Don't forget to protect 32 bytes aligned global variables dodji
2012-11-01 19:53 ` [PATCH 06/13] Implement protection of stack variables dodji
     [not found]   ` <CAGQ9bdweH8Pn=8vLTNa8FSzAh92OYrWScxK78n9znCodADJUvw@mail.gmail.com>
2012-11-02  4:35     ` Xinliang David Li
2012-11-02 15:25       ` Dodji Seketeli
2012-11-02 14:44     ` Dodji Seketeli
     [not found]       ` <CAGQ9bdxQG3i=BrSYmaN-ssdv4omW6F5VTg50viskKNcYrF-8BQ@mail.gmail.com>
2012-11-02 16:02         ` Dodji Seketeli
2012-11-01 19:53 ` [PATCH 02/13] Rename tree-asan.[ch] to asan.[ch] dodji
2012-11-01 21:54   ` Joseph S. Myers
2012-11-02 22:44     ` Dodji Seketeli
2012-11-01 19:53 ` [PATCH 01/13] Initial import of asan from the Google branch dodji
2012-11-01 19:53 ` [PATCH 03/13] Initial asan cleanups dodji
2012-11-01 19:53 ` [PATCH 11/13] Factorize condition insertion code out of build_check_stmt dodji
2012-11-01 19:53 ` [PATCH 07/13] Implement protection of global variables dodji
2012-11-01 19:53 ` [PATCH 05/13] Allow asan at -O0 dodji
2012-11-01 19:53 ` [PATCH 12/13] Instrument built-in memory access function calls dodji
2012-11-01 19:54 ` [PATCH 04/13] Emit GIMPLE directly instead of gimplifying GENERIC dodji
2012-11-02 22:53 ` [PATCH 00/13] Request to merge Address Sanitizer in Dodji Seketeli
2012-11-02 22:56   ` [PATCH 01/10] Initial import of asan from the Google branch into trunk Dodji Seketeli
2012-11-06 17:04     ` Diego Novillo
2012-11-09 13:14     ` Tobias Burnus
2012-11-09 13:58       ` Jakub Jelinek
2012-11-09 16:53         ` Xinliang David Li
2012-11-09 17:13         ` Tobias Burnus
2012-11-09 17:18       ` Wei Mi
2012-11-12 11:09       ` [PATCH 03/11] Emit GIMPLE directly instead of gimplifying GENERIC Dodji Seketeli
2012-11-12 11:20       ` [PATCH 01/10] Initial import of asan from the Google branch into trunk Dodji Seketeli
2012-11-02 22:57   ` [PATCH 02/10] Initial asan cleanups Dodji Seketeli
2012-11-06 17:04     ` Diego Novillo
2012-11-12 11:12       ` Dodji Seketeli
2012-11-02 22:58   ` [PATCH 03/10] Emit GIMPLE directly instead of gimplifying GENERIC Dodji Seketeli
2012-11-06 17:08     ` Diego Novillo
2012-11-02 22:59   ` [PATCH 04/10] Allow asan at -O0 Dodji Seketeli
2012-11-06 17:12     ` Diego Novillo
2012-11-02 23:00   ` [PATCH 05/10] Implement protection of stack variables Dodji Seketeli
2012-11-06 17:22     ` Diego Novillo
2012-11-12 11:31       ` Dodji Seketeli
2012-11-12 11:51         ` Jakub Jelinek
2012-11-12 16:08           ` Dodji Seketeli
2012-11-02 23:01   ` [PATCH 06/10] Implement protection of global variables Dodji Seketeli
2012-11-06 17:27     ` Diego Novillo
2012-11-12 11:32       ` Dodji Seketeli
2012-11-02 23:02   ` [PATCH 07/10] Make build_check_stmt accept an SSA_NAME for its base Dodji Seketeli
2012-11-06 17:28     ` Diego Novillo
2012-11-02 23:03   ` [PATCH 08/10] Factorize condition insertion code out of build_check_stmt Dodji Seketeli
2012-11-05 15:50     ` Jakub Jelinek
2012-11-05 20:25       ` Dodji Seketeli
2012-11-06 17:30     ` Diego Novillo
2012-11-02 23:05   ` [PATCH 09/10] Instrument built-in memory access function calls Dodji Seketeli
2012-11-06 17:37     ` Diego Novillo
2012-11-12 11:40       ` Dodji Seketeli
2012-11-03  8:22   ` [PATCH 10/10] Import the asan runtime library into GCC tree Dodji Seketeli
     [not found]   ` <87fw4r7g8w.fsf_-_@redhat.com>
2012-11-06 17:41     ` Diego Novillo
2012-11-12 11:47       ` Dodji Seketeli
2012-11-12 18:59         ` H.J. Lu
2012-11-14 11:11           ` H.J. Lu
2012-11-14 11:42             ` H.J. Lu
2012-11-12 16:07   ` [PATCH 00/13] Request to merge Address Sanitizer in Dodji Seketeli
2012-11-12 16:21     ` Jakub Jelinek
2012-11-12 16:45       ` Tobias Burnus
2012-11-12 16:51         ` Konstantin Serebryany
2012-11-12 17:20     ` Jack Howarth
2012-11-12 17:34       ` Jack Howarth
2012-11-12 17:37         ` Tobias Burnus
2012-11-12 17:57           ` Jack Howarth
2012-11-12 17:55         ` Dodji Seketeli
2012-11-12 18:40           ` Jack Howarth
2012-11-12 20:39 ` H.J. Lu
2012-11-12 22:15   ` Ian Lance Taylor
2012-11-15 19:42 ` Jack Howarth
2012-11-15 23:42   ` Konstantin Serebryany
2012-11-16  8:27     ` Dodji Seketeli
2012-11-16 14:03       ` Jack Howarth
2012-11-16 15:57       ` Jack Howarth
2012-11-16 16:02         ` Jakub Jelinek
2012-11-16 16:47           ` Jack Howarth
2012-11-16 16:56       ` Alexander Potapenko
2012-11-16 17:06         ` Jack Howarth
