public inbox for gcc-patches@gcc.gnu.org
* [PATCH 0/3][asan] Instrument memory access builtins calls
@ 2012-10-23 13:08 Dodji Seketeli
  2012-10-23 13:11 ` [PATCH 2/3] [asan] Factorize condition insertion code out of build_check_stmt Dodji Seketeli
                   ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Dodji Seketeli @ 2012-10-23 13:08 UTC (permalink / raw)
  To: GCC Patches; +Cc: Jakub Jelinek, Diego Novillo, Xinliang David Li, Wei Mi

Hello,

The three patches following this message implement the instrumentation
of calls to memory access built-ins in AddressSanitizer, similar to
what the LLVM implementation does.

The first two patches do some factorizing that is used by the third
one.  I have split them up like this to ease the review and to ensure
that applying them one by one keeps the tree in a buildable state.

As we don't yet seem to have a test harness in the tree, I have
tested the patches by staring at the gimple output and by
bootstrapping the tree for basic sanity.

This is my first foray into gimple and middle-end land, so I guess
some apologies are due in advance for any barbarisms you might find in
these patches.

Below is the summary of the changes.

Thank you in advance.

  [asan] Make build_check_stmt accept an SSA_NAME for its base
  [asan] Factorize condition insertion code out of build_check_stmt
  [asan] Instrument built-in memory access function calls

 gcc/asan.c | 482 +++++++++++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 419 insertions(+), 63 deletions(-)

-- 
		Dodji

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 2/3] [asan] Factorize condition insertion code out of build_check_stmt
  2012-10-23 13:08 [PATCH 0/3][asan] Instrument memory access builtins calls Dodji Seketeli
@ 2012-10-23 13:11 ` Dodji Seketeli
  2012-10-23 13:27   ` Jakub Jelinek
  2012-10-23 13:11 ` [PATCH 1/3] [asan] Make build_check_stmt accept an SSA_NAME for its base Dodji Seketeli
  2012-10-23 13:21 ` [PATCH 3/3] [asan] Instrument built-in memory access function calls Dodji Seketeli
  2 siblings, 1 reply; 25+ messages in thread
From: Dodji Seketeli @ 2012-10-23 13:11 UTC (permalink / raw)
  To: GCC Patches; +Cc: Jakub Jelinek, Diego Novillo, Xinliang David Li, Wei Mi

This patch splits a new create_cond_insert_point_before_iter function
out of build_check_stmt, to be used by a later patch.

Tested by running cc1 -fasan on the test program below with and
without the patch and by inspecting the gimple output to see that
there is no change.

void
foo ()
{
  char foo[1] = {0};

  foo[0] = 1;
}

gcc/

	* asan.c (create_cond_insert_point_before_iter): Factorize out of ...
	(build_check_stmt): ... here.
---
 gcc/asan.c | 122 +++++++++++++++++++++++++++++++++++++++----------------------
 1 file changed, 78 insertions(+), 44 deletions(-)

diff --git a/gcc/asan.c b/gcc/asan.c
index e201f75..aed1a60 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -397,6 +397,77 @@ asan_init_func (void)
 #define PROB_VERY_UNLIKELY	(REG_BR_PROB_BASE / 2000 - 1)
 #define PROB_ALWAYS		(REG_BR_PROB_BASE)
 
+/* Split the current basic block and create a condition statement
+   insertion point right before the statement pointed to by ITER.
+   Return an iterator to the point at which the caller might safely
+   insert the condition statement.
+
+   THEN_BLOCK must be set to the address of an uninitialized instance
+   of basic_block.  The function will then set *THEN_BLOCK to the
+   'then block' of the condition statement to be inserted by the
+   caller.
+
+   Similarly, the function will set *FALLTHROUGH_BLOCK to the 'else
+   block' of the condition statement to be inserted by the caller.
+
+   Note that *FALLTHROUGH_BLOCK is a new block that contains the
+   statements starting from *ITER, and *THEN_BLOCK is a new empty
+   block.
+
+   *ITER is adjusted to still point to the same statement it was
+   pointing to initially.  */
+
+static gimple_stmt_iterator
+create_cond_insert_point_before_iter (gimple_stmt_iterator *iter,
+				      bool then_more_likely_p,
+				      basic_block *then_block,
+				      basic_block *fallthrough_block)
+{
+  gcc_assert (then_block != NULL && fallthrough_block != NULL);
+
+  gimple_stmt_iterator gsi = *iter;
+
+  if (!gsi_end_p (gsi))
+    gsi_prev (&gsi);
+
+  basic_block cur_bb = gsi_bb (*iter);
+
+  edge e = split_block (cur_bb, gsi_stmt (gsi));
+
+  /* Get a hold on the 'condition block', the 'then block' and the
+     'else block'.  */
+  basic_block cond_bb = e->src;
+  basic_block fallthru_bb = e->dest;
+  basic_block then_bb = create_empty_bb (cond_bb);
+
+  /* Set up the newly created 'then block'.  */
+  e = make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE);
+  int fallthrough_probability =
+    then_more_likely_p
+    ? PROB_VERY_UNLIKELY
+    : PROB_ALWAYS - PROB_VERY_UNLIKELY;
+  e->probability = PROB_ALWAYS - fallthrough_probability;
+  make_single_succ_edge (then_bb, fallthru_bb, EDGE_FALLTHRU);
+
+  /* Set up the fallthrough basic block.  */
+  e = find_edge (cond_bb, fallthru_bb);
+  e->flags = EDGE_FALSE_VALUE;
+  e->count = cond_bb->count;
+  e->probability = fallthrough_probability;
+
+  /* Update dominance info for the newly created then_bb; note that
+     fallthru_bb's dominance info has already been updated by
+     split_block.  */
+  if (dom_info_available_p (CDI_DOMINATORS))
+    set_immediate_dominator (CDI_DOMINATORS, then_bb, cond_bb);
+
+  *then_block = then_bb;
+  *fallthrough_block = fallthru_bb;
+  *iter = gsi_start_bb (fallthru_bb);
+
+  return gsi_last_bb (cond_bb);
+}
+
 /* Instrument the memory access instruction BASE.  Insert new
    statements before ITER.
 
@@ -411,8 +482,7 @@ build_check_stmt (tree base, gimple_stmt_iterator *iter,
 		  int size_in_bytes)
 {
   gimple_stmt_iterator gsi;
-  basic_block cond_bb, then_bb, else_bb;
-  edge e;
+  basic_block then_bb, else_bb;
   tree t, base_addr, shadow;
   gimple g;
   tree shadow_ptr_type = shadow_ptr_types[size_in_bytes == 16 ? 1 : 0];
@@ -421,51 +491,15 @@ build_check_stmt (tree base, gimple_stmt_iterator *iter,
     = build_nonstandard_integer_type (TYPE_PRECISION (TREE_TYPE (base)), 1);
   tree base_ssa = base;
 
-  /* We first need to split the current basic block, and start altering
-     the CFG.  This allows us to insert the statements we're about to
-     construct into the right basic blocks.  */
-
-  cond_bb = gimple_bb (gsi_stmt (*iter));
-  gsi = *iter;
-  gsi_prev (&gsi);
-  if (!gsi_end_p (gsi))
-    e = split_block (cond_bb, gsi_stmt (gsi));
-  else
-    e = split_block_after_labels (cond_bb);
-  cond_bb = e->src;
-  else_bb = e->dest;
-
-  /* A recap at this point: else_bb is the basic block at whose head
-     is the gimple statement for which this check expression is being
-     built.  cond_bb is the (possibly new, synthetic) basic block the
-     end of which will contain the cache-lookup code, and a
-     conditional that jumps to the cache-miss code or, much more
-     likely, over to else_bb.  */
-
-  /* Create the bb that contains the crash block.  */
-  then_bb = create_empty_bb (cond_bb);
-  e = make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE);
-  e->probability = PROB_VERY_UNLIKELY;
-  make_single_succ_edge (then_bb, else_bb, EDGE_FALLTHRU);
-
-  /* Mark the pseudo-fallthrough edge from cond_bb to else_bb.  */
-  e = find_edge (cond_bb, else_bb);
-  e->flags = EDGE_FALSE_VALUE;
-  e->count = cond_bb->count;
-  e->probability = PROB_ALWAYS - PROB_VERY_UNLIKELY;
-
-  /* Update dominance info.  Note that bb_join's data was
-     updated by split_block.  */
-  if (dom_info_available_p (CDI_DOMINATORS))
-    {
-      set_immediate_dominator (CDI_DOMINATORS, then_bb, cond_bb);
-      set_immediate_dominator (CDI_DOMINATORS, else_bb, cond_bb);
-    }
+  /* Get an iterator on the point where we can add the condition
+     statement for the instrumentation.  */
+  gsi = create_cond_insert_point_before_iter (iter,
+					      /*then_more_likely_p=*/false,
+					      &then_bb,
+					      &else_bb);
 
   base = unshare_expr (base);
 
-  gsi = gsi_last_bb (cond_bb);
-
   /* BASE can already be an SSA_NAME; in that case, do not create a
      new SSA_NAME for it.  */
   if (TREE_CODE (base) != SSA_NAME)
-- 
1.7.11.7



-- 
		Dodji


* [PATCH 1/3] [asan] Make build_check_stmt accept an SSA_NAME for its base
  2012-10-23 13:08 [PATCH 0/3][asan] Instrument memory access builtins calls Dodji Seketeli
  2012-10-23 13:11 ` [PATCH 2/3] [asan] Factorize condition insertion code out of build_check_stmt Dodji Seketeli
@ 2012-10-23 13:11 ` Dodji Seketeli
  2012-10-23 13:14   ` Jakub Jelinek
  2012-10-23 13:21 ` [PATCH 3/3] [asan] Instrument built-in memory access function calls Dodji Seketeli
  2 siblings, 1 reply; 25+ messages in thread
From: Dodji Seketeli @ 2012-10-23 13:11 UTC (permalink / raw)
  To: GCC Patches; +Cc: Jakub Jelinek, Diego Novillo, Xinliang David Li, Wei Mi

This patch makes build_check_stmt accept its memory access parameter
as an SSA name.  This is useful for a subsequent patch that will
re-use it.

Tested by running cc1 -fasan on the program below with and without the
patch and inspecting the gimple output to see that there is no change.

void
foo ()
{
  char foo[1] = {0};

  foo[0] = 1;
}

gcc/
	* asan.c (build_check_stmt): Accept the memory access to be
	represented by an SSA_NAME.
---
 gcc/asan.c | 36 +++++++++++++++++++++++-------------
 1 file changed, 23 insertions(+), 13 deletions(-)

diff --git a/gcc/asan.c b/gcc/asan.c
index 9464836..e201f75 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -397,16 +397,18 @@ asan_init_func (void)
 #define PROB_VERY_UNLIKELY	(REG_BR_PROB_BASE / 2000 - 1)
 #define PROB_ALWAYS		(REG_BR_PROB_BASE)
 
-/* Instrument the memory access instruction BASE.
-   Insert new statements before ITER.
-   LOCATION is source code location.
-   IS_STORE is either 1 (for a store) or 0 (for a load).
+/* Instrument the memory access instruction BASE.  Insert new
+   statements before ITER.
+
+   Note that the memory access represented by BASE can be either an
+   SSA_NAME, or a non-SSA expression.  LOCATION is the source code
+   location.  IS_STORE is TRUE for a store, FALSE for a load.
    SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
 
 static void
-build_check_stmt (tree base,
-                  gimple_stmt_iterator *iter,
-                  location_t location, bool is_store, int size_in_bytes)
+build_check_stmt (tree base, gimple_stmt_iterator *iter,
+                  location_t location, bool is_store,
+		  int size_in_bytes)
 {
   gimple_stmt_iterator gsi;
   basic_block cond_bb, then_bb, else_bb;
@@ -417,6 +419,7 @@ build_check_stmt (tree base,
   tree shadow_type = TREE_TYPE (shadow_ptr_type);
   tree uintptr_type
     = build_nonstandard_integer_type (TYPE_PRECISION (TREE_TYPE (base)), 1);
+  tree base_ssa = base;
 
   /* We first need to split the current basic block, and start altering
      the CFG.  This allows us to insert the statements we're about to
@@ -462,15 +465,22 @@ build_check_stmt (tree base,
   base = unshare_expr (base);
 
   gsi = gsi_last_bb (cond_bb);
-  g = gimple_build_assign_with_ops (TREE_CODE (base),
-				    make_ssa_name (TREE_TYPE (base), NULL),
-				    base, NULL_TREE);
-  gimple_set_location (g, location);
-  gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+
+  /* BASE can already be an SSA_NAME; in that case, do not create a
+     new SSA_NAME for it.  */
+  if (TREE_CODE (base) != SSA_NAME)
+    {
+      g = gimple_build_assign_with_ops (TREE_CODE (base),
+					make_ssa_name (TREE_TYPE (base), NULL),
+					base, NULL_TREE);
+      gimple_set_location (g, location);
+      gsi_insert_after (&gsi, g, GSI_NEW_STMT);
+      base_ssa = gimple_assign_lhs (g);
+    }
 
   g = gimple_build_assign_with_ops (NOP_EXPR,
 				    make_ssa_name (uintptr_type, NULL),
-				    gimple_assign_lhs (g), NULL_TREE);
+				    base_ssa, NULL_TREE);
   gimple_set_location (g, location);
   gsi_insert_after (&gsi, g, GSI_NEW_STMT);
   base_addr = gimple_assign_lhs (g);
-- 
		Dodji


* Re: [PATCH 1/3] [asan] Make build_check_stmt accept an SSA_NAME for its base
  2012-10-23 13:11 ` [PATCH 1/3] [asan] Make build_check_stmt accept an SSA_NAME for its base Dodji Seketeli
@ 2012-10-23 13:14   ` Jakub Jelinek
  0 siblings, 0 replies; 25+ messages in thread
From: Jakub Jelinek @ 2012-10-23 13:14 UTC (permalink / raw)
  To: Dodji Seketeli; +Cc: GCC Patches, Diego Novillo, Xinliang David Li, Wei Mi

On Tue, Oct 23, 2012 at 03:07:14PM +0200, Dodji Seketeli wrote:
> 	* asan.c (build_check_stmt): Accept the memory access to be
> 	represented by an SSA_NAME.

This is ok for asan, thanks.

	Jakub


* [PATCH 3/3] [asan] Instrument built-in memory access function calls
  2012-10-23 13:08 [PATCH 0/3][asan] Instrument memory access builtins calls Dodji Seketeli
  2012-10-23 13:11 ` [PATCH 2/3] [asan] Factorize condition insertion code out of build_check_stmt Dodji Seketeli
  2012-10-23 13:11 ` [PATCH 1/3] [asan] Make build_check_stmt accept an SSA_NAME for its base Dodji Seketeli
@ 2012-10-23 13:21 ` Dodji Seketeli
  2012-10-23 14:12   ` Jakub Jelinek
                     ` (2 more replies)
  2 siblings, 3 replies; 25+ messages in thread
From: Dodji Seketeli @ 2012-10-23 13:21 UTC (permalink / raw)
  To: GCC Patches; +Cc: Jakub Jelinek, Diego Novillo, Xinliang David Li, Wei Mi

This patch instruments many memory access patterns through builtins.

Basically, for a call like:

     __builtin_memset (from, 0, n_bytes);

the patch would only instrument the accesses at the beginning and at
the end of the memory region [from, from + n_bytes].  This is the
strategy used by the llvm implementation of asan.

This instrumentation is done for all the memory access built-in
functions that expose a well-specified memory region -- one that
explicitly states the number of bytes accessed in the region.

Tested by running cc1 -fasan on variations of simple programs like:

    void
    foo ()
    {
      char foo[1] = {0};

      __builtin_memset (foo, 0, 2);
    }

and by staring at the gimple output.

gcc/
	* asan.c (insert_if_then_before_iter)
	(instrument_mem_region_access)
	(maybe_instrument_builtin_call, maybe_instrument_call): New static
	functions.
	(instrument_assignment): Factorize from ...
	(transform_statements): ... here.  Use maybe_instrument_call to
	instrument (builtin) function calls as well.
---
 gcc/asan.c | 328 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 320 insertions(+), 8 deletions(-)

diff --git a/gcc/asan.c b/gcc/asan.c
index aed1a60..a8e3827 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -468,6 +468,40 @@ create_cond_insert_point_before_iter (gimple_stmt_iterator *iter,
   return gsi_last_bb (cond_bb);
 }
 
+/* Insert an if condition followed by a 'then block' right before the
+   statement pointed to by ITER.  The fallthrough block -- which is the
+   else block of the condition as well as the destination of the
+   outgoing edge of the 'then block' -- starts with the statement
+   pointed to by ITER.
+
+   COND is the condition of the if.
+
+   If THEN_MORE_LIKELY_P is true, the probability of the edge to the
+   'then block' is higher than the probability of the edge to the
+   fallthrough block.
+
+   Upon completion of the function, *THEN_BB is set to the newly
+   inserted 'then block' and similarly, *FALLTHROUGH_BB is set to the
+   fallthrough block.
+
+   *ITER is adjusted to still point to the same statement it was
+   pointing to initially.  */
+
+static void
+insert_if_then_before_iter (gimple cond,
+			    gimple_stmt_iterator *iter,
+			    bool then_more_likely_p,
+			    basic_block *then_bb,
+			    basic_block *fallthrough_bb)
+{
+  gimple_stmt_iterator cond_insert_point =
+    create_cond_insert_point_before_iter (iter,
+					  then_more_likely_p,
+					  then_bb,
+					  fallthrough_bb);
+  gsi_insert_after (&cond_insert_point, cond, GSI_NEW_STMT);
+}
+
 /* Instrument the memory access instruction BASE.  Insert new
    statements before ITER.
 
@@ -628,7 +662,7 @@ build_check_stmt (tree base, gimple_stmt_iterator *iter,
 
 /* If T represents a memory access, add instrumentation code before ITER.
    LOCATION is source code location.
-   IS_STORE is either 1 (for a store) or 0 (for a load).  */
+   IS_STORE is either TRUE (for a store) or FALSE (for a load).  */
 
 static void
 instrument_derefs (gimple_stmt_iterator *iter, tree t,
@@ -670,6 +704,285 @@ instrument_derefs (gimple_stmt_iterator *iter, tree t,
   build_check_stmt (base, iter, location, is_store, size_in_bytes);
 }
 
+/* Instrument an access to a contiguous memory region that starts at
+   the address pointed to by BASE, over a length of LEN (expressed in
+   the sizeof (*BASE) bytes).  ITER points to the instruction
+   before which the instrumentation instructions must be inserted.
+   LOCATION is the source location that the instrumentation
+   instructions must have.  If IS_STORE is true, then the memory
+   access is a store; otherwise, it's a load.  */
+
+static void
+instrument_mem_region_access (tree base, tree len,
+			      gimple_stmt_iterator *iter,
+			      location_t location, bool is_store)
+{
+  if (integer_zerop (len))
+    return;
+
+  gimple_stmt_iterator gsi = *iter;
+  tree pointed_to_type = TREE_TYPE (TREE_TYPE (base));
+
+  if (!is_gimple_constant (len))
+    {
+      /* So, the length of the memory area to asan-protect is
+	 non-constant.  Let's guard the generated instrumentation code
+	 like:
+
+	 if (len != 0)
+	   {
+	     //asan instrumentation code goes here.
+           }
+	   // fallthrough instructions, starting with *ITER.  */
+
+      basic_block fallthrough_bb, then_bb;
+      gimple g = gimple_build_cond (NE_EXPR,
+				    len,
+				    build_int_cst (TREE_TYPE (len), 0),
+				    NULL_TREE, NULL_TREE);
+      gimple_set_location (g, location);
+      insert_if_then_before_iter (g, iter, /*then_more_likely_p=*/true,
+				  &then_bb, &fallthrough_bb);
+      /* Note that fallthrough_bb starts with the statement that was
+	 pointed to by ITER.  */
+
+      /* The 'then block' of the 'if (len != 0)' condition is where
+	 we'll generate the asan instrumentation code now.  */
+      gsi = gsi_start_bb (then_bb);
+
+      /* Instrument the beginning of the memory region to be accessed,
+	 and arrange for the rest of the instrumentation code to be
+	 inserted in the then block *after* the current gsi.  */
+      build_check_stmt (base, &gsi, location, is_store,
+			int_size_in_bytes (pointed_to_type));
+      gsi = gsi_last_bb (then_bb);
+    }
+  else
+    {
+      /* Instrument the beginning of the memory region to be
+	 accessed.  */
+      build_check_stmt (base, iter, location, is_store,
+			int_size_in_bytes (pointed_to_type));
+      gsi = *iter;
+    }
+
+  /* We want to instrument the access at the end of the memory region,
+     which is at (base + len - 1).  */
+
+  /* offset = len - 1;  */
+  len = unshare_expr (len);
+  gimple offset =
+    gimple_build_assign_with_ops (TREE_CODE (len),
+				  make_ssa_name (TREE_TYPE (len), NULL),
+				  len, NULL);
+  gimple_set_location (offset, location);
+  gsi_insert_before (&gsi, offset, GSI_NEW_STMT);
+
+  offset =
+    gimple_build_assign_with_ops (MINUS_EXPR,
+				  make_ssa_name (size_type_node, NULL),
+				  gimple_assign_lhs (offset),
+				  build_int_cst (size_type_node, 1));
+  gimple_set_location (offset, location);
+  gsi_insert_after (&gsi, offset, GSI_NEW_STMT);
+
+  /* _1 = base;  */
+  base = unshare_expr (base);
+  gimple region_end =
+    gimple_build_assign_with_ops (TREE_CODE (base),
+				  make_ssa_name (TREE_TYPE (base), NULL),
+				  base, NULL);
+  gimple_set_location (region_end, location);
+  gsi_insert_after (&gsi, region_end, GSI_NEW_STMT);
+
+  /* _2 = _1 + offset;  */
+  region_end =
+    gimple_build_assign_with_ops (POINTER_PLUS_EXPR,
+				  make_ssa_name (TREE_TYPE (base), NULL),
+				  gimple_assign_lhs (region_end), 
+				  gimple_assign_lhs (offset));
+  gimple_set_location (region_end, location);
+  gsi_insert_after (&gsi, region_end, GSI_NEW_STMT);
+
+  /* instrument access at _2;  */
+  gsi_next (&gsi);
+  tree end = gimple_assign_lhs (region_end);
+  build_check_stmt (end, &gsi, location, is_store,
+		    int_size_in_bytes (TREE_TYPE (end)));
+}
+
+/* If the statement pointed to by the iterator ITER is a call to a
+   builtin memory access function, instrument it and return TRUE.
+   Otherwise, return FALSE.  */
+
+static bool
+maybe_instrument_builtin_call (gimple_stmt_iterator *iter)
+{
+  gimple call = gsi_stmt (*iter);
+
+  if (!is_gimple_call (call))
+    return false;
+
+  tree callee = gimple_call_fndecl (call);
+
+  if (!is_builtin_fn (callee)
+      || DECL_BUILT_IN_CLASS (callee) != BUILT_IN_NORMAL)
+    return false;
+
+  tree source0 = NULL_TREE, source1 = NULL_TREE,
+    dest = NULL_TREE, len = NULL_TREE;
+
+  switch (DECL_FUNCTION_CODE (callee))
+    {
+      /* (s, s, n) style memops.  */
+    case BUILT_IN_BCMP:
+    case BUILT_IN_MEMCMP:
+      /* These cannot be safely instrumented as their length parameter
+         is just a mere limit.
+
+    case BUILT_IN_STRNCASECMP:
+    case BUILT_IN_STRNCMP:  */
+      len = gimple_call_arg (call, 2);
+      source0 = gimple_call_arg (call, 0);
+      source1 = gimple_call_arg (call, 1);
+      break;
+
+      /* (src, dest, n) style memops.  */
+    case BUILT_IN_BCOPY:
+      len = gimple_call_arg (call, 2);
+      source0 = gimple_call_arg (call, 0);
+      dest = gimple_call_arg (call, 1);
+      break;
+
+      /* (dest, src, n) style memops.  */
+    case BUILT_IN_MEMCPY:
+    case BUILT_IN_MEMCPY_CHK:
+    case BUILT_IN_MEMMOVE:
+    case BUILT_IN_MEMMOVE_CHK:
+    case BUILT_IN_MEMPCPY:
+    case BUILT_IN_MEMPCPY_CHK:
+
+      /* The builtins below cannot be safely instrumented as their
+         length parameter is just a mere limit.
+
+    case BUILT_IN_STPNCPY:
+    case BUILT_IN_STPNCPY_CHK:
+    case BUILT_IN_STRNCPY:
+    case BUILT_IN_STRNCPY_CHK: */
+      dest = gimple_call_arg (call, 0);
+      source0 = gimple_call_arg (call, 1);
+      len = gimple_call_arg (call, 2);
+      break;
+
+      /* (dest, src, n) where src is appended to the end of dest.
+	 These cannot be instrumented either.
+    case BUILT_IN_STRNCAT:
+    case BUILT_IN_STRNCAT_CHK:
+      break;  */
+
+      /* (dest, n) style memops.  */
+    case BUILT_IN_BZERO:
+      dest = gimple_call_arg (call, 0);
+      len = gimple_call_arg (call, 1);
+      break;
+
+      /* (dest, x, n) style memops.  */
+    case BUILT_IN_MEMSET:
+    case BUILT_IN_MEMSET_CHK:
+      dest = gimple_call_arg (call, 0);
+      len = gimple_call_arg (call, 2);
+      break;
+
+      /* (src, n) style memops.  */
+    case BUILT_IN_STRNDUP:
+      source0 = gimple_call_arg (call, 0);
+      len = gimple_call_arg (call, 1);
+      break;
+
+      /* (src, x, n) style memops.  */
+    case BUILT_IN_MEMCHR:
+      source0 = gimple_call_arg (call, 0);
+      len = gimple_call_arg (call, 2);
+      break;
+
+      /* memops that have no length information.
+    case BUILT_IN_INDEX:
+    case BUILT_IN_RINDEX:
+    case BUILT_IN_STRCHR:
+    case BUILT_IN_STRDUP:
+    case BUILT_IN_STRLEN:
+    case BUILT_IN_STRRCHR:
+      break;  */
+
+      /* (dest, src) and no length info.
+    case BUILT_IN_STPCPY:
+    case BUILT_IN_STPCPY_CHK:
+    case BUILT_IN_STRCASECMP:
+    case BUILT_IN_STRCAT:
+    case BUILT_IN_STRCAT_CHK:
+    case BUILT_IN_STRCPY:
+    case BUILT_IN_STRCPY_CHK:
+      break;  */
+
+      /* (s, s) and no length info.
+    case BUILT_IN_STRCMP:
+    case BUILT_IN_STRCSPN:
+    case BUILT_IN_STRPBRK:
+    case BUILT_IN_STRSPN:
+    case BUILT_IN_STRSTR:
+      break;  */
+
+    default:
+      break;
+    }
+
+  if (len != NULL_TREE)
+    {
+      if (source0 != NULL_TREE)
+	instrument_mem_region_access (source0, len, iter,
+				      gimple_location (call),
+				      /*is_store=*/false);
+      if (source1 != NULL_TREE)
+	instrument_mem_region_access (source1, len, iter,
+				      gimple_location (call),
+				      /*is_store=*/false);
+      else if (dest != NULL_TREE)
+	instrument_mem_region_access (dest, len, iter,
+				      gimple_location (call),
+				      /*is_store=*/true);
+      return true;
+    }
+
+  return false;
+}
+
+/* Instrument the assignment statement pointed to by the iterator
+   ITER, if it is subject to instrumentation.  */
+
+static void
+instrument_assignment (gimple_stmt_iterator *iter)
+{
+  gimple s = gsi_stmt (*iter);
+
+  gcc_assert (gimple_assign_single_p (s));
+
+  instrument_derefs (iter, gimple_assign_lhs (s),
+		     gimple_location (s), true);
+  instrument_derefs (iter, gimple_assign_rhs1 (s),
+		     gimple_location (s), false);
+}
+
+/* Instrument the function call pointed to by the iterator ITER, if it
+   is subject to instrumentation.  At the moment, the only function
+   calls that are instrumented are some built-in functions that access
+   memory.  Look at maybe_instrument_builtin_call to learn more.  */
+
+static void
+maybe_instrument_call (gimple_stmt_iterator *iter)
+{
+  maybe_instrument_builtin_call (iter);
+}
+
 /* asan: this looks too complex. Can this be done simpler? */
 /* Transform
    1) Memory references.
@@ -688,13 +1001,12 @@ transform_statements (void)
       if (bb->index >= saved_last_basic_block) continue;
       for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
         {
-          gimple s = gsi_stmt (i);
-          if (!gimple_assign_single_p (s))
-	    continue;
-          instrument_derefs (&i, gimple_assign_lhs (s),
-                             gimple_location (s), true);
-          instrument_derefs (&i, gimple_assign_rhs1 (s),
-                             gimple_location (s), false);
+	  gimple s = gsi_stmt (i);
+
+	  if (gimple_assign_single_p (s))
+	    instrument_assignment (&i);
+	  else if (is_gimple_call (s))
+	    maybe_instrument_call (&i);
         }
     }
 }
-- 
		Dodji


* Re: [PATCH 2/3] [asan] Factorize condition insertion code out of build_check_stmt
  2012-10-23 13:11 ` [PATCH 2/3] [asan] Factorize condition insertion code out of build_check_stmt Dodji Seketeli
@ 2012-10-23 13:27   ` Jakub Jelinek
  2012-10-24 15:11     ` Dodji Seketeli
  0 siblings, 1 reply; 25+ messages in thread
From: Jakub Jelinek @ 2012-10-23 13:27 UTC (permalink / raw)
  To: Dodji Seketeli; +Cc: GCC Patches, Diego Novillo, Xinliang David Li, Wei Mi

On Tue, Oct 23, 2012 at 03:08:07PM +0200, Dodji Seketeli wrote:
> +static gimple_stmt_iterator
> +create_cond_insert_point_before_iter (gimple_stmt_iterator *iter,
> +				      bool then_more_likely_p,
> +				      basic_block *then_block,
> +				      basic_block *fallthrough_block)
> +{
> +  gcc_assert (then_block != NULL && fallthrough_block != NULL);

I think this assert is useless, if they are NULL

> +  *then_block = then_bb;
> +  *fallthrough_block = fallthru_bb;

the above two stmts will just crash and be as useful for debugging
as the assert.

	Jakub


* Re: [PATCH 3/3] [asan] Instrument built-in memory access function calls
  2012-10-23 13:21 ` [PATCH 3/3] [asan] Instrument built-in memory access function calls Dodji Seketeli
@ 2012-10-23 14:12   ` Jakub Jelinek
  2012-10-23 15:51   ` Xinliang David Li
  2012-10-23 16:06   ` Jakub Jelinek
  2 siblings, 0 replies; 25+ messages in thread
From: Jakub Jelinek @ 2012-10-23 14:12 UTC (permalink / raw)
  To: Dodji Seketeli; +Cc: GCC Patches, Diego Novillo, Xinliang David Li, Wei Mi

On Tue, Oct 23, 2012 at 03:11:29PM +0200, Dodji Seketeli wrote:
> 	* asan.c (insert_if_then_before_iter)
> 	(instrument_mem_region_access)
> 	(maybe_instrument_builtin_call, maybe_instrument_call): New static

Why not just write it:
	* asan.c (insert_if_then_before_iter, instrument_mem_region_access,
	maybe_instrument_builtin_call, maybe_instrument_call): New static
?

> 	functions.

+  tree pointed_to_type = TREE_TYPE (TREE_TYPE (base));

Shouldn't pointed_to_type be always char_type_node?
I mean it shouldn't be VOID_TYPE, even when the argument is (void *),
etc.

> +      /* The 'then block' of the 'if (len != 0) condition is where
> +	 we'll generate the asan instrumentation code now.  */
> +      gsi = gsi_start_bb (then_bb);
> +
> +      /* Instrument the beginning of the memory region to be accessed,
> +	 and arrange for the rest of the instrumentation code to be
> +	 inserted in the then block *after* the current gsi.  */
> +      build_check_stmt (base, &gsi, location, is_store,
> +			int_size_in_bytes (pointed_to_type));
> +      gsi = gsi_last_bb (then_bb);
> +    }
> +  else
> +    {
> +      /* Instrument the beginning of the memory region to be
> +	 accessed.  */
> +      build_check_stmt (base, iter, location, is_store,
> +			int_size_in_bytes (pointed_to_type));
> +      gsi = *iter;
> +    }

Is there any reason why you can't call build_check_stmt just once, after
the conditional?  I.e. do
      ...
      gsi = gsi_start_bb (then_bb);
    }
  else
    gsi = *iter;

  build_check_stmt (base, &gsi, location, is_store,
		    int_size_in_bytes (pointed_to_type));

> +  /* instrument access at _2;  */
> +  gsi_next (&gsi);
> +  tree end = gimple_assign_lhs (region_end);
> +  build_check_stmt (end, &gsi, location, is_store,

Can't you just pass gimple_assign_lhs (region_end) as first
argument to build_check_stmt?  And again, I think you want
to test a single byte there, not more.

> +		    int_size_in_bytes (TREE_TYPE (end)));

> +  switch (DECL_FUNCTION_CODE (callee))
> +    {
> +      /* (s, s, n) style memops.  */
> +    case BUILT_IN_BCMP:
> +    case BUILT_IN_MEMCMP:
> +      /* These cannot be safely instrumented as their length parameter
> +         is just a mere limit.
> +
> +    case BUILT_IN_STRNCASECMP:
> +    case BUILT_IN_STRNCMP:  */

I think these comments make the code less readable instead of more readable,
I'd move the comments why something can't be instrumented to the default:
case.

On the other side, you IMHO want to handle here also __atomic_* and __sync_*
builtins (not by using instrument_mem_region_access, but
just instrument_derefs (if the argument is ADDR_EXPR, on what it points to,
otherwise if it is SSA_NAME, on MEM_REF created for it).

	Jakub


* Re: [PATCH 3/3] [asan] Instrument built-in memory access function calls
  2012-10-23 13:21 ` [PATCH 3/3] [asan] Instrument built-in memory access function calls Dodji Seketeli
  2012-10-23 14:12   ` Jakub Jelinek
@ 2012-10-23 15:51   ` Xinliang David Li
  2012-10-23 16:03     ` Jakub Jelinek
  2012-10-23 16:06   ` Jakub Jelinek
  2 siblings, 1 reply; 25+ messages in thread
From: Xinliang David Li @ 2012-10-23 15:51 UTC (permalink / raw)
  To: Dodji Seketeli; +Cc: GCC Patches, Jakub Jelinek, Diego Novillo, Wei Mi

On Tue, Oct 23, 2012 at 6:11 AM, Dodji Seketeli <dodji@redhat.com> wrote:
> This patch instruments many memory access patterns through builtins.
>
> Basically, for a call like:
>
>      __builtin_memset (from, 0, n_bytes);
>
> the patch would only instrument the accesses at the beginning and at
> the end of the memory region [from, from + n_bytes].  This is the
> strategy used by the llvm implementation of asan.
>
> This instrumentation for all the memory access builtin functions that
> expose a well specified memory region -- one that explicitly states
> the number of bytes accessed in the region.
>
> Tested by running cc1 -fasan on variations of simple programs like:
>
>     void
>     foo ()
>     {
>       char foo[1] = {0};
>
>       __builtin_memset (foo, 0, 2);
>     }
>
> and by starring at the gimple output.
>
> gcc/
>         * asan.c (insert_if_then_before_iter)
>         (instrument_mem_region_access)
>         (maybe_instrument_builtin_call, maybe_instrument_call): New static
>         functions.
>         (instrument_assignment): Factorize from ...
>         (transform_statements): ... here.  Use maybe_instrument_call to
>         instrument (builtin) function calls as well.
> ---
>  gcc/asan.c | 328 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 320 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/asan.c b/gcc/asan.c
> index aed1a60..a8e3827 100644
> --- a/gcc/asan.c
> +++ b/gcc/asan.c
> @@ -468,6 +468,40 @@ create_cond_insert_point_before_iter (gimple_stmt_iterator *iter,
>    return gsi_last_bb (cond_bb);
>  }
>
> +/* Insert an if condition followed by a 'then block' right before the
> +   statement pointed to by ITER.  The fallthrough block -- which is the
> +   else block of the condition as well as the destination of the
> +   outcoming edge of the 'then block' -- starts with the statement
> +   pointed to by ITER.
> +
> +   COND is the condition of the if.
> +
> +   If THEN_MORE_LIKELY_P is true, the probability of the edge to the
> +   'then block' is higher than the probability of the edge to the
> +   fallthrough block.
> +
> +   Upon completion of the function, *THEN_BB is set to the newly
> +   inserted 'then block' and similarly, *FALLTHROUGH_BB is set to the
> +   fallthrough block.
> +
> +   *ITER is adjusted to still point to the same statement it was
> +   pointing to initially.  */
> +
> +static void
> +insert_if_then_before_iter (gimple cond,
> +                           gimple_stmt_iterator *iter,
> +                           bool then_more_likely_p,
> +                           basic_block *then_bb,
> +                           basic_block *fallthrough_bb)
> +{
> +  gimple_stmt_iterator cond_insert_point =
> +    create_cond_insert_point_before_iter (iter,
> +                                         then_more_likely_p,
> +                                         then_bb,
> +                                         fallthrough_bb);
> +  gsi_insert_after (&cond_insert_point, cond, GSI_NEW_STMT);
> +}
> +
>  /* Instrument the memory access instruction BASE.  Insert new
>     statements before ITER.
>
> @@ -628,7 +662,7 @@ build_check_stmt (tree base, gimple_stmt_iterator *iter,
>
>  /* If T represents a memory access, add instrumentation code before ITER.
>     LOCATION is source code location.
> -   IS_STORE is either 1 (for a store) or 0 (for a load).  */
> +   IS_STORE is either TRUE (for a store) or FALSE (for a load).  */
>
>  static void
>  instrument_derefs (gimple_stmt_iterator *iter, tree t,
> @@ -670,6 +704,285 @@ instrument_derefs (gimple_stmt_iterator *iter, tree t,
>    build_check_stmt (base, iter, location, is_store, size_in_bytes);
>  }
>
> +/* Instrument an access to a contiguous memory region that starts at
> +   the address pointed to by BASE, over a length of LEN (expressed in
> +   the sizeof (*BASE) bytes).  ITER points to the instruction
> +   before which the instrumentation instructions must be inserted.
> +   LOCATION  is the source location that the instrumentation
> +   instructions must have.  If IS_STORE is true, then the memory
> +   access is a store; otherwise, it's a load.  */
> +
> +static void
> +instrument_mem_region_access (tree base, tree len,
> +                             gimple_stmt_iterator *iter,
> +                             location_t location, bool is_store)
> +{
> +  if (integer_zerop (len))
> +    return;
> +
> +  gimple_stmt_iterator gsi = *iter;
> +  tree pointed_to_type = TREE_TYPE (TREE_TYPE (base));
> +
> +  if (!is_gimple_constant (len))
> +    {
> +      /* So, the length of the memory area to asan-protect is
> +        non-constant.  Let's guard the generated instrumentation code
> +        like:
> +
> +        if (len != 0)
> +          {
> +            //asan instrumentation code goes here.
> +           }
> +          // fallthrough instructions, starting with *ITER.  */
> +
> +      basic_block fallthrough_bb, then_bb;
> +      gimple g = gimple_build_cond (NE_EXPR,
> +                                   len,
> +                                   build_int_cst (TREE_TYPE (len), 0),
> +                                   NULL_TREE, NULL_TREE);
> +      gimple_set_location (g, location);
> +      insert_if_then_before_iter (g, iter, /*then_more_likely_p=*/true,
> +                                 &then_bb, &fallthrough_bb);
> +      /* Note that fallthrough_bb starts with the statement that was
> +        pointed to by ITER.  */
> +
> +      /* The 'then block' of the 'if (len != 0)' condition is where
> +        we'll generate the asan instrumentation code now.  */
> +      gsi = gsi_start_bb (then_bb);
> +
> +      /* Instrument the beginning of the memory region to be accessed,
> +	 and arrange for the rest of the instrumentation code to be
> +        inserted in the then block *after* the current gsi.  */
> +      build_check_stmt (base, &gsi, location, is_store,
> +                       int_size_in_bytes (pointed_to_type));
> +      gsi = gsi_last_bb (then_bb);
> +    }
> +  else
> +    {
> +      /* Instrument the beginning of the memory region to be
> +        accessed.  */
> +      build_check_stmt (base, iter, location, is_store,
> +                       int_size_in_bytes (pointed_to_type));
> +      gsi = *iter;
> +    }
> +
> +  /* We want to instrument the access at the end of the memory region,
> +     which is at (base + len - 1).  */
> +
> +  /* offset = len - 1;  */
> +  len = unshare_expr (len);
> +  gimple offset =
> +    gimple_build_assign_with_ops (TREE_CODE (len),
> +                                 make_ssa_name (TREE_TYPE (len), NULL),
> +                                 len, NULL);
> +  gimple_set_location (offset, location);
> +  gsi_insert_before (&gsi, offset, GSI_NEW_STMT);
> +
> +  offset =
> +    gimple_build_assign_with_ops (MINUS_EXPR,
> +                                 make_ssa_name (size_type_node, NULL),
> +                                 gimple_assign_lhs (offset),
> +                                 build_int_cst (size_type_node, 1));
> +  gimple_set_location (offset, location);
> +  gsi_insert_after (&gsi, offset, GSI_NEW_STMT);
> +
> +  /* _1 = base;  */
> +  base = unshare_expr (base);
> +  gimple region_end =
> +    gimple_build_assign_with_ops (TREE_CODE (base),
> +                                 make_ssa_name (TREE_TYPE (base), NULL),
> +                                 base, NULL);
> +  gimple_set_location (region_end, location);
> +  gsi_insert_after (&gsi, region_end, GSI_NEW_STMT);
> +
> +  /* _2 = _1 + offset;  */
> +  region_end =
> +    gimple_build_assign_with_ops (POINTER_PLUS_EXPR,
> +                                 make_ssa_name (TREE_TYPE (base), NULL),
> +                                 gimple_assign_lhs (region_end),
> +                                 gimple_assign_lhs (offset));
> +  gimple_set_location (region_end, location);
> +  gsi_insert_after (&gsi, region_end, GSI_NEW_STMT);
> +
> +  /* instrument access at _2;  */
> +  gsi_next (&gsi);
> +  tree end = gimple_assign_lhs (region_end);
> +  build_check_stmt (end, &gsi, location, is_store,
> +                   int_size_in_bytes (TREE_TYPE (end)));
> +}
> +
> +/* If the statement pointed to by the iterator ITER is a call to a
> +   builtin memory access function, instrument it and return true.
> +   Otherwise, return false.  */
> +
> +static bool
> +maybe_instrument_builtin_call (gimple_stmt_iterator *iter)
> +{
> +  gimple call = gsi_stmt (*iter);
> +
> +  if (!is_gimple_call (call))
> +    return false;
> +
> +  tree callee = gimple_call_fndecl (call);
> +
> +  if (!is_builtin_fn (callee)
> +      || DECL_BUILT_IN_CLASS (callee) != BUILT_IN_NORMAL)
> +    return false;
> +
> +  tree source0 = NULL_TREE, source1 = NULL_TREE,
> +    dest = NULL_TREE, len = NULL_TREE;
> +
> +  switch (DECL_FUNCTION_CODE (callee))
> +    {
> +      /* (s, s, n) style memops.  */
> +    case BUILT_IN_BCMP:
> +    case BUILT_IN_MEMCMP:
> +      /* These cannot be safely instrumented as their length parameter
> +         is just a mere limit.
> +
> +    case BUILT_IN_STRNCASECMP:
> +    case BUILT_IN_STRNCMP:  */
> +      len = gimple_call_arg (call, 2);
> +      source0 = gimple_call_arg (call, 0);
> +      source1 = gimple_call_arg (call, 1);
> +      break;
> +
> +      /* (src, dest, n) style memops.  */
> +    case BUILT_IN_BCOPY:
> +      len = gimple_call_arg (call, 2);
> +      source0 = gimple_call_arg (call, 0);
> +      dest = gimple_call_arg (call, 1);
> +      break;
> +
> +      /* (dest, src, n) style memops.  */
> +    case BUILT_IN_MEMCPY:
> +    case BUILT_IN_MEMCPY_CHK:
> +    case BUILT_IN_MEMMOVE:
> +    case BUILT_IN_MEMMOVE_CHK:
> +    case BUILT_IN_MEMPCPY:
> +    case BUILT_IN_MEMPCPY_CHK:
> +
> +      /* The builtins below cannot be safely instrumented as their
> +         length parameter is just a mere limit.
> +

Why can't the following be instrumented? The length is min (n, strlen (str)).


> +    case BUILT_IN_STPNCPY:
> +    case BUILT_IN_STPNCPY_CHK:
> +    case BUILT_IN_STRNCPY:
> +    case BUILT_IN_STRNCPY_CHK: */
> +      dest = gimple_call_arg (call, 0);
> +      source0 = gimple_call_arg (call, 1);
> +      len = gimple_call_arg (call, 2);
> +      break;
> +
> +      /* (dest, src, n) where src is appended to the end of dest.
> +        These cannot be instrumented either.
> +    case BUILT_IN_STRNCAT:
> +    case BUILT_IN_STRNCAT_CHK:
> +      break;  */
> +
> +      /* (dest, n) style memops.  */
> +    case BUILT_IN_BZERO:
> +      dest = gimple_call_arg (call, 0);
> +      len = gimple_call_arg (call, 1);
> +      break;
> +
> +      /* (dest, x, n) style memops.  */
> +    case BUILT_IN_MEMSET:
> +    case BUILT_IN_MEMSET_CHK:
> +      dest = gimple_call_arg (call, 0);
> +      len = gimple_call_arg (call, 2);
> +      break;
> +
> +      /* (src, n) style memops.  */
> +    case BUILT_IN_STRNDUP:
> +      source0 = gimple_call_arg (call, 0);
> +      len = gimple_call_arg (call, 1);
> +      break;
> +
> +      /* (src, x, n) style memops.  */
> +    case BUILT_IN_MEMCHR:
> +      source0 = gimple_call_arg (call, 0);
> +      len = gimple_call_arg (call, 2);
> +      break;
> +
> +      /* memops that have no length information.
> +    case BUILT_IN_INDEX:
> +    case BUILT_IN_RINDEX:
> +    case BUILT_IN_STRCHR:
> +    case BUILT_IN_STRDUP:
> +    case BUILT_IN_STRLEN:
> +    case BUILT_IN_STRRCHR:
> +      break;  */


For 'strlen', can the memory check be done at the end of the string
using the returned length?

For strchr, the instrumentation can be done at the returned pointer, right?

thanks,

David


> +
> +      /* (dest, src) and no length info.
> +    case BUILT_IN_STPCPY:
> +    case BUILT_IN_STPCPY_CHK:
> +    case BUILT_IN_STRCASECMP:
> +    case BUILT_IN_STRCAT:
> +    case BUILT_IN_STRCAT_CHK:
> +    case BUILT_IN_STRCPY:
> +    case BUILT_IN_STRCPY_CHK:
> +      break;  */
> +
> +      /* (s, s) and no length info.
> +    case BUILT_IN_STRCMP:
> +    case BUILT_IN_STRCSPN:
> +    case BUILT_IN_STRPBRK:
> +    case BUILT_IN_STRSPN:
> +    case BUILT_IN_STRSTR:
> +      break;  */
> +
> +    default:
> +      break;
> +    }
> +
> +  if (len != NULL_TREE)
> +    {
> +      if (source0 != NULL_TREE)
> +       instrument_mem_region_access (source0, len, iter,
> +                                     gimple_location (call),
> +                                     /*is_store=*/false);
> +      if (source1 != NULL_TREE)
> +       instrument_mem_region_access (source1, len, iter,
> +                                     gimple_location (call),
> +                                     /*is_store=*/false);
> +      else if (dest != NULL_TREE)
> +       instrument_mem_region_access (dest, len, iter,
> +                                     gimple_location (call),
> +                                     /*is_store=*/true);
> +      return true;
> +    }
> +
> +  return false;
> +}
> +
> +/* Instrument the assignment statement pointed to by ITER if it is
> +   subject to instrumentation.  */
> +
> +static void
> +instrument_assignment (gimple_stmt_iterator *iter)
> +{
> +  gimple s = gsi_stmt (*iter);
> +
> +  gcc_assert (gimple_assign_single_p (s));
> +
> +  instrument_derefs (iter, gimple_assign_lhs (s),
> +                    gimple_location (s), true);
> +  instrument_derefs (iter, gimple_assign_rhs1 (s),
> +                    gimple_location (s), false);
> +}
> +
> +/* Instrument the function call pointed to by the iterator ITER, if it
> +   is subject to instrumentation.  At the moment, the only function
> +   calls that are instrumented are some built-in functions that access
> +   memory.  Look at maybe_instrument_builtin_call to learn more.  */
> +
> +static void
> +maybe_instrument_call (gimple_stmt_iterator *iter)
> +{
> +  maybe_instrument_builtin_call (iter);
> +}
> +
>  /* asan: this looks too complex. Can this be done simpler? */
>  /* Transform
>     1) Memory references.
> @@ -688,13 +1001,12 @@ transform_statements (void)
>        if (bb->index >= saved_last_basic_block) continue;
>        for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
>          {
> -          gimple s = gsi_stmt (i);
> -          if (!gimple_assign_single_p (s))
> -           continue;
> -          instrument_derefs (&i, gimple_assign_lhs (s),
> -                             gimple_location (s), true);
> -          instrument_derefs (&i, gimple_assign_rhs1 (s),
> -                             gimple_location (s), false);
> +         gimple s = gsi_stmt (i);
> +
> +         if (gimple_assign_single_p (s))
> +           instrument_assignment (&i);
> +         else if (is_gimple_call (s))
> +           maybe_instrument_call (&i);
>          }
>      }
>  }
> --
>                 Dodji

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] [asan] Instrument built-in memory access function calls
  2012-10-23 15:51   ` Xinliang David Li
@ 2012-10-23 16:03     ` Jakub Jelinek
  2012-10-23 16:13       ` Xinliang David Li
  2012-10-24 15:35       ` Dodji Seketeli
  0 siblings, 2 replies; 25+ messages in thread
From: Jakub Jelinek @ 2012-10-23 16:03 UTC (permalink / raw)
  To: Xinliang David Li; +Cc: Dodji Seketeli, GCC Patches, Diego Novillo, Wei Mi

On Tue, Oct 23, 2012 at 08:47:48AM -0700, Xinliang David Li wrote:
> > +      /* The builtins below cannot be safely instrumented as their
> > +         length parameter is just a mere limit.
> > +
> 
> Why can't the following be instrumented? The length is min (n, strlen (str)).

Because that would be too expensive, and libasan intercepts (most of the)
str* functions anyway, both so that it can check this and test argument
overlap.  The memory builtin instrumentation is done primarily for the cases
where the builtins are expanded inline, without calling the library routine,
because then nothing is verified in libasan.

> For 'strlen', can the memory check be done at the end of the string
> using the returned length?

Guess strlen is commonly expanded inline, so it would be worthwhile to check
the shadow memory after the call (well, we could check the first byte
before the call and the last one after the call).

	Jakub

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] [asan] Instrument built-in memory access function calls
  2012-10-23 13:21 ` [PATCH 3/3] [asan] Instrument built-in memory access function calls Dodji Seketeli
  2012-10-23 14:12   ` Jakub Jelinek
  2012-10-23 15:51   ` Xinliang David Li
@ 2012-10-23 16:06   ` Jakub Jelinek
  2012-10-24 15:27     ` Dodji Seketeli
  2 siblings, 1 reply; 25+ messages in thread
From: Jakub Jelinek @ 2012-10-23 16:06 UTC (permalink / raw)
  To: Dodji Seketeli; +Cc: GCC Patches, Diego Novillo, Xinliang David Li, Wei Mi

On Tue, Oct 23, 2012 at 03:11:29PM +0200, Dodji Seketeli wrote:
> +      /* (src, n) style memops.  */
> +    case BUILT_IN_STRNDUP:
> +      source0 = gimple_call_arg (call, 0);
> +      len = gimple_call_arg (call, 1);
> +      break;

I think you can't instrument strndup either, the length is just a limit
there, it can copy fewer characters than that if strlen (source0) is
shorter.  libasan intercepts strndup I think.

> +      /* (src, x, n) style memops.  */      
> +    case BUILT_IN_MEMCHR:
> +      source0 = gimple_call_arg (call, 0);
> +      len = gimple_call_arg (call, 2);

And similarly for memchr, you could call
p = malloc (4096);
p[4095] = 1;
x = memchr (p, 1, 8192);
and it shouldn't read anything past the end of the
allocated area.

	Jakub

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] [asan] Instrument built-in memory access function calls
  2012-10-23 16:03     ` Jakub Jelinek
@ 2012-10-23 16:13       ` Xinliang David Li
  2012-10-24 15:35       ` Dodji Seketeli
  1 sibling, 0 replies; 25+ messages in thread
From: Xinliang David Li @ 2012-10-23 16:13 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Dodji Seketeli, GCC Patches, Diego Novillo, Wei Mi

On Tue, Oct 23, 2012 at 8:58 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Tue, Oct 23, 2012 at 08:47:48AM -0700, Xinliang David Li wrote:
>> > +      /* The builtins below cannot be safely instrumented as their
>> > +         length parameter is just a mere limit.
>> > +
>>
>> Why can't the following be instrumented? The length is min (n, strlen (str)).
>
> Because that would be too expensive, and libasan intercepts (most of the)
> str* functions anyway, both so that it can check this and test argument
> overlap.  The memory builtin instrumentation is done primarily for the cases
> where the builtins are expanded inline, without calling the library routine,
> because then nothing is verified in libasan.
>

Ok that makes sense.

thanks,

David


>> For 'strlen', can the memory check be done at the end of the string
>> using the returned length?
>
> Guess strlen is commonly expanded inline, so it would be worthwhile to check
> the shadow memory after the call (well, we could check the first byte
> before the call and the last one after the call).
>
>         Jakub

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/3] [asan] Factorize condition insertion code out of build_check_stmt
  2012-10-23 13:27   ` Jakub Jelinek
@ 2012-10-24 15:11     ` Dodji Seketeli
  2012-10-24 15:44       ` Jakub Jelinek
  0 siblings, 1 reply; 25+ messages in thread
From: Dodji Seketeli @ 2012-10-24 15:11 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: GCC Patches, Diego Novillo, Xinliang David Li, Wei Mi

Jakub Jelinek <jakub@redhat.com> writes:

> On Tue, Oct 23, 2012 at 03:08:07PM +0200, Dodji Seketeli wrote:
>> +static gimple_stmt_iterator
>> +create_cond_insert_point_before_iter (gimple_stmt_iterator *iter,
>> +				      bool then_more_likely_p,
>> +				      basic_block *then_block,
>> +				      basic_block *fallthrough_block)
>> +{
>> +  gcc_assert (then_block != NULL && fallthrough_block != NULL);
>
> I think this assert is useless, if they are NULL
>
>> +  *then_block = then_bb;
>> +  *fallthrough_block = fallthru_bb;
>
> the above two stmts will just crash and be as useful for debugging
> as the assert.

Fixed.  Below is the updated patch.

gcc/

	* asan.c (create_cond_insert_point_before_iter): Factorize out of ...
	(build_check_stmt): ... here.
---
 gcc/asan.c | 120 ++++++++++++++++++++++++++++++++++++++-----------------------
 1 file changed, 76 insertions(+), 44 deletions(-)

diff --git a/gcc/asan.c b/gcc/asan.c
index e8660a6..39e77e6 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -397,6 +397,75 @@ asan_init_func (void)
 #define PROB_VERY_UNLIKELY	(REG_BR_PROB_BASE / 2000 - 1)
 #define PROB_ALWAYS		(REG_BR_PROB_BASE)
 
+/* Split the current basic block and create a condition statement
+   insertion point right before the statement pointed to by ITER.
+   Return an iterator to the point at which the caller might safely
+   insert the condition statement.
+
+   THEN_BLOCK must be set to the address of an uninitialized instance
+   of basic_block.  The function will then set *THEN_BLOCK to the
+   'then block' of the condition statement to be inserted by the
+   caller.
+
+   Similarly, the function will set *FALLTHROUGH_BLOCK to the 'else
+   block' of the condition statement to be inserted by the caller.
+
+   Note that *FALLTHROUGH_BLOCK is a new block that contains the
+   statements starting from *ITER, and *THEN_BLOCK is a new empty
+   block.
+
+   *ITER is adjusted to still point to the same statement it was
+   pointing to initially.  */
+
+static gimple_stmt_iterator
+create_cond_insert_point_before_iter (gimple_stmt_iterator *iter,
+				      bool then_more_likely_p,
+				      basic_block *then_block,
+				      basic_block *fallthrough_block)
+{
+  gimple_stmt_iterator gsi = *iter;
+
+  if (!gsi_end_p (gsi))
+    gsi_prev (&gsi);
+
+  basic_block cur_bb = gsi_bb (*iter);
+
+  edge e = split_block (cur_bb, gsi_stmt (gsi));
+
+  /* Get a hold on the 'condition block', the 'then block' and the
+     'else block'.  */
+  basic_block cond_bb = e->src;
+  basic_block fallthru_bb = e->dest;
+  basic_block then_bb = create_empty_bb (cond_bb);
+
+  /* Set up the newly created 'then block'.  */
+  e = make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE);
+  int fallthrough_probability =
+    then_more_likely_p
+    ? PROB_VERY_UNLIKELY
+    : PROB_ALWAYS - PROB_VERY_UNLIKELY;
+  e->probability = PROB_ALWAYS - fallthrough_probability;
+  make_single_succ_edge (then_bb, fallthru_bb, EDGE_FALLTHRU);
+
+  /* Set up the fallthrough basic block.  */
+  e = find_edge (cond_bb, fallthru_bb);
+  e->flags = EDGE_FALSE_VALUE;
+  e->count = cond_bb->count;
+  e->probability = fallthrough_probability;
+
+  /* Update dominance info for the newly created then_bb; note that
+     fallthru_bb's dominance info has already been updated by
+     split_block.  */
+  if (dom_info_available_p (CDI_DOMINATORS))
+    set_immediate_dominator (CDI_DOMINATORS, then_bb, cond_bb);
+
+  *then_block = then_bb;
+  *fallthrough_block = fallthru_bb;
+  *iter = gsi_start_bb (fallthru_bb);
+
+  return gsi_last_bb (cond_bb);
+}
+
 /* Instrument the memory access instruction BASE.  Insert new
    statements before ITER.
 
@@ -411,8 +480,7 @@ build_check_stmt (tree base, gimple_stmt_iterator *iter,
 		  int size_in_bytes)
 {
   gimple_stmt_iterator gsi;
-  basic_block cond_bb, then_bb, else_bb;
-  edge e;
+  basic_block then_bb, else_bb;
   tree t, base_addr, shadow;
   gimple g;
   tree shadow_ptr_type = shadow_ptr_types[size_in_bytes == 16 ? 1 : 0];
@@ -421,51 +489,15 @@ build_check_stmt (tree base, gimple_stmt_iterator *iter,
     = build_nonstandard_integer_type (TYPE_PRECISION (TREE_TYPE (base)), 1);
   tree base_ssa = base;
 
-  /* We first need to split the current basic block, and start altering
-     the CFG.  This allows us to insert the statements we're about to
-     construct into the right basic blocks.  */
-
-  cond_bb = gimple_bb (gsi_stmt (*iter));
-  gsi = *iter;
-  gsi_prev (&gsi);
-  if (!gsi_end_p (gsi))
-    e = split_block (cond_bb, gsi_stmt (gsi));
-  else
-    e = split_block_after_labels (cond_bb);
-  cond_bb = e->src;
-  else_bb = e->dest;
-
-  /* A recap at this point: else_bb is the basic block at whose head
-     is the gimple statement for which this check expression is being
-     built.  cond_bb is the (possibly new, synthetic) basic block the
-     end of which will contain the cache-lookup code, and a
-     conditional that jumps to the cache-miss code or, much more
-     likely, over to else_bb.  */
-
-  /* Create the bb that contains the crash block.  */
-  then_bb = create_empty_bb (cond_bb);
-  e = make_edge (cond_bb, then_bb, EDGE_TRUE_VALUE);
-  e->probability = PROB_VERY_UNLIKELY;
-  make_single_succ_edge (then_bb, else_bb, EDGE_FALLTHRU);
-
-  /* Mark the pseudo-fallthrough edge from cond_bb to else_bb.  */
-  e = find_edge (cond_bb, else_bb);
-  e->flags = EDGE_FALSE_VALUE;
-  e->count = cond_bb->count;
-  e->probability = PROB_ALWAYS - PROB_VERY_UNLIKELY;
-
-  /* Update dominance info.  Note that bb_join's data was
-     updated by split_block.  */
-  if (dom_info_available_p (CDI_DOMINATORS))
-    {
-      set_immediate_dominator (CDI_DOMINATORS, then_bb, cond_bb);
-      set_immediate_dominator (CDI_DOMINATORS, else_bb, cond_bb);
-    }
+  /* Get an iterator on the point where we can add the condition
+     statement for the instrumentation.  */
+  gsi = create_cond_insert_point_before_iter (iter,
+					      /*then_more_likely_p=*/false,
+					      &then_bb,
+					      &else_bb);
 
   base = unshare_expr (base);
 
-  gsi = gsi_last_bb (cond_bb);
-
   /* BASE can already be an SSA_NAME; in that case, do not create a
      new SSA_NAME for it.  */
   if (TREE_CODE (base) != SSA_NAME)
-- 
1.7.11.7


> 	Jakub

-- 
		Dodji

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] [asan] Instrument built-in memory access function calls
  2012-10-23 16:06   ` Jakub Jelinek
@ 2012-10-24 15:27     ` Dodji Seketeli
  2012-10-24 16:46       ` Jakub Jelinek
  2012-10-29 15:15       ` Hans-Peter Nilsson
  0 siblings, 2 replies; 25+ messages in thread
From: Dodji Seketeli @ 2012-10-24 15:27 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: GCC Patches, Diego Novillo, Xinliang David Li, Wei Mi

Jakub Jelinek <jakub@redhat.com> writes:

> On Tue, Oct 23, 2012 at 03:11:29PM +0200, Dodji Seketeli wrote:
>> +      /* (src, n) style memops.  */
>> +    case BUILT_IN_STRNDUP:
>> +      source0 = gimple_call_arg (call, 0);
>> +      len = gimple_call_arg (call, 1);
>> +      break;
>
> I think you can't instrument strndup either, the length is just a limit
> there, it can copy fewer characters than that if strlen (source0) is
> shorter.  libasan intercepts strndup I think.

Fixed.

>> +      /* (src, x, n) style memops.  */      
>> +    case BUILT_IN_MEMCHR:
>> +      source0 = gimple_call_arg (call, 0);
>> +      len = gimple_call_arg (call, 2);
>
> And similarly for memchr, you could call
> p = malloc (4096);
> p[4095] = 1;
> x = memchr (p, 1, 8192);
> and it shouldn't read anything past the end of the
> allocated area.

Fixed as well.

> On Tue, Oct 23, 2012 at 03:11:29PM +0200, Dodji Seketeli wrote:
> > 	* asan.c (insert_if_then_before_iter)
> > 	(instrument_mem_region_access)
> > 	(maybe_instrument_builtin_call, maybe_instrument_call): New static
> 
> Why not just write it:
> 	* asan.c (insert_if_then_before_iter, instrument_mem_region_access,
> 	maybe_instrument_builtin_call, maybe_instrument_call): New static
> ?

It's Emacs that formats it like that automatically.  I am not sure how
to teach it otherwise.  I have fixed this as you asked, by doing it "by
hand".

> 
> > 	functions.
> 
> + tree pointed_to_type = TREE_TYPE (TREE_TYPE (base));
> 
> Shouldn't pointed_to_type be always char_type_node?
> I mean it shouldn't be VOID_TYPE, even when the argument is (void *),
> etc.

Right.  Fixed.

> 
> > +      /* The 'then block' of the 'if (len != 0) condition is where
> > +	 we'll generate the asan instrumentation code now.  */
> > +      gsi = gsi_start_bb (then_bb);
> > +
> > +      /* Instrument the beginning of the memory region to be accessed,
> > +	 and arrange for the rest of the intrumentation code to be
> > +	 inserted in the then block *after* the current gsi.  */
> > +      build_check_stmt (base, &gsi, location, is_store,
> > +			int_size_in_bytes (pointed_to_type));
> > +      gsi = gsi_last_bb (then_bb);
> > +    }
> > +  else
> > +    {
> > +      /* Instrument the beginning of the memory region to be
> > +	 accessed.  */
> > +      build_check_stmt (base, iter, location, is_store,
> > +			int_size_in_bytes (pointed_to_type));
> > +      gsi = *iter;
> > +    }
> 
> Is there any reason why you can't call build_check_stmt just once, after
> the conditional?  I.e. do
>       ...
>       gsi = gsi_start_bb (then_bb);
>     }
>   else
>     gsi = *iter;
> 
>   build_check_stmt (base, &gsi, location, is_store,
> 		    int_size_in_bytes (pointed_to_type));


If I write exactly what you wrote here, I am getting an error for e.g:

    void
    bar ()
    {
      char bar[1] = {0};
      int n = 0;

      __builtin_memset (bar, 0, n);
    }

test.cc: In function ‘void bar()’:
test.cc:11:1: error: definition in block 9 does not dominate use in block 7
 bar ()
 ^
for SSA_NAME: _28 in statement:
_29 = (unsigned long) _28;
test.cc:11:1: internal compiler error: verify_ssa failed

I think the issue is that when we insert the "if (len != 0)" condition
and 'then block', we want the instrumentation to happen in the then
block; but then calling build_check_stmt as in

    gsi = gsi_start_bb (then_bb);
    build_check_stmt (base, &gsi, location, is_store,
		      int_size_in_bytes (pointed_to_type));

leaves gsi pointing at the beginning of the then_bb, and
then the subsequent insertion of instrumentation code that happens
later at:

    gsi_insert_before (&gsi, offset, GSI_NEW_STMT);

puts said instrumentation code logically /before/ the beginning of the
then_bb; so bad things happen.  Or maybe I am missing something?

To avoid calling build_check_stmt twice I am proposing this change:

diff --git a/gcc/asan.c b/gcc/asan.c
index ecc0d0b..2ccc082 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -721,6 +721,7 @@ instrument_mem_region_access (tree base, tree len,
   gimple_stmt_iterator gsi = *iter;
   tree pointed_to_type = TREE_TYPE (TREE_TYPE (base));
 
+  basic_block fallthrough_bb = NULL, then_bb = NULL;
   if (!is_gimple_constant (len))
     {
       /* So, the length of the memory area to asan-protect is
@@ -733,7 +734,6 @@ instrument_mem_region_access (tree base, tree len,
            }
 	   // falltrough instructions, starting with *ITER.  */
 
-      basic_block fallthrough_bb, then_bb;
       gimple g = gimple_build_cond (NE_EXPR,
 				    len,
 				    build_int_cst (TREE_TYPE (len), 0),
@@ -747,22 +747,22 @@ instrument_mem_region_access (tree base, tree len,
       /* The 'then block' of the 'if (len != 0) condition is where
 	 we'll generate the asan instrumentation code now.  */
       gsi = gsi_start_bb (then_bb);
-
-      /* Instrument the beginning of the memory region to be accessed,
-	 and arrange for the rest of the intrumentation code to be
-	 inserted in the then block *after* the current gsi.  */
-      build_check_stmt (base, &gsi, location, is_store,
-			int_size_in_bytes (pointed_to_type));
-      gsi = gsi_last_bb (then_bb);
     }
+
+  /* Instrument the beginning of the memory region to be accessed,
+     and arrange for the rest of the instrumentation code to be
+     inserted in the then block *after* the current gsi.  */
+  build_check_stmt (base, &gsi, location, is_store, 1);
+
+  if (then_bb)
+    /* We are in the case where the length of the region is not
+       constant; so instrumentation code is being generated in the
+       'then block' of the 'if (len != 0)' condition.  Let's arrange
+       for the subsequent instrumentation statements to go in the
+       'then block'.  */
+    gsi = gsi_last_bb (then_bb);
   else
-    {
-      /* Instrument the beginning of the memory region to be
-	 accessed.  */
-      build_check_stmt (base, iter, location, is_store,
-			int_size_in_bytes (pointed_to_type));
-      gsi = *iter;
-    }
+    *iter = gsi;
 
   /* We want to instrument the access at the end of the memory region,
      which is at (base + len - 1).  */


Is that any better?

> > +  /* instrument access at _2;  */
> > +  gsi_next (&gsi);
> > +  tree end = gimple_assign_lhs (region_end);
> > +  build_check_stmt (end, &gsi, location, is_store,
> 
> Can't you just pass gimple_assign_lhs (region_end) as first
> argument to build_check_stmt?  And again, I think you want
> to test a single byte there, not more.

Done.

> 
> > +		    int_size_in_bytes (TREE_TYPE (end)));
> 
> > +  switch (DECL_FUNCTION_CODE (callee))
> > +    {
> > +      /* (s, s, n) style memops.  */
> > +    case BUILT_IN_BCMP:
> > +    case BUILT_IN_MEMCMP:
> > +      /* These cannot be safely instrumented as their length parameter
> > +         is just a mere limit.
> > +
> > +    case BUILT_IN_STRNCASECMP:
> > +    case BUILT_IN_STRNCMP:  */
> 
> I think these comments make the code less readable instead of more readable,
> I'd move the comments why something can't be instrumented to the default:
> case.

Fixed.

> On the other side, you IMHO want to handle here also __atomic_* and
> __sync_* builtins (not by using instrument_mem_region_access, but
> just instrument_derefs

Updated in the patch below.

> (if the argument is ADDR_EXPR, on what it points
> to, otherwise if it is SSA_NAME, on MEM_REF created for it).

I updated instrument_derefs to handle ADDR_EXPR and SSA_NAME.  While
doing this, it occurred to me that the bit-field detection code in
there was not quite right, as it was treating the MEM_REF access
foo[10] as a bit-field access in the example:

    void
    foo ()
    {
      char foo[10] = {0};
      __sync_fetch_and_add (&foo[10], 1);
    }

Here is the updated patch. 

gcc/
	* asan.c (insert_if_then_before_iter, instrument_mem_region_access,
	maybe_instrument_builtin_call, maybe_instrument_call): New static
	functions.
	(instrument_assignment): Factorize from ...
	(transform_statements): ... here.  Use maybe_instrument_call to
	instrument builtin function calls as well.
	(instrument_derefs): Support ADDR_EXPR and SSA_NAME nodes.  Fix
	detection of bit-field access.
---
 gcc/asan.c | 522 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 513 insertions(+), 9 deletions(-)

diff --git a/gcc/asan.c b/gcc/asan.c
index 39e77e6..215cab8 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -466,6 +466,40 @@ create_cond_insert_point_before_iter (gimple_stmt_iterator *iter,
   return gsi_last_bb (cond_bb);
 }
 
+/* Insert an if condition followed by a 'then block' right before the
+   statement pointed to by ITER.  The fallthrough block -- which is the
+   else block of the condition as well as the destination of the
+   outgoing edge of the 'then block' -- starts with the statement
+   pointed to by ITER.
+
+   COND is the condition of the if.  
+
+   If THEN_MORE_LIKELY_P is true, the probability of the edge to the
+   'then block' is higher than the probability of the edge to the
+   fallthrough block.
+
+   Upon completion of the function, *THEN_BB is set to the newly
+   inserted 'then block' and similarly, *FALLTHROUGH_BB is set to the
+   fallthrough block.
+
+   *ITER is adjusted to still point to the same statement it was
+   pointing to initially.  */
+
+static void
+insert_if_then_before_iter (gimple cond,
+			    gimple_stmt_iterator *iter,
+			    bool then_more_likely_p,
+			    basic_block *then_bb,
+			    basic_block *fallthrough_bb)
+{
+  gimple_stmt_iterator cond_insert_point =
+    create_cond_insert_point_before_iter (iter,
+					  then_more_likely_p,
+					  then_bb,
+					  fallthrough_bb);
+  gsi_insert_after (&cond_insert_point, cond, GSI_NEW_STMT);
+}
+
 /* Instrument the memory access instruction BASE.  Insert new
    statements before ITER.
 
@@ -626,7 +660,7 @@ build_check_stmt (tree base, gimple_stmt_iterator *iter,
 
 /* If T represents a memory access, add instrumentation code before ITER.
    LOCATION is source code location.
-   IS_STORE is either 1 (for a store) or 0 (for a load).  */
+   IS_STORE is either TRUE (for a store) or FALSE (for a load).  */
 
 static void
 instrument_derefs (gimple_stmt_iterator *iter, tree t,
@@ -638,6 +672,13 @@ instrument_derefs (gimple_stmt_iterator *iter, tree t,
   type = TREE_TYPE (t);
   switch (TREE_CODE (t))
     {
+    case ADDR_EXPR:
+      t = TREE_OPERAND (t, 0);
+      type = TREE_TYPE (t);
+      break;
+    case SSA_NAME:
+      instrument_derefs (iter, SSA_NAME_VAR (t), location, is_store);
+      return;
     case ARRAY_REF:
     case COMPONENT_REF:
     case INDIRECT_REF:
@@ -661,13 +702,477 @@ instrument_derefs (gimple_stmt_iterator *iter, tree t,
   int volatilep = 0, unsignedp = 0;
   get_inner_reference (t, &bitsize, &bitpos, &offset,
 		       &mode, &unsignedp, &volatilep, false);
-  if (bitpos != 0 || bitsize != size_in_bytes * BITS_PER_UNIT)
+  if (bitpos % BITS_PER_UNIT || bitsize != size_in_bytes * BITS_PER_UNIT)
     return;
 
   base = build_fold_addr_expr (t);
   build_check_stmt (base, iter, location, is_store, size_in_bytes);
 }
 
+/* Instrument an access to a contiguous memory region that starts at
+   the address pointed to by BASE, over a length of LEN (expressed in
+   sizeof (*BASE) bytes).  ITER points to the instruction before
+   which the instrumentation instructions must be inserted.  LOCATION
+   is the source location that the instrumentation instructions must
+   have.  If IS_STORE is true, then the memory access is a store;
+   otherwise, it's a load.  */
+
+static void
+instrument_mem_region_access (tree base, tree len,
+			      gimple_stmt_iterator *iter,
+			      location_t location, bool is_store)
+{
+  if (integer_zerop (len))
+    return;
+
+  gimple_stmt_iterator gsi = *iter;
+
+  basic_block fallthrough_bb = NULL, then_bb = NULL;
+  if (!is_gimple_constant (len))
+    {
+      /* So, the length of the memory area to asan-protect is
+	 non-constant.  Let's guard the generated instrumentation code
+	 like:
+
+	 if (len != 0)
+	   {
+	     //asan instrumentation code goes here.
+           }
+	   // fallthrough instructions, starting with *ITER.  */
+
+      gimple g = gimple_build_cond (NE_EXPR,
+				    len,
+				    build_int_cst (TREE_TYPE (len), 0),
+				    NULL_TREE, NULL_TREE);
+      gimple_set_location (g, location);
+      insert_if_then_before_iter (g, iter, /*then_more_likely_p=*/true,
+				  &then_bb, &fallthrough_bb);
+      /* Note that fallthrough_bb starts with the statement that was
+	 pointed to by ITER.  */
+
+      /* The 'then block' of the 'if (len != 0)' condition is where
+	 we'll generate the asan instrumentation code now.  */
+      gsi = gsi_start_bb (then_bb);
+    }
+
+  /* Instrument the beginning of the memory region to be accessed,
+     and arrange for the rest of the instrumentation code to be
+     inserted in the then block *after* the current gsi.  */
+  build_check_stmt (base, &gsi, location, is_store, 1);
+
+  if (then_bb)
+    /* We are in the case where the length of the region is not
+       constant; so instrumentation code is being generated in the
+       'then block' of the 'if (len != 0)' condition.  Let's arrange
+       for the subsequent instrumentation statements to go in the
+       'then block'.  */
+    gsi = gsi_last_bb (then_bb);
+  else
+    *iter = gsi;
+
+  /* We want to instrument the access at the end of the memory region,
+     which is at (base + len - 1).  */
+
+  /* offset = len - 1;  */
+  len = unshare_expr (len);
+  gimple offset =
+    gimple_build_assign_with_ops (TREE_CODE (len),
+				  make_ssa_name (TREE_TYPE (len), NULL),
+				  len, NULL);
+  gimple_set_location (offset, location);
+  gsi_insert_before (&gsi, offset, GSI_NEW_STMT);
+
+  offset =
+    gimple_build_assign_with_ops (MINUS_EXPR,
+				  make_ssa_name (size_type_node, NULL),
+				  gimple_assign_lhs (offset),
+				  build_int_cst (size_type_node, 1));
+  gimple_set_location (offset, location);
+  gsi_insert_after (&gsi, offset, GSI_NEW_STMT);
+
+  /* _1 = base;  */
+  base = unshare_expr (base);
+  gimple region_end =
+    gimple_build_assign_with_ops (TREE_CODE (base),
+				  make_ssa_name (TREE_TYPE (base), NULL),
+				  base, NULL);
+  gimple_set_location (region_end, location);
+  gsi_insert_after (&gsi, region_end, GSI_NEW_STMT);
+
+  /* _2 = _1 + offset;  */
+  region_end =
+    gimple_build_assign_with_ops (POINTER_PLUS_EXPR,
+				  make_ssa_name (TREE_TYPE (base), NULL),
+				  gimple_assign_lhs (region_end), 
+				  gimple_assign_lhs (offset));
+  gimple_set_location (region_end, location);
+  gsi_insert_after (&gsi, region_end, GSI_NEW_STMT);
+
+  /* instrument access at _2;  */
+  gsi_next (&gsi);
+  build_check_stmt (gimple_assign_lhs (region_end),
+		    &gsi, location, is_store, 1);
+}
+
+/* If the statement pointed to by the iterator ITER is a call to a
+   builtin memory access function, instrument it and return true.
+   Otherwise, return false.  */
+
+static bool
+maybe_instrument_builtin_call (gimple_stmt_iterator *iter)
+{
+  gimple call = gsi_stmt (*iter);
+  location_t location = gimple_location (call);
+
+  if (!is_gimple_call (call))
+    return false;
+
+  tree callee = gimple_call_fndecl (call);
+
+  if (!is_builtin_fn (callee)
+      || DECL_BUILT_IN_CLASS (callee) != BUILT_IN_NORMAL)
+    return false;
+
+  tree source0 = NULL_TREE, source1 = NULL_TREE,
+    dest = NULL_TREE, len = NULL_TREE;
+  bool is_store = true;
+
+  switch (DECL_FUNCTION_CODE (callee))
+    {
+      /* (s, s, n) style memops.  */
+    case BUILT_IN_BCMP:
+    case BUILT_IN_MEMCMP:
+      len = gimple_call_arg (call, 2);
+      source0 = gimple_call_arg (call, 0);
+      source1 = gimple_call_arg (call, 1);
+      break;
+
+      /* (src, dest, n) style memops.  */
+    case BUILT_IN_BCOPY:
+      len = gimple_call_arg (call, 2);
+      source0 = gimple_call_arg (call, 0);
+      dest = gimple_call_arg (call, 1);
+      break;
+
+      /* (dest, src, n) style memops.  */
+    case BUILT_IN_MEMCPY:
+    case BUILT_IN_MEMCPY_CHK:
+    case BUILT_IN_MEMMOVE:
+    case BUILT_IN_MEMMOVE_CHK:
+    case BUILT_IN_MEMPCPY:
+    case BUILT_IN_MEMPCPY_CHK:
+      dest = gimple_call_arg (call, 0);
+      source0 = gimple_call_arg (call, 1);
+      len = gimple_call_arg (call, 2);
+      break;
+
+      /* (dest, n) style memops.  */
+    case BUILT_IN_BZERO:
+      dest = gimple_call_arg (call, 0);
+      len = gimple_call_arg (call, 1);
+      break;
+
+      /* (dest, x, n) style memops.  */
+    case BUILT_IN_MEMSET:
+    case BUILT_IN_MEMSET_CHK:
+      dest = gimple_call_arg (call, 0);
+      len = gimple_call_arg (call, 2);
+      break;
+
+    /* And now the __atomic_* and __sync_* builtins.
+       These are handled differently from the classical memory
+       access builtins above.  */
+
+    case BUILT_IN_ATOMIC_ALWAYS_LOCK_FREE:
+    case BUILT_IN_ATOMIC_IS_LOCK_FREE:
+      is_store = false;
+
+      /* Fall through.  */
+
+    case BUILT_IN_SYNC_FETCH_AND_ADD_N:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_1:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_2:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_4:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_8:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_SUB_N:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_1:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_2:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_4:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_8:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_OR_N:
+    case BUILT_IN_SYNC_FETCH_AND_OR_1:
+    case BUILT_IN_SYNC_FETCH_AND_OR_2:
+    case BUILT_IN_SYNC_FETCH_AND_OR_4:
+    case BUILT_IN_SYNC_FETCH_AND_OR_8:
+    case BUILT_IN_SYNC_FETCH_AND_OR_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_AND_N:
+    case BUILT_IN_SYNC_FETCH_AND_AND_1:
+    case BUILT_IN_SYNC_FETCH_AND_AND_2:
+    case BUILT_IN_SYNC_FETCH_AND_AND_4:
+    case BUILT_IN_SYNC_FETCH_AND_AND_8:
+    case BUILT_IN_SYNC_FETCH_AND_AND_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_XOR_N:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_1:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_2:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_4:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_8:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_NAND_N:
+    case BUILT_IN_SYNC_FETCH_AND_NAND_1:
+    case BUILT_IN_SYNC_FETCH_AND_NAND_2:
+    case BUILT_IN_SYNC_FETCH_AND_NAND_4:
+    case BUILT_IN_SYNC_FETCH_AND_NAND_8:
+
+    case BUILT_IN_SYNC_ADD_AND_FETCH_N:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_1:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_2:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_4:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_8:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_SUB_AND_FETCH_N:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_1:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_2:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_4:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_8:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_OR_AND_FETCH_N:
+    case BUILT_IN_SYNC_OR_AND_FETCH_1:
+    case BUILT_IN_SYNC_OR_AND_FETCH_2:
+    case BUILT_IN_SYNC_OR_AND_FETCH_4:
+    case BUILT_IN_SYNC_OR_AND_FETCH_8:
+    case BUILT_IN_SYNC_OR_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_AND_AND_FETCH_N:
+    case BUILT_IN_SYNC_AND_AND_FETCH_1:
+    case BUILT_IN_SYNC_AND_AND_FETCH_2:
+    case BUILT_IN_SYNC_AND_AND_FETCH_4:
+    case BUILT_IN_SYNC_AND_AND_FETCH_8:
+    case BUILT_IN_SYNC_AND_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_XOR_AND_FETCH_N:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_1:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_2:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_4:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_8:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_NAND_AND_FETCH_N:
+    case BUILT_IN_SYNC_NAND_AND_FETCH_1:
+    case BUILT_IN_SYNC_NAND_AND_FETCH_2:
+    case BUILT_IN_SYNC_NAND_AND_FETCH_4:
+    case BUILT_IN_SYNC_NAND_AND_FETCH_8:
+
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_N:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_1:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_2:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_4:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_8:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_16:
+
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_N:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_1:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_2:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_4:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_8:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_16:
+
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_N:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_1:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_2:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_4:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_8:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_16:
+
+    case BUILT_IN_SYNC_LOCK_RELEASE_N:
+    case BUILT_IN_SYNC_LOCK_RELEASE_1:
+    case BUILT_IN_SYNC_LOCK_RELEASE_2:
+    case BUILT_IN_SYNC_LOCK_RELEASE_4:
+    case BUILT_IN_SYNC_LOCK_RELEASE_8:
+    case BUILT_IN_SYNC_LOCK_RELEASE_16:
+
+    case BUILT_IN_ATOMIC_TEST_AND_SET:
+    case BUILT_IN_ATOMIC_CLEAR:
+    case BUILT_IN_ATOMIC_EXCHANGE:
+    case BUILT_IN_ATOMIC_EXCHANGE_N:
+    case BUILT_IN_ATOMIC_EXCHANGE_1:
+    case BUILT_IN_ATOMIC_EXCHANGE_2:
+    case BUILT_IN_ATOMIC_EXCHANGE_4:
+    case BUILT_IN_ATOMIC_EXCHANGE_8:
+    case BUILT_IN_ATOMIC_EXCHANGE_16:
+
+    case BUILT_IN_ATOMIC_LOAD:
+    case BUILT_IN_ATOMIC_LOAD_N:
+    case BUILT_IN_ATOMIC_LOAD_1:
+    case BUILT_IN_ATOMIC_LOAD_2:
+    case BUILT_IN_ATOMIC_LOAD_4:
+    case BUILT_IN_ATOMIC_LOAD_8:
+    case BUILT_IN_ATOMIC_LOAD_16:
+
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_N:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_1:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_2:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_4:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_8:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_16:
+
+    case BUILT_IN_ATOMIC_STORE:
+    case BUILT_IN_ATOMIC_STORE_N:
+    case BUILT_IN_ATOMIC_STORE_1:
+    case BUILT_IN_ATOMIC_STORE_2:
+    case BUILT_IN_ATOMIC_STORE_4:
+    case BUILT_IN_ATOMIC_STORE_8:
+    case BUILT_IN_ATOMIC_STORE_16:
+
+    case BUILT_IN_ATOMIC_ADD_FETCH_N:
+    case BUILT_IN_ATOMIC_ADD_FETCH_1:
+    case BUILT_IN_ATOMIC_ADD_FETCH_2:
+    case BUILT_IN_ATOMIC_ADD_FETCH_4:
+    case BUILT_IN_ATOMIC_ADD_FETCH_8:
+    case BUILT_IN_ATOMIC_ADD_FETCH_16:
+
+    case BUILT_IN_ATOMIC_SUB_FETCH_N:
+    case BUILT_IN_ATOMIC_SUB_FETCH_1:
+    case BUILT_IN_ATOMIC_SUB_FETCH_2:
+    case BUILT_IN_ATOMIC_SUB_FETCH_4:
+    case BUILT_IN_ATOMIC_SUB_FETCH_8:
+    case BUILT_IN_ATOMIC_SUB_FETCH_16:
+
+    case BUILT_IN_ATOMIC_AND_FETCH_N:
+    case BUILT_IN_ATOMIC_AND_FETCH_1:
+    case BUILT_IN_ATOMIC_AND_FETCH_2:
+    case BUILT_IN_ATOMIC_AND_FETCH_4:
+    case BUILT_IN_ATOMIC_AND_FETCH_8:
+    case BUILT_IN_ATOMIC_AND_FETCH_16:
+
+    case BUILT_IN_ATOMIC_NAND_FETCH_N:
+    case BUILT_IN_ATOMIC_NAND_FETCH_1:
+    case BUILT_IN_ATOMIC_NAND_FETCH_2:
+    case BUILT_IN_ATOMIC_NAND_FETCH_4:
+    case BUILT_IN_ATOMIC_NAND_FETCH_8:
+    case BUILT_IN_ATOMIC_NAND_FETCH_16:
+
+    case BUILT_IN_ATOMIC_XOR_FETCH_N:
+    case BUILT_IN_ATOMIC_XOR_FETCH_1:
+    case BUILT_IN_ATOMIC_XOR_FETCH_2:
+    case BUILT_IN_ATOMIC_XOR_FETCH_4:
+    case BUILT_IN_ATOMIC_XOR_FETCH_8:
+    case BUILT_IN_ATOMIC_XOR_FETCH_16:
+
+    case BUILT_IN_ATOMIC_OR_FETCH_N:
+    case BUILT_IN_ATOMIC_OR_FETCH_1:
+    case BUILT_IN_ATOMIC_OR_FETCH_2:
+    case BUILT_IN_ATOMIC_OR_FETCH_4:
+    case BUILT_IN_ATOMIC_OR_FETCH_8:
+    case BUILT_IN_ATOMIC_OR_FETCH_16:
+
+    case BUILT_IN_ATOMIC_FETCH_ADD_N:
+    case BUILT_IN_ATOMIC_FETCH_ADD_1:
+    case BUILT_IN_ATOMIC_FETCH_ADD_2:
+    case BUILT_IN_ATOMIC_FETCH_ADD_4:
+    case BUILT_IN_ATOMIC_FETCH_ADD_8:
+    case BUILT_IN_ATOMIC_FETCH_ADD_16:
+
+    case BUILT_IN_ATOMIC_FETCH_SUB_N:
+    case BUILT_IN_ATOMIC_FETCH_SUB_1:
+    case BUILT_IN_ATOMIC_FETCH_SUB_2:
+    case BUILT_IN_ATOMIC_FETCH_SUB_4:
+    case BUILT_IN_ATOMIC_FETCH_SUB_8:
+    case BUILT_IN_ATOMIC_FETCH_SUB_16:
+
+    case BUILT_IN_ATOMIC_FETCH_AND_N:
+    case BUILT_IN_ATOMIC_FETCH_AND_1:
+    case BUILT_IN_ATOMIC_FETCH_AND_2:
+    case BUILT_IN_ATOMIC_FETCH_AND_4:
+    case BUILT_IN_ATOMIC_FETCH_AND_8:
+    case BUILT_IN_ATOMIC_FETCH_AND_16:
+
+    case BUILT_IN_ATOMIC_FETCH_NAND_N:
+    case BUILT_IN_ATOMIC_FETCH_NAND_1:
+    case BUILT_IN_ATOMIC_FETCH_NAND_2:
+    case BUILT_IN_ATOMIC_FETCH_NAND_4:
+    case BUILT_IN_ATOMIC_FETCH_NAND_8:
+    case BUILT_IN_ATOMIC_FETCH_NAND_16:
+
+    case BUILT_IN_ATOMIC_FETCH_XOR_N:
+    case BUILT_IN_ATOMIC_FETCH_XOR_1:
+    case BUILT_IN_ATOMIC_FETCH_XOR_2:
+    case BUILT_IN_ATOMIC_FETCH_XOR_4:
+    case BUILT_IN_ATOMIC_FETCH_XOR_8:
+    case BUILT_IN_ATOMIC_FETCH_XOR_16:
+
+    case BUILT_IN_ATOMIC_FETCH_OR_N:
+    case BUILT_IN_ATOMIC_FETCH_OR_1:
+    case BUILT_IN_ATOMIC_FETCH_OR_2:
+    case BUILT_IN_ATOMIC_FETCH_OR_4:
+    case BUILT_IN_ATOMIC_FETCH_OR_8:
+    case BUILT_IN_ATOMIC_FETCH_OR_16:
+      source0 = gimple_call_arg (call, 0);
+      break;
+
+    default:
+      /* The other memory access builtins are not instrumented in this
+	 function because they either don't have any length parameter,
+	 or their length parameter is just a limit.  */
+      break;
+    }
+
+  if (len != NULL_TREE)
+    {
+      is_store = (dest != NULL_TREE);
+
+      if (source0 != NULL_TREE)
+	instrument_mem_region_access (source0, len, iter,
+				      location, is_store);
+      if (source1 != NULL_TREE)
+	instrument_mem_region_access (source1, len, iter,
+				      location, is_store);
+      else if (dest != NULL_TREE)
+	instrument_mem_region_access (dest, len, iter,
+				      location, is_store);
+      return true;
+    }
+  else if (source0 != NULL_TREE)
+    instrument_derefs (iter, source0, location, is_store);
+  return false;
+}
+
+/*  Instrument the assignment statement ITER if it is subject to
+    instrumentation.  */
+
+static void
+instrument_assignment (gimple_stmt_iterator *iter)
+{
+  gimple s = gsi_stmt (*iter);
+
+  gcc_assert (gimple_assign_single_p (s));
+
+  instrument_derefs (iter, gimple_assign_lhs (s),
+		     gimple_location (s), true);
+  instrument_derefs (iter, gimple_assign_rhs1 (s),
+		     gimple_location (s), false);
+}
+
+/* Instrument the function call pointed to by the iterator ITER, if it
+   is subject to instrumentation.  At the moment, the only function
+   calls that are instrumented are some built-in functions that access
+   memory.  Look at maybe_instrument_builtin_call to learn more.  */
+
+static void
+maybe_instrument_call (gimple_stmt_iterator *iter)
+{
+  maybe_instrument_builtin_call (iter);
+}
+
 /* asan: this looks too complex. Can this be done simpler? */
 /* Transform
    1) Memory references.
@@ -686,13 +1191,12 @@ transform_statements (void)
       if (bb->index >= saved_last_basic_block) continue;
       for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
         {
-          gimple s = gsi_stmt (i);
-          if (!gimple_assign_single_p (s))
-	    continue;
-          instrument_derefs (&i, gimple_assign_lhs (s),
-                             gimple_location (s), true);
-          instrument_derefs (&i, gimple_assign_rhs1 (s),
-                             gimple_location (s), false);
+	  gimple s = gsi_stmt (i);
+
+	  if (gimple_assign_single_p (s))
+	    instrument_assignment (&i);
+	  else if (is_gimple_call (s))
+	    maybe_instrument_call (&i);
         }
     }
 }
-- 
1.7.11.7


-- 
		Dodji

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] [asan] Instrument built-in memory access function calls
  2012-10-23 16:03     ` Jakub Jelinek
  2012-10-23 16:13       ` Xinliang David Li
@ 2012-10-24 15:35       ` Dodji Seketeli
  2012-10-24 16:10         ` Jakub Jelinek
  1 sibling, 1 reply; 25+ messages in thread
From: Dodji Seketeli @ 2012-10-24 15:35 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Xinliang David Li, GCC Patches, Diego Novillo, Wei Mi

Jakub Jelinek <jakub@redhat.com> writes:

>> For 'strlen', can the memory check be done at the end of the string
>> using the returned length?
>
> Guess strlen is commonly expanded inline, so it would be worthwhile to check
> the shadow memory after the call (well, we could check the first byte
> before the call and the last one after the call).

How do I get the result of the (strlen) call in gimple?

-- 
		Dodji


* Re: [PATCH 2/3] [asan] Factorize condition insertion code out of build_check_stmt
  2012-10-24 15:11     ` Dodji Seketeli
@ 2012-10-24 15:44       ` Jakub Jelinek
  0 siblings, 0 replies; 25+ messages in thread
From: Jakub Jelinek @ 2012-10-24 15:44 UTC (permalink / raw)
  To: Dodji Seketeli; +Cc: GCC Patches, Diego Novillo, Xinliang David Li, Wei Mi

On Wed, Oct 24, 2012 at 04:46:05PM +0200, Dodji Seketeli wrote:
> Fixed.  Below is the updated patch.

Ok, thanks.

	Jakub


* Re: [PATCH 3/3] [asan] Instrument built-in memory access function calls
  2012-10-24 15:35       ` Dodji Seketeli
@ 2012-10-24 16:10         ` Jakub Jelinek
  2012-10-24 16:15           ` Dodji Seketeli
  0 siblings, 1 reply; 25+ messages in thread
From: Jakub Jelinek @ 2012-10-24 16:10 UTC (permalink / raw)
  To: Dodji Seketeli; +Cc: Xinliang David Li, GCC Patches, Diego Novillo, Wei Mi

On Wed, Oct 24, 2012 at 05:16:26PM +0200, Dodji Seketeli wrote:
> Jakub Jelinek <jakub@redhat.com> writes:
> 
> >> For 'strlen', can the memory check be done at the end of the string
> >> using the returned length?
> >
> > Guess strlen is commonly expanded inline, so it would be worthwhile to check
> > the shadow memory after the call (well, we could check the first byte
> > before the call and the last one after the call).
> 
> How do I get the result of the (strlen) call in gimple?

That is gimple_call_lhs (call_stmt).
So for instrumenting strlen, you want to amend:
  tmp = strlen (ptr);
as
  asan_addr_check (ptr);
  tmp = strlen (ptr);
  asan_addr_check (ptr + tmp);

	Jakub


* Re: [PATCH 3/3] [asan] Instrument built-in memory access function calls
  2012-10-24 16:10         ` Jakub Jelinek
@ 2012-10-24 16:15           ` Dodji Seketeli
  0 siblings, 0 replies; 25+ messages in thread
From: Dodji Seketeli @ 2012-10-24 16:15 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Xinliang David Li, GCC Patches, Diego Novillo, Wei Mi

Jakub Jelinek <jakub@redhat.com> writes:

> On Wed, Oct 24, 2012 at 05:16:26PM +0200, Dodji Seketeli wrote:
>> Jakub Jelinek <jakub@redhat.com> writes:
>> 
>> >> For 'strlen', can the memory check be done at the end of the string
>> >> using the returned length?
>> >
>> > Guess strlen is commonly expanded inline, so it would be worthwhile to check
>> > the shadow memory after the call (well, we could check the first byte
>> > before the call and the last one after the call).
>> 
>> How do I get the result of the (strlen) call in gimple?
>
> That is gimple_call_lhs (call_stmt).

OK, thank you.

> So for instrumenting strlen, you want to amend:
>   tmp = strlen (ptr);
> as
>   asan_addr_check (ptr);
>   tmp = strlen (ptr);
>   asan_addr_check (ptr + tmp);

I see, thanks.

-- 
		Dodji


* Re: [PATCH 3/3] [asan] Instrument built-in memory access function calls
  2012-10-24 15:27     ` Dodji Seketeli
@ 2012-10-24 16:46       ` Jakub Jelinek
  2012-10-25 21:46         ` Dodji Seketeli
  2012-10-29 15:15       ` Hans-Peter Nilsson
  1 sibling, 1 reply; 25+ messages in thread
From: Jakub Jelinek @ 2012-10-24 16:46 UTC (permalink / raw)
  To: Dodji Seketeli; +Cc: GCC Patches, Diego Novillo, Xinliang David Li, Wei Mi

On Wed, Oct 24, 2012 at 05:11:23PM +0200, Dodji Seketeli wrote:
> If I write exactly what you wrote here, I am getting an error for e.g:
> 
>     void
>     bar ()
>     {
>       char bar[1] = {0};
>       int n = 0;
> 
>       __builtin_memset (bar, 0, n);
>     }

I see, the problem is that build_check_stmt is initially called on an
iterator in an empty bb.  Your solution is fine.

> > On the other side, you IMHO want to handle here also __atomic_* and
> > __sync_* builtins (not by using instrument_mem_region_access, but
> > just instrument_derefs
> 
> Updated in the patch below.

    case BUILT_IN_ATOMIC_ALWAYS_LOCK_FREE:
    case BUILT_IN_ATOMIC_IS_LOCK_FREE:
I think these don't touch the memory at all (or not necessarily),
and IMHO you don't want to handle the BUILT_IN_*_N variants either,
those are just FE builtins that are lowered to the corresponding
_{1,2,4,8,16} variants.

> @@ -661,13 +702,477 @@ instrument_derefs (gimple_stmt_iterator *iter, tree t,
>    int volatilep = 0, unsignedp = 0;
>    get_inner_reference (t, &bitsize, &bitpos, &offset,
>  		       &mode, &unsignedp, &volatilep, false);
> -  if (bitpos != 0 || bitsize != size_in_bytes * BITS_PER_UNIT)
> +  if (bitpos % BITS_PER_UNIT || bitsize != size_in_bytes * BITS_PER_UNIT)
>      return;

Shouldn't that be bitpos % (size_in_bytes * BITS_PER_UNIT) ?
1 byte or 2 byte access at bitpos 80 is fine, but 4 byte access is not.
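A quick sketch of the intended test (naturally_aligned_p is a
hypothetical helper name, and BITS_PER_UNIT is assumed to be 8):

```c
/* Sketch of the proposed condition: an access of SIZE_IN_BYTES bytes
   starting at bit position BITPOS is only considered for
   instrumentation when it is naturally aligned for its size.  */
#define BITS_PER_UNIT 8

static int
naturally_aligned_p (int bitpos, int size_in_bytes)
{
  return bitpos % (size_in_bytes * BITS_PER_UNIT) == 0;
}
```

With that definition, a 1- or 2-byte access at bit position 80 passes
the test, while a 4-byte access at the same position does not.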

	Jakub


* Re: [PATCH 3/3] [asan] Instrument built-in memory access function calls
  2012-10-24 16:46       ` Jakub Jelinek
@ 2012-10-25 21:46         ` Dodji Seketeli
  2012-10-25 22:13           ` Jakub Jelinek
  0 siblings, 1 reply; 25+ messages in thread
From: Dodji Seketeli @ 2012-10-25 21:46 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: GCC Patches, Diego Novillo, Xinliang David Li, Wei Mi

Jakub Jelinek <jakub@redhat.com> writes:

>     case BUILT_IN_ATOMIC_ALWAYS_LOCK_FREE:
>     case BUILT_IN_ATOMIC_IS_LOCK_FREE:
> I think don't touch the memory at all (or not necessarily),
> and IMHO you don't want to handle the BUILT_IN_*_N variants either,
> those are just FE builtins that are lowered to the corresponding
> _{1,2,4,8,16} variants.

Fixed.

> 
> > @@ -661,13 +702,477 @@ instrument_derefs (gimple_stmt_iterator *iter, tree t,
> >    int volatilep = 0, unsignedp = 0;
> >    get_inner_reference (t, &bitsize, &bitpos, &offset,
> >  		       &mode, &unsignedp, &volatilep, false);
> > -  if (bitpos != 0 || bitsize != size_in_bytes * BITS_PER_UNIT)
> > +  if (bitpos % BITS_PER_UNIT || bitsize != size_in_bytes * BITS_PER_UNIT)
> >      return;
> 
> Shouldn't that be bitpos % (size_in_bytes * BITS_PER_UNIT) ?
> 1 byte or 2 byte access at bitpos 80 is fine, but 4 byte access is not.

Of course it should be that.  I have updated the patch.

Jakub Jelinek <jakub@redhat.com> writes:

> On Wed, Oct 24, 2012 at 05:16:26PM +0200, Dodji Seketeli wrote:
>> Jakub Jelinek <jakub@redhat.com> writes:
>> 
>> >> For 'strlen', can the memory check be done at the end of the string
>> >> using the returned length?
>> >
>> > Guess strlen is commonly expanded inline, so it would be worthwhile to check
>> > the shadow memory after the call (well, we could check the first byte
>> > before the call and the last one after the call).
>> 
>> How do I get the result of the (strlen) call in gimple?
>
> That is gimple_call_lhs (call_stmt).
> So for instrumenting strlen, you want to amend:
>   tmp = strlen (ptr);
> as
>   asan_addr_check (ptr);
>   tmp = strlen (ptr);
>   asan_addr_check (ptr + tmp);

Right.  I have done that in the updated patch.


So, to ease the review, I am re-posting the updated patch below, with
its amended introductory "cover letter" and ChangeLog.

Thanks.

This patch instruments many memory access patterns through builtins.

Basically, for a call like:

     __builtin_memset (from, 0, n_bytes);

the patch would only instrument the accesses at the beginning and at
the end of the memory region [from, from + n_bytes - 1].  This is the
strategy used by the LLVM implementation of asan.
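In plain C terms, the strategy amounts to the sketch below, where
check_byte is a hypothetical stand-in for the shadow-memory check that
the pass actually emits (the generated code reports an error instead
of returning):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the emitted shadow-memory check.  */
static void
check_byte (const void *p)
{
  (void) p;  /* the shadow-memory lookup would happen here */
}

/* What the instrumented __builtin_memset (from, c, n_bytes) amounts
   to: only the first and the last byte of the accessed region are
   checked, and nothing is checked when the length is zero.  */
static void *
instrumented_memset (void *from, int c, size_t n_bytes)
{
  if (n_bytes != 0)
    {
      check_byte (from);                              /* first byte */
      check_byte ((const char *) from + n_bytes - 1); /* last byte  */
    }
  return memset (from, c, n_bytes);
}
```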

This instrumentation is done for all the memory access builtin
functions that expose a well-specified memory region -- one that
explicitly states the number of bytes accessed in the region.

A special treatment is used for __builtin_strlen.  The patch
instruments the access to the first byte of its argument, as well as
the access to the byte (of the argument) at the offset returned by
strlen.
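Again in plain C, and again with a hypothetical check_byte standing in
for the emitted shadow-memory check, the strlen treatment is
equivalent to:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the emitted shadow-memory check.  */
static void
check_byte (const char *p)
{
  (void) p;
}

/* What the instrumented l = __builtin_strlen (s) amounts to: the
   first byte of the argument is checked before the call, and the
   terminating NUL -- at the offset the call returns -- is checked
   after it.  */
static size_t
instrumented_strlen (const char *s)
{
  check_byte (s);        /* access to the first byte */
  size_t len = strlen (s);
  check_byte (s + len);  /* access to the byte at the returned offset */
  return len;
}
```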

For the __sync_* and __atomic_* calls the patch instruments the access
to the bytes pointed to by the argument.

While doing this, I have added a new parameter to build_check_stmt to
decide whether to insert the instrumentation code before or after the
statement iterator.  This allows us to do away with the
gsi_{next,prev} dance we were doing in the callers of this function.

Tested by running cc1 -fasan on variations of simple programs like:

    int
    foo ()
    {
      char foo[10] = {0};

      foo[0] = 't';
      foo[1] = 'e';
      foo[2] = 's';
      foo[3] = 't';
      int l = __builtin_strlen (foo);
      int n = sizeof (foo);
      __builtin_memset (&foo[4], 0, n - 4);
      __sync_fetch_and_add (&foo[11], 1);

      return l;
    }

and by staring at the gimple output, which for this function is:

    ;; Function foo (foo, funcdef_no=0, decl_uid=1714, cgraph_uid=0)

    foo ()
    {
      int n;
      int l;
      char foo[10];
      int D.1723;
      long unsigned int D.1722;
      int D.1721;
      long unsigned int D.1720;
      long unsigned int _1;
      int _4;
      long unsigned int _5;
      int _6;
      char * _7;
      unsigned long _8;
      unsigned long _9;
      unsigned long _10;
      signed char * _11;
      signed char _12;
      _Bool _13;
      unsigned long _14;
      signed char _15;
      _Bool _16;
      _Bool _17;
      char * _18;
      unsigned long _19;
      unsigned long _20;
      unsigned long _21;
      signed char * _22;
      signed char _23;
      _Bool _24;
      unsigned long _25;
      signed char _26;
      _Bool _27;
      _Bool _28;
      char * _29;
      unsigned long _30;
      unsigned long _31;
      unsigned long _32;
      signed char * _33;
      signed char _34;
      _Bool _35;
      unsigned long _36;
      signed char _37;
      _Bool _38;
      _Bool _39;
      char * _40;
      unsigned long _41;
      unsigned long _42;
      unsigned long _43;
      signed char * _44;
      signed char _45;
      _Bool _46;
      unsigned long _47;
      signed char _48;
      _Bool _49;
      _Bool _50;
      char * _51;
      unsigned long _52;
      unsigned long _53;
      unsigned long _54;
      signed char * _55;
      signed char _56;
      _Bool _57;
      unsigned long _58;
      signed char _59;
      _Bool _60;
      _Bool _61;
      long unsigned int _62;
      char[10] * _63;
      unsigned long _64;
      unsigned long _65;
      unsigned long _66;
      signed char * _67;
      signed char _68;
      _Bool _69;
      unsigned long _70;
      signed char _71;
      _Bool _72;
      _Bool _73;
      char * _74;
      unsigned long _75;
      unsigned long _76;
      unsigned long _77;
      signed char * _78;
      signed char _79;
      _Bool _80;
      unsigned long _81;
      signed char _82;
      _Bool _83;
      _Bool _84;
      long unsigned int _85;
      long unsigned int _86;
      char * _87;
      char * _88;
      unsigned long _89;
      unsigned long _90;
      unsigned long _91;
      signed char * _92;
      signed char _93;
      _Bool _94;
      unsigned long _95;
      signed char _96;
      _Bool _97;
      _Bool _98;
      char * _99;
      unsigned long _100;
      unsigned long _101;
      unsigned long _102;
      signed char * _103;
      signed char _104;
      _Bool _105;
      unsigned long _106;
      signed char _107;
      _Bool _108;
      _Bool _109;

      <bb 2>:
      foo = {};
      _7 = &foo[0];
      _8 = (unsigned long) _7;
      _9 = _8 >> 3;
      _10 = _9 + 17592186044416;
      _11 = (signed char *) _10;
      _12 = *_11;
      _13 = _12 != 0;
      _14 = _8 & 7;
      _15 = (signed char) _14;
      _16 = _15 >= _12;
      _17 = _13 & _16;
      if (_17 != 0)
	goto <bb 5>;
      else
	goto <bb 4>;

      <bb 5>:
      __asan_report_store1 (_8);

      <bb 4>:
      foo[0] = 116;
      _18 = &foo[1];
      _19 = (unsigned long) _18;
      _20 = _19 >> 3;
      _21 = _20 + 17592186044416;
      _22 = (signed char *) _21;
      _23 = *_22;
      _24 = _23 != 0;
      _25 = _19 & 7;
      _26 = (signed char) _25;
      _27 = _26 >= _23;
      _28 = _24 & _27;
      if (_28 != 0)
	goto <bb 7>;
      else
	goto <bb 6>;

      <bb 7>:
      __asan_report_store1 (_19);

      <bb 6>:
      foo[1] = 101;
      _29 = &foo[2];
      _30 = (unsigned long) _29;
      _31 = _30 >> 3;
      _32 = _31 + 17592186044416;
      _33 = (signed char *) _32;
      _34 = *_33;
      _35 = _34 != 0;
      _36 = _30 & 7;
      _37 = (signed char) _36;
      _38 = _37 >= _34;
      _39 = _35 & _38;
      if (_39 != 0)
	goto <bb 9>;
      else
	goto <bb 8>;

      <bb 9>:
      __asan_report_store1 (_30);

      <bb 8>:
      foo[2] = 115;
      _40 = &foo[3];
      _41 = (unsigned long) _40;
      _42 = _41 >> 3;
      _43 = _42 + 17592186044416;
      _44 = (signed char *) _43;
      _45 = *_44;
      _46 = _45 != 0;
      _47 = _41 & 7;
      _48 = (signed char) _47;
      _49 = _48 >= _45;
      _50 = _46 & _49;
      if (_50 != 0)
	goto <bb 11>;
      else
	goto <bb 10>;

      <bb 11>:
      __asan_report_store1 (_41);

      <bb 10>:
      foo[3] = 116;
      _51 = (char *) &foo;
      _52 = (unsigned long) _51;
      _53 = _52 >> 3;
      _54 = _53 + 17592186044416;
      _55 = (signed char *) _54;
      _56 = *_55;
      _57 = _56 != 0;
      _58 = _52 & 7;
      _59 = (signed char) _58;
      _60 = _59 >= _56;
      _61 = _57 & _60;
      if (_61 != 0)
	goto <bb 13>;
      else
	goto <bb 12>;

      <bb 13>:
      __asan_report_load1 (_52);

      <bb 12>:
      _1 = __builtin_strlen (&foo);
      if (_1 > 0)
	goto <bb 15>;
      else
	goto <bb 14>;

      <bb 15>:
      _62 = _1 - 1;
      _63 = _51 + _62;
      _64 = (unsigned long) _63;
      _65 = _64 >> 3;
      _66 = _65 + 17592186044416;
      _67 = (signed char *) _66;
      _68 = *_67;
      _69 = _68 != 0;
      _70 = _64 & 7;
      _71 = (signed char) _70;
      _72 = _71 >= _68;
      _73 = _69 & _72;
      if (_73 != 0)
	goto <bb 17>;
      else
	goto <bb 16>;

      <bb 17>:
      __asan_report_load1 (_64);

      <bb 16>:

      <bb 14>:
      l_2 = (int) _1;
      n_3 = 10;
      _4 = n_3 + -4;
      _5 = (long unsigned int) _4;
      if (_5 != 0)
	goto <bb 19>;
      else
	goto <bb 18>;

      <bb 19>:
      _74 = &foo[4];
      _75 = (unsigned long) _74;
      _76 = _75 >> 3;
      _77 = _76 + 17592186044416;
      _78 = (signed char *) _77;
      _79 = *_78;
      _80 = _79 != 0;
      _81 = _75 & 7;
      _82 = (signed char) _81;
      _83 = _82 >= _79;
      _84 = _80 & _83;
      _85 = _5;
      _86 = _85 - 1;
      _87 = &foo[4];
      _88 = _87 + _86;
      _89 = (unsigned long) _88;
      _90 = _89 >> 3;
      _91 = _90 + 17592186044416;
      _92 = (signed char *) _91;
      _93 = *_92;
      _94 = _93 != 0;
      _95 = _89 & 7;
      _96 = (signed char) _95;
      _97 = _96 >= _93;
      _98 = _94 & _97;
      if (_98 != 0)
	goto <bb 23>;
      else
	goto <bb 22>;

      <bb 23>:
      __asan_report_store1 (_89);

      <bb 22>:
      if (_84 != 0)
	goto <bb 21>;
      else
	goto <bb 20>;

      <bb 21>:
      __asan_report_store1 (_75);

      <bb 20>:

      <bb 18>:
      __builtin_memset (&foo[4], 0, _5);
      _99 = &foo[11];
      _100 = (unsigned long) _99;
      _101 = _100 >> 3;
      _102 = _101 + 17592186044416;
      _103 = (signed char *) _102;
      _104 = *_103;
      _105 = _104 != 0;
      _106 = _100 & 7;
      _107 = (signed char) _106;
      _108 = _107 >= _104;
      _109 = _105 & _108;
      if (_109 != 0)
	goto <bb 25>;
      else
	goto <bb 24>;

      <bb 25>:
      __asan_report_store1 (_100);

      <bb 24>:
      __sync_fetch_and_add_1 (&foo[11], 1);
      _6 = l_2;
      foo ={v} {CLOBBER};

    <L1>:
      return _6;

    }

    ;; Function _GLOBAL__sub_I_00099_0_foo (_GLOBAL__sub_I_00099_0_foo, funcdef_no=1, decl_uid=1750, cgraph_uid=4)

    _GLOBAL__sub_I_00099_0_foo ()
    {
      <bb 2>:
      __asan_init ();
      return;

    }

gcc/
	* asan.c (insert_if_then_before_iter, insert_if_then_after_iter,
	instrument_mem_region_access, instrument_strlen_call,
	maybe_instrument_builtin_call, maybe_instrument_call): New static
	functions.
	(create_cond_insert_point): Renamed
	create_cond_insert_point_before_iter into this.  Add a new
	parameter to decide whether to insert the condition before or
	after the statement iterator.
	(build_check_stmt): Adjust for the new create_cond_insert_point.
	Add a new parameter to decide whether to add the instrumentation
	code before or after the statement iterator.
	(instrument_assignment): Factorize from ...
	(transform_statements): ... here.  Use maybe_instrument_call to
	instrument builtin function calls as well.
	(instrument_derefs): Adjust for the new parameter of
	build_check_stmt.  Fix detection of bit-field access.
---
 gcc/asan.c | 667 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 638 insertions(+), 29 deletions(-)

diff --git a/gcc/asan.c b/gcc/asan.c
index 39e77e6..89750b0 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -398,9 +398,9 @@ asan_init_func (void)
 #define PROB_ALWAYS		(REG_BR_PROB_BASE)
 
 /* Split the current basic block and create a condition statement
-   insertion point right before the statement pointed to by ITER.
-   Return an iterator to the point at which the caller might safely
-   insert the condition statement.
+   insertion point right before or after the statement pointed to by
+   ITER.  Return an iterator to the point at which the caller might
+   safely insert the condition statement.
 
    THEN_BLOCK must be set to the address of an uninitialized instance
    of basic_block.  The function will then set *THEN_BLOCK to the
@@ -414,18 +414,21 @@ asan_init_func (void)
    statements starting from *ITER, and *THEN_BLOCK is a new empty
    block.
 
-   *ITER is adjusted to still point to the same statement it was
-   *pointing to initially.  */
+   *ITER is adjusted to always point to the first statement of the
+    basic block *FALLTHROUGH_BLOCK.  That statement is the same as
+    what ITER was pointing to prior to calling this function, if
+    BEFORE_P is true; otherwise, it is the statement following it.  */
 
 static gimple_stmt_iterator
-create_cond_insert_point_before_iter (gimple_stmt_iterator *iter,
-				      bool then_more_likely_p,
-				      basic_block *then_block,
-				      basic_block *fallthrough_block)
+create_cond_insert_point (gimple_stmt_iterator *iter,
+			  bool before_p,
+			  bool then_more_likely_p,
+			  basic_block *then_block,
+			  basic_block *fallthrough_block)
 {
   gimple_stmt_iterator gsi = *iter;
 
-  if (!gsi_end_p (gsi))
+  if (!gsi_end_p (gsi) && before_p)
     gsi_prev (&gsi);
 
   basic_block cur_bb = gsi_bb (*iter);
@@ -466,18 +469,93 @@ create_cond_insert_point_before_iter (gimple_stmt_iterator *iter,
   return gsi_last_bb (cond_bb);
 }
 
+/* Insert an if condition followed by a 'then block' right before the
+   statement pointed to by ITER.  The fallthrough block -- which is the
+   else block of the condition as well as the destination of the
+   outgoing edge of the 'then block' -- starts with the statement
+   pointed to by ITER.
+
+   COND is the condition of the if.  
+
+   If THEN_MORE_LIKELY_P is true, the probability of the edge to the
+   'then block' is higher than the probability of the edge to the
+   fallthrough block.
+
+   Upon completion of the function, *THEN_BB is set to the newly
+   inserted 'then block' and similarly, *FALLTHROUGH_BB is set to the
+   fallthrough block.
+
+   *ITER is adjusted to still point to the same statement it was
+   pointing to initially.  */
+
+static void
+insert_if_then_before_iter (gimple cond,
+			    gimple_stmt_iterator *iter,
+			    bool then_more_likely_p,
+			    basic_block *then_bb,
+			    basic_block *fallthrough_bb)
+{
+  gimple_stmt_iterator cond_insert_point =
+    create_cond_insert_point (iter,
+			      /*before_p=*/true,
+			      then_more_likely_p,
+			      then_bb,
+			      fallthrough_bb);
+  gsi_insert_after (&cond_insert_point, cond, GSI_NEW_STMT);
+}
+
+/* Insert an if condition followed by a 'then block' right after the
+   statement pointed to by ITER.  The fallthrough block -- which is
+   the else block of the condition as well as the destination of the
+   outgoing edge of the 'then block' -- starts with the statement
+   pointed to by ITER.
+
+   COND is the condition of the if.  
+
+   If THEN_MORE_LIKELY_P is true, the probability of the edge to the
+   'then block' is higher than the probability of the edge to the
+   fallthrough block.
+
+   Upon completion of the function, *THEN_BB is set to the newly
+   inserted 'then block' and similarly, *FALLTHROUGH_BB is set to the
+   fallthrough block.
+
+   *ITER is adjusted to point to the statement following the one it
+   was pointing to prior to calling this function.  */
+
+static void
+insert_if_then_after_iter (gimple cond,
+			   gimple_stmt_iterator *iter,
+			   bool then_more_likely_p,
+			   basic_block *then_bb,
+			   basic_block *fallthrough_bb)
+{
+  gimple_stmt_iterator cond_insert_point =
+    create_cond_insert_point (iter,
+			      /*before_p=*/false,
+			      then_more_likely_p,
+			      then_bb,
+			      fallthrough_bb);
+  gsi_insert_after (&cond_insert_point, cond, GSI_NEW_STMT);
+}
+
 /* Instrument the memory access instruction BASE.  Insert new
-   statements before ITER.
+   statements before or after ITER.
 
    Note that the memory access represented by BASE can be either an
    SSA_NAME, or a non-SSA expression.  LOCATION is the source code
    location.  IS_STORE is TRUE for a store, FALSE for a load.
-   SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
+   BEFORE_P is TRUE for inserting the instrumentation code before
+   ITER, FALSE for inserting it after ITER.  SIZE_IN_BYTES is one of
+   1, 2, 4, 8, 16.
+
+   If BEFORE_P is TRUE, *ITER is arranged to still point to the
+   statement it was pointing to prior to calling this function,
+   otherwise, it points to the statement logically following it.  */
 
 static void
-build_check_stmt (tree base, gimple_stmt_iterator *iter,
-                  location_t location, bool is_store,
-		  int size_in_bytes)
+build_check_stmt (location_t location, tree base, gimple_stmt_iterator *iter,
+		  bool before_p, bool is_store, int size_in_bytes)
 {
   gimple_stmt_iterator gsi;
   basic_block then_bb, else_bb;
@@ -491,10 +569,10 @@ build_check_stmt (tree base, gimple_stmt_iterator *iter,
 
   /* Get an iterator on the point where we can add the condition
      statement for the instrumentation.  */
-  gsi = create_cond_insert_point_before_iter (iter,
-					      /*then_more_likely_p=*/false,
-					      &then_bb,
-					      &else_bb);
+  gsi = create_cond_insert_point (iter, before_p,
+				  /*then_more_likely_p=*/false,
+				  &then_bb,
+				  &else_bb);
 
   base = unshare_expr (base);
 
@@ -626,7 +704,7 @@ build_check_stmt (tree base, gimple_stmt_iterator *iter,
 
 /* If T represents a memory access, add instrumentation code before ITER.
    LOCATION is source code location.
-   IS_STORE is either 1 (for a store) or 0 (for a load).  */
+   IS_STORE is either TRUE (for a store) or FALSE (for a load).  */
 
 static void
 instrument_derefs (gimple_stmt_iterator *iter, tree t,
@@ -661,11 +739,543 @@ instrument_derefs (gimple_stmt_iterator *iter, tree t,
   int volatilep = 0, unsignedp = 0;
   get_inner_reference (t, &bitsize, &bitpos, &offset,
 		       &mode, &unsignedp, &volatilep, false);
-  if (bitpos != 0 || bitsize != size_in_bytes * BITS_PER_UNIT)
+  if (bitpos % (size_in_bytes * BITS_PER_UNIT)
+      || bitsize != size_in_bytes * BITS_PER_UNIT)
     return;
 
   base = build_fold_addr_expr (t);
-  build_check_stmt (base, iter, location, is_store, size_in_bytes);
+  build_check_stmt (location, base, iter, /*before_p=*/true,
+		    is_store, size_in_bytes);
+}
+
+/* Instrument an access to a contiguous memory region that starts at
+   the address pointed to by BASE, over a length of LEN units of
+   sizeof (*BASE) bytes.  ITER points to the instruction before
+   which the instrumentation instructions must be inserted.  LOCATION
+   is the source location that the instrumentation instructions must
+   have.  If IS_STORE is true, then the memory access is a store;
+   otherwise, it's a load.  */
+
+static void
+instrument_mem_region_access (tree base, tree len,
+			      gimple_stmt_iterator *iter,
+			      location_t location, bool is_store)
+{
+  if (integer_zerop (len))
+    return;
+
+  gimple_stmt_iterator gsi = *iter;
+
+  basic_block fallthrough_bb = NULL, then_bb = NULL;
+  if (!is_gimple_constant (len))
+    {
+      /* So, the length of the memory area to asan-protect is
+	 non-constant.  Let's guard the generated instrumentation code
+	 like:
+
+	 if (len != 0)
+	   {
+	     //asan instrumentation code goes here.
+           }
+	   // fallthrough instructions, starting with *ITER.  */
+
+      gimple g = gimple_build_cond (NE_EXPR,
+				    len,
+				    build_int_cst (TREE_TYPE (len), 0),
+				    NULL_TREE, NULL_TREE);
+      gimple_set_location (g, location);
+      insert_if_then_before_iter (g, iter, /*then_more_likely_p=*/true,
+				  &then_bb, &fallthrough_bb);
+      /* Note that fallthrough_bb starts with the statement that was
+	 pointed to by ITER.  */
+
+      /* The 'then block' of the 'if (len != 0)' condition is where
+	 we'll generate the asan instrumentation code now.  */
+      gsi = gsi_start_bb (then_bb);
+    }
+
+  /* Instrument the beginning of the memory region to be accessed,
+     and arrange for the rest of the instrumentation code to be
+     inserted in the then block *after* the current gsi.  */
+  build_check_stmt (location, base, &gsi, /*before_p=*/true, is_store, 1);
+
+  if (then_bb)
+    /* We are in the case where the length of the region is not
+       constant; so instrumentation code is being generated in the
+       'then block' of the 'if (len != 0)' condition.  Let's arrange
+       for the subsequent instrumentation statements to go in the
+       'then block'.  */
+    gsi = gsi_last_bb (then_bb);
+  else
+    *iter = gsi;
+
+  /* We want to instrument the access at the end of the memory region,
+     which is at (base + len - 1).  */
+
+  /* offset = len - 1;  */
+  len = unshare_expr (len);
+  gimple offset =
+    gimple_build_assign_with_ops (TREE_CODE (len),
+				  make_ssa_name (TREE_TYPE (len), NULL),
+				  len, NULL);
+  gimple_set_location (offset, location);
+  gsi_insert_before (&gsi, offset, GSI_NEW_STMT);
+
+  offset =
+    gimple_build_assign_with_ops (MINUS_EXPR,
+				  make_ssa_name (size_type_node, NULL),
+				  gimple_assign_lhs (offset),
+				  build_int_cst (size_type_node, 1));
+  gimple_set_location (offset, location);
+  gsi_insert_after (&gsi, offset, GSI_NEW_STMT);
+
+  /* _1 = base;  */
+  base = unshare_expr (base);
+  gimple region_end =
+    gimple_build_assign_with_ops (TREE_CODE (base),
+				  make_ssa_name (TREE_TYPE (base), NULL),
+				  base, NULL);
+  gimple_set_location (region_end, location);
+  gsi_insert_after (&gsi, region_end, GSI_NEW_STMT);
+
+  /* _2 = _1 + offset;  */
+  region_end =
+    gimple_build_assign_with_ops (POINTER_PLUS_EXPR,
+				  make_ssa_name (TREE_TYPE (base), NULL),
+				  gimple_assign_lhs (region_end), 
+				  gimple_assign_lhs (offset));
+  gimple_set_location (region_end, location);
+  gsi_insert_after (&gsi, region_end, GSI_NEW_STMT);
+
+  /* instrument access at _2;  */
+  build_check_stmt (location, gimple_assign_lhs (region_end),
+		    &gsi, /*before_p=*/false, is_store, 1);
+}
+
+/* Instrument the strlen builtin call pointed to by ITER.
+
+   This function instruments the access to the first byte of the
+   argument, right before the call.  After the call it instruments the
+   access to the last byte of the argument; it uses the result of the
+   call to deduce the offset of that last byte.  */
+
+static void
+instrument_strlen_call (gimple_stmt_iterator *iter)
+{
+  gimple call = gsi_stmt (*iter);
+  gcc_assert (is_gimple_call (call));
+
+  tree callee = gimple_call_fndecl (call);
+  gcc_assert (is_builtin_fn (callee)
+	      && DECL_BUILT_IN_CLASS (callee) == BUILT_IN_NORMAL
+	      && DECL_FUNCTION_CODE (callee) == BUILT_IN_STRLEN);
+
+  location_t location = gimple_location (call);
+  tree str_arg = gimple_call_arg (call, 0);
+
+  /* Instrument the access to the first byte of STR_ARG, i.e.:
+
+     _1 = str_arg; instrument (_1); */
+  gimple str_arg_ssa =
+    gimple_build_assign_with_ops (NOP_EXPR,
+				  make_ssa_name (build_pointer_type
+						 (char_type_node), NULL),
+				  str_arg, NULL);
+  gimple_set_location (str_arg_ssa, location);
+  gimple_stmt_iterator gsi = *iter;
+  gsi_insert_before (&gsi, str_arg_ssa, GSI_NEW_STMT);
+  build_check_stmt (location, gimple_assign_lhs (str_arg_ssa), &gsi,
+		    /*before_p=*/false, /*is_store=*/false, 1);
+
+  /* If we initially had an instruction like:
+
+	 int n = strlen (str)
+
+     we now want to instrument the access to str[n-1], after the
+     instruction above, if n is greater than 0.  */
+
+  /* So let's build the access to str[n-1], in a if (n > 0) {} block.  */
+
+  tree len = gimple_call_lhs (call);
+
+  /*  if (len > 0)  */
+  gimple cond = gimple_build_cond (GT_EXPR, len,
+				   build_int_cst (size_type_node, 0),
+				   NULL, NULL);
+  basic_block then_bb, fallthrough_bb;
+  insert_if_then_after_iter (cond, &gsi, /*then_more_likely_p=*/true,
+			     &then_bb, &fallthrough_bb);
+  /*  fallthrough_bb starts with the statement that gsi pointed to
+      initially; that is, at the statement that comes after the call
+      to strlen.  The instrumentation code is going to be emitted in
+      then_bb.  */
+
+  gsi = gsi_start_bb (then_bb);
+
+  /* _3 = _2 - 1;  */
+  gimple stmt =
+    gimple_build_assign_with_ops (MINUS_EXPR,
+				  make_ssa_name (size_type_node, NULL),
+				  len,
+				  build_int_cst (size_type_node, 1));
+  gimple_set_location (stmt, location);
+  gsi_insert_after (&gsi, stmt, GSI_NEW_STMT);
+
+  /* really build the access to str[n-1] and instrument it now.  That
+     is, access through the pointer_plus expr: (_1 + _3).  */
+  stmt =
+    gimple_build_assign_with_ops (POINTER_PLUS_EXPR,
+				  make_ssa_name (TREE_TYPE (str_arg),
+						 NULL),
+				  gimple_assign_lhs (str_arg_ssa),
+				  gimple_assign_lhs (stmt));
+  gimple_set_location (stmt, location);
+  gsi_insert_after (&gsi, stmt, GSI_NEW_STMT);
+
+  build_check_stmt (location, gimple_assign_lhs (stmt), &gsi,
+		    /*before_p=*/false, /*is_store=*/false, 1);
+
+  /* Ensure that ITER points to the statement logically following the
+     one it was initially pointing to.  */
+  *iter = gsi_start_bb (fallthrough_bb);
+}
+
+/* If the statement pointed to by the iterator ITER is a call to a
+   built-in memory access function, instrument it and return true.
+   Otherwise, return false.  */
+
+static bool
+maybe_instrument_builtin_call (gimple_stmt_iterator *iter)
+{
+  gimple call = gsi_stmt (*iter);
+  location_t location = gimple_location (call);
+
+  if (!is_gimple_call (call))
+    return false;
+
+  tree callee = gimple_call_fndecl (call);
+
+  if (!is_builtin_fn (callee)
+      || DECL_BUILT_IN_CLASS (callee) != BUILT_IN_NORMAL)
+    return false;
+
+  tree source0 = NULL_TREE, source1 = NULL_TREE,
+    dest = NULL_TREE, len = NULL_TREE;
+  bool is_store = true;
+
+  switch (DECL_FUNCTION_CODE (callee))
+    {
+      /* (s, s, n) style memops.  */
+    case BUILT_IN_BCMP:
+    case BUILT_IN_MEMCMP:
+      len = gimple_call_arg (call, 2);
+      source0 = gimple_call_arg (call, 0);
+      source1 = gimple_call_arg (call, 1);
+      break;
+
+      /* (src, dest, n) style memops.  */
+    case BUILT_IN_BCOPY:
+      len = gimple_call_arg (call, 2);
+      source0 = gimple_call_arg (call, 0);
+      dest = gimple_call_arg (call, 1);
+      break;
+
+      /* (dest, src, n) style memops.  */
+    case BUILT_IN_MEMCPY:
+    case BUILT_IN_MEMCPY_CHK:
+    case BUILT_IN_MEMMOVE:
+    case BUILT_IN_MEMMOVE_CHK:
+    case BUILT_IN_MEMPCPY:
+    case BUILT_IN_MEMPCPY_CHK:
+      dest = gimple_call_arg (call, 0);
+      source0 = gimple_call_arg (call, 1);
+      len = gimple_call_arg (call, 2);
+      break;
+
+      /* (dest, n) style memops.  */
+    case BUILT_IN_BZERO:
+      dest = gimple_call_arg (call, 0);
+      len = gimple_call_arg (call, 1);
+      break;
+
+      /* (dest, x, n) style memops.  */
+    case BUILT_IN_MEMSET:
+    case BUILT_IN_MEMSET_CHK:
+      dest = gimple_call_arg (call, 0);
+      len = gimple_call_arg (call, 2);
+      break;
+
+    case BUILT_IN_STRLEN:
+      instrument_strlen_call (iter);
+      return true;
+
+    /* And now the __atomic* and __sync builtins.
+       These are handled differently from the classical memory
+       access builtins above.  */
+
+    case BUILT_IN_SYNC_FETCH_AND_ADD_1:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_2:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_4:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_8:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_SUB_1:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_2:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_4:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_8:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_OR_1:
+    case BUILT_IN_SYNC_FETCH_AND_OR_2:
+    case BUILT_IN_SYNC_FETCH_AND_OR_4:
+    case BUILT_IN_SYNC_FETCH_AND_OR_8:
+    case BUILT_IN_SYNC_FETCH_AND_OR_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_AND_1:
+    case BUILT_IN_SYNC_FETCH_AND_AND_2:
+    case BUILT_IN_SYNC_FETCH_AND_AND_4:
+    case BUILT_IN_SYNC_FETCH_AND_AND_8:
+    case BUILT_IN_SYNC_FETCH_AND_AND_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_XOR_1:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_2:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_4:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_8:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_NAND_1:
+    case BUILT_IN_SYNC_FETCH_AND_NAND_2:
+    case BUILT_IN_SYNC_FETCH_AND_NAND_4:
+    case BUILT_IN_SYNC_FETCH_AND_NAND_8:
+
+    case BUILT_IN_SYNC_ADD_AND_FETCH_1:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_2:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_4:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_8:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_SUB_AND_FETCH_1:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_2:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_4:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_8:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_OR_AND_FETCH_1:
+    case BUILT_IN_SYNC_OR_AND_FETCH_2:
+    case BUILT_IN_SYNC_OR_AND_FETCH_4:
+    case BUILT_IN_SYNC_OR_AND_FETCH_8:
+    case BUILT_IN_SYNC_OR_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_AND_AND_FETCH_1:
+    case BUILT_IN_SYNC_AND_AND_FETCH_2:
+    case BUILT_IN_SYNC_AND_AND_FETCH_4:
+    case BUILT_IN_SYNC_AND_AND_FETCH_8:
+    case BUILT_IN_SYNC_AND_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_XOR_AND_FETCH_1:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_2:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_4:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_8:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_NAND_AND_FETCH_1:
+    case BUILT_IN_SYNC_NAND_AND_FETCH_2:
+    case BUILT_IN_SYNC_NAND_AND_FETCH_4:
+    case BUILT_IN_SYNC_NAND_AND_FETCH_8:
+
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_1:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_2:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_4:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_8:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_16:
+
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_1:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_2:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_4:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_8:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_16:
+
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_1:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_2:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_4:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_8:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_16:
+
+    case BUILT_IN_SYNC_LOCK_RELEASE_1:
+    case BUILT_IN_SYNC_LOCK_RELEASE_2:
+    case BUILT_IN_SYNC_LOCK_RELEASE_4:
+    case BUILT_IN_SYNC_LOCK_RELEASE_8:
+    case BUILT_IN_SYNC_LOCK_RELEASE_16:
+
+    case BUILT_IN_ATOMIC_TEST_AND_SET:
+    case BUILT_IN_ATOMIC_CLEAR:
+    case BUILT_IN_ATOMIC_EXCHANGE:
+    case BUILT_IN_ATOMIC_EXCHANGE_1:
+    case BUILT_IN_ATOMIC_EXCHANGE_2:
+    case BUILT_IN_ATOMIC_EXCHANGE_4:
+    case BUILT_IN_ATOMIC_EXCHANGE_8:
+    case BUILT_IN_ATOMIC_EXCHANGE_16:
+
+    case BUILT_IN_ATOMIC_LOAD:
+    case BUILT_IN_ATOMIC_LOAD_1:
+    case BUILT_IN_ATOMIC_LOAD_2:
+    case BUILT_IN_ATOMIC_LOAD_4:
+    case BUILT_IN_ATOMIC_LOAD_8:
+    case BUILT_IN_ATOMIC_LOAD_16:
+
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_1:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_2:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_4:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_8:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_16:
+
+    case BUILT_IN_ATOMIC_STORE:
+    case BUILT_IN_ATOMIC_STORE_1:
+    case BUILT_IN_ATOMIC_STORE_2:
+    case BUILT_IN_ATOMIC_STORE_4:
+    case BUILT_IN_ATOMIC_STORE_8:
+    case BUILT_IN_ATOMIC_STORE_16:
+
+    case BUILT_IN_ATOMIC_ADD_FETCH_1:
+    case BUILT_IN_ATOMIC_ADD_FETCH_2:
+    case BUILT_IN_ATOMIC_ADD_FETCH_4:
+    case BUILT_IN_ATOMIC_ADD_FETCH_8:
+    case BUILT_IN_ATOMIC_ADD_FETCH_16:
+
+    case BUILT_IN_ATOMIC_SUB_FETCH_1:
+    case BUILT_IN_ATOMIC_SUB_FETCH_2:
+    case BUILT_IN_ATOMIC_SUB_FETCH_4:
+    case BUILT_IN_ATOMIC_SUB_FETCH_8:
+    case BUILT_IN_ATOMIC_SUB_FETCH_16:
+
+    case BUILT_IN_ATOMIC_AND_FETCH_1:
+    case BUILT_IN_ATOMIC_AND_FETCH_2:
+    case BUILT_IN_ATOMIC_AND_FETCH_4:
+    case BUILT_IN_ATOMIC_AND_FETCH_8:
+    case BUILT_IN_ATOMIC_AND_FETCH_16:
+
+    case BUILT_IN_ATOMIC_NAND_FETCH_1:
+    case BUILT_IN_ATOMIC_NAND_FETCH_2:
+    case BUILT_IN_ATOMIC_NAND_FETCH_4:
+    case BUILT_IN_ATOMIC_NAND_FETCH_8:
+    case BUILT_IN_ATOMIC_NAND_FETCH_16:
+
+    case BUILT_IN_ATOMIC_XOR_FETCH_1:
+    case BUILT_IN_ATOMIC_XOR_FETCH_2:
+    case BUILT_IN_ATOMIC_XOR_FETCH_4:
+    case BUILT_IN_ATOMIC_XOR_FETCH_8:
+    case BUILT_IN_ATOMIC_XOR_FETCH_16:
+
+    case BUILT_IN_ATOMIC_OR_FETCH_1:
+    case BUILT_IN_ATOMIC_OR_FETCH_2:
+    case BUILT_IN_ATOMIC_OR_FETCH_4:
+    case BUILT_IN_ATOMIC_OR_FETCH_8:
+    case BUILT_IN_ATOMIC_OR_FETCH_16:
+
+    case BUILT_IN_ATOMIC_FETCH_ADD_1:
+    case BUILT_IN_ATOMIC_FETCH_ADD_2:
+    case BUILT_IN_ATOMIC_FETCH_ADD_4:
+    case BUILT_IN_ATOMIC_FETCH_ADD_8:
+    case BUILT_IN_ATOMIC_FETCH_ADD_16:
+
+    case BUILT_IN_ATOMIC_FETCH_SUB_1:
+    case BUILT_IN_ATOMIC_FETCH_SUB_2:
+    case BUILT_IN_ATOMIC_FETCH_SUB_4:
+    case BUILT_IN_ATOMIC_FETCH_SUB_8:
+    case BUILT_IN_ATOMIC_FETCH_SUB_16:
+
+    case BUILT_IN_ATOMIC_FETCH_AND_1:
+    case BUILT_IN_ATOMIC_FETCH_AND_2:
+    case BUILT_IN_ATOMIC_FETCH_AND_4:
+    case BUILT_IN_ATOMIC_FETCH_AND_8:
+    case BUILT_IN_ATOMIC_FETCH_AND_16:
+
+    case BUILT_IN_ATOMIC_FETCH_NAND_1:
+    case BUILT_IN_ATOMIC_FETCH_NAND_2:
+    case BUILT_IN_ATOMIC_FETCH_NAND_4:
+    case BUILT_IN_ATOMIC_FETCH_NAND_8:
+    case BUILT_IN_ATOMIC_FETCH_NAND_16:
+
+    case BUILT_IN_ATOMIC_FETCH_XOR_1:
+    case BUILT_IN_ATOMIC_FETCH_XOR_2:
+    case BUILT_IN_ATOMIC_FETCH_XOR_4:
+    case BUILT_IN_ATOMIC_FETCH_XOR_8:
+    case BUILT_IN_ATOMIC_FETCH_XOR_16:
+
+    case BUILT_IN_ATOMIC_FETCH_OR_1:
+    case BUILT_IN_ATOMIC_FETCH_OR_2:
+    case BUILT_IN_ATOMIC_FETCH_OR_4:
+    case BUILT_IN_ATOMIC_FETCH_OR_8:
+    case BUILT_IN_ATOMIC_FETCH_OR_16:
+      {
+	dest = gimple_call_arg (call, 0);
+	/* So DEST represents the address of a memory location.
+	   instrument_derefs wants the memory location, so lets
+	   dereference the address DEST before handing it to
+	   instrument_derefs.  */
+	if (TREE_CODE (dest) == ADDR_EXPR)
+	  dest = TREE_OPERAND (dest, 0);
+	else if (TREE_CODE (dest) == SSA_NAME)
+	  dest = build2 (MEM_REF, TREE_TYPE (TREE_TYPE (dest)),
+			 dest, build_int_cst (TREE_TYPE (dest), 0));
+	else
+	  gcc_unreachable ();
+
+	instrument_derefs (iter, dest, location, is_store);
+	return true;
+      }
+
+    default:
+      /* The other built-in memory access functions are not
+	 instrumented in this function because they either have no
+	 length parameter, or their length parameter is just a limit.  */
+      break;
+    }
+
+  if (len != NULL_TREE)
+    {
+      is_store = (dest != NULL_TREE);
+
+      if (source0 != NULL_TREE)
+	instrument_mem_region_access (source0, len, iter,
+				      location, is_store);
+      if (source1 != NULL_TREE)
+	instrument_mem_region_access (source1, len, iter,
+				      location, is_store);
+      else if (dest != NULL_TREE)
+	instrument_mem_region_access (dest, len, iter,
+				      location, is_store);
+      return true;
+    }
+  return false;
+}
+
+/* Instrument the assignment statement pointed to by the iterator
+   ITER if it is subject to instrumentation.  */
+
+static void
+instrument_assignment (gimple_stmt_iterator *iter)
+{
+  gimple s = gsi_stmt (*iter);
+
+  gcc_assert (gimple_assign_single_p (s));
+
+  instrument_derefs (iter, gimple_assign_lhs (s),
+		     gimple_location (s), true);
+  instrument_derefs (iter, gimple_assign_rhs1 (s),
+		     gimple_location (s), false);
+}
+
+/* Instrument the function call pointed to by the iterator ITER, if it
+   is subject to instrumentation.  At the moment, the only function
+   calls that are instrumented are some built-in functions that access
+   memory.  Look at maybe_instrument_builtin_call to learn more.  */
+
+static void
+maybe_instrument_call (gimple_stmt_iterator *iter)
+{
+  maybe_instrument_builtin_call (iter);
 }
 
 /* asan: this looks too complex. Can this be done simpler? */
@@ -686,13 +1296,12 @@ transform_statements (void)
       if (bb->index >= saved_last_basic_block) continue;
       for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
         {
-          gimple s = gsi_stmt (i);
-          if (!gimple_assign_single_p (s))
-	    continue;
-          instrument_derefs (&i, gimple_assign_lhs (s),
-                             gimple_location (s), true);
-          instrument_derefs (&i, gimple_assign_rhs1 (s),
-                             gimple_location (s), false);
+	  gimple s = gsi_stmt (i);
+
+	  if (gimple_assign_single_p (s))
+	    instrument_assignment (&i);
+	  else if (is_gimple_call (s))
+	    maybe_instrument_call (&i);
         }
     }
 }
-- 
		Dodji

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] [asan] Instrument built-in memory access function calls
  2012-10-25 21:46         ` Dodji Seketeli
@ 2012-10-25 22:13           ` Jakub Jelinek
  2012-10-25 22:39             ` Dodji Seketeli
  0 siblings, 1 reply; 25+ messages in thread
From: Jakub Jelinek @ 2012-10-25 22:13 UTC (permalink / raw)
  To: Dodji Seketeli; +Cc: GCC Patches, Diego Novillo, Xinliang David Li, Wei Mi

On Thu, Oct 25, 2012 at 11:32:58PM +0200, Dodji Seketeli wrote:
> +  tree source0 = NULL_TREE, source1 = NULL_TREE,
> +    dest = NULL_TREE, len = NULL_TREE;
> +  bool is_store = true;
...
nothing sets is_store here.
...
> +
> +	instrument_derefs (iter, dest, location, is_store);

BUILTIN_ATOMIC_LOAD* are just loads and so should clear is_store.

> +  if (len != NULL_TREE)
> +    {
> +      is_store = (dest != NULL_TREE);
> +
> +      if (source0 != NULL_TREE)
> +	instrument_mem_region_access (source0, len, iter,
> +				      location, is_store);
> +      if (source1 != NULL_TREE)
> +	instrument_mem_region_access (source1, len, iter,
> +				      location, is_store);
> +      else if (dest != NULL_TREE)
> +	instrument_mem_region_access (dest, len, iter,
> +				      location, is_store);

This looks wrong, source0/source1 are always reads, aren't they?
So you should pass false for the first two instrument_mem_region_access
calls and true to the last one.

	Jakub


* Re: [PATCH 3/3] [asan] Instrument built-in memory access function calls
  2012-10-25 22:13           ` Jakub Jelinek
@ 2012-10-25 22:39             ` Dodji Seketeli
  2012-10-25 22:57               ` Jakub Jelinek
  0 siblings, 1 reply; 25+ messages in thread
From: Dodji Seketeli @ 2012-10-25 22:39 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: GCC Patches, Diego Novillo, Xinliang David Li, Wei Mi

Jakub Jelinek <jakub@redhat.com> writes:

>> +	instrument_derefs (iter, dest, location, is_store);
>
> BUILTIN_ATOMIC_LOAD* are just loads and so should clear is_store.

Done.

>
>> +  if (len != NULL_TREE)
>> +    {
>> +      is_store = (dest != NULL_TREE);
>> +
>> +      if (source0 != NULL_TREE)
>> +	instrument_mem_region_access (source0, len, iter,
>> +				      location, is_store);
>> +      if (source1 != NULL_TREE)
>> +	instrument_mem_region_access (source1, len, iter,
>> +				      location, is_store);
>> +      else if (dest != NULL_TREE)
>> +	instrument_mem_region_access (dest, len, iter,
>> +				      location, is_store);
>
> This looks wrong, source0/source1 are always reads, aren't they?
> So you should pass false for the first two instrument_mem_region_access
> calls and true to the last one.

Done.


gcc/
	* asan.c (insert_if_then_before_iter, instrument_if_then_after_iter,
	instrument_mem_region_access, instrument_strlen_call,
	maybe_instrument_builtin_call, maybe_instrument_call): New static
	functions.
	(create_cond_insert_point): Renamed
	create_cond_insert_point_before_iter into this.  Add a new
	parameter to decide whether to insert the condition before or
	after the statement iterator.
	(build_check_stmt): Adjust for the new create_cond_insert_point.
	Add a new parameter to decide whether to add the instrumentation
	code before or after the statement iterator.
	(instrument_assignment): Factorize from ...
	(transform_statements): ... here.  Use maybe_instrument_call to
	instrument builtin function calls as well.
	(instrument_derefs): Adjust for the new parameter of
	build_check_stmt.  Fix detection of bit-field access.
---
 gcc/asan.c | 667 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 638 insertions(+), 29 deletions(-)

diff --git a/gcc/asan.c b/gcc/asan.c
index 39e77e6..dc5e749 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -398,9 +398,9 @@ asan_init_func (void)
 #define PROB_ALWAYS		(REG_BR_PROB_BASE)
 
 /* Split the current basic block and create a condition statement
-   insertion point right before the statement pointed to by ITER.
-   Return an iterator to the point at which the caller might safely
-   insert the condition statement.
+   insertion point right before or after the statement pointed to by
+   ITER.  Return an iterator to the point at which the caller might
+   safely insert the condition statement.
 
    THEN_BLOCK must be set to the address of an uninitialized instance
    of basic_block.  The function will then set *THEN_BLOCK to the
@@ -414,18 +414,21 @@ asan_init_func (void)
    statements starting from *ITER, and *THEN_BLOCK is a new empty
    block.
 
-   *ITER is adjusted to still point to the same statement it was
-   *pointing to initially.  */
+   *ITER is adjusted to always point to the first statement of the
+    basic block *FALLTHROUGH_BLOCK.  That statement is the same as
+    the one ITER was pointing to prior to calling this function,
+    if BEFORE_P is true; otherwise, it is its following statement.  */
 
 static gimple_stmt_iterator
-create_cond_insert_point_before_iter (gimple_stmt_iterator *iter,
-				      bool then_more_likely_p,
-				      basic_block *then_block,
-				      basic_block *fallthrough_block)
+create_cond_insert_point (gimple_stmt_iterator *iter,
+			  bool before_p,
+			  bool then_more_likely_p,
+			  basic_block *then_block,
+			  basic_block *fallthrough_block)
 {
   gimple_stmt_iterator gsi = *iter;
 
-  if (!gsi_end_p (gsi))
+  if (!gsi_end_p (gsi) && before_p)
     gsi_prev (&gsi);
 
   basic_block cur_bb = gsi_bb (*iter);
@@ -466,18 +469,93 @@ create_cond_insert_point_before_iter (gimple_stmt_iterator *iter,
   return gsi_last_bb (cond_bb);
 }
 
+/* Insert an if condition followed by a 'then block' right before the
+   statement pointed to by ITER.  The fallthrough block -- which is the
+   else block of the condition as well as the destination of the
+   outgoing edge of the 'then block' -- starts with the statement
+   pointed to by ITER.
+
+   COND is the condition of the if.  
+
+   If THEN_MORE_LIKELY_P is true, the probability of the edge to the
+   'then block' is higher than the probability of the edge to the
+   fallthrough block.
+
+   Upon completion of the function, *THEN_BB is set to the newly
+   inserted 'then block' and similarly, *FALLTHROUGH_BB is set to the
+   fallthrough block.
+
+   *ITER is adjusted to still point to the same statement it was
+   pointing to initially.  */
+
+static void
+insert_if_then_before_iter (gimple cond,
+			    gimple_stmt_iterator *iter,
+			    bool then_more_likely_p,
+			    basic_block *then_bb,
+			    basic_block *fallthrough_bb)
+{
+  gimple_stmt_iterator cond_insert_point =
+    create_cond_insert_point (iter,
+			      /*before_p=*/true,
+			      then_more_likely_p,
+			      then_bb,
+			      fallthrough_bb);
+  gsi_insert_after (&cond_insert_point, cond, GSI_NEW_STMT);
+}
+
+/* Insert an if condition followed by a 'then block' right after the
+   statement pointed to by ITER.  The fallthrough block -- which is
+   the else block of the condition as well as the destination of the
+   outgoing edge of the 'then block' -- starts with the statement
+   pointed to by ITER.
+
+   COND is the condition of the if.  
+
+   If THEN_MORE_LIKELY_P is true, the probability of the edge to the
+   'then block' is higher than the probability of the edge to the
+   fallthrough block.
+
+   Upon completion of the function, *THEN_BB is set to the newly
+   inserted 'then block' and similarly, *FALLTHROUGH_BB is set to the
+   fallthrough block.
+
+   *ITER is adjusted to point to the statement following the one
+   it was pointing to prior to calling this function.  */
+
+static void
+insert_if_then_after_iter (gimple cond,
+			   gimple_stmt_iterator *iter,
+			   bool then_more_likely_p,
+			   basic_block *then_bb,
+			   basic_block *fallthrough_bb)
+{
+  gimple_stmt_iterator cond_insert_point =
+    create_cond_insert_point (iter,
+			      /*before_p=*/false,
+			      then_more_likely_p,
+			      then_bb,
+			      fallthrough_bb);
+  gsi_insert_after (&cond_insert_point, cond, GSI_NEW_STMT);
+}
+
 /* Instrument the memory access instruction BASE.  Insert new
-   statements before ITER.
+   statements before or after ITER.
 
    Note that the memory access represented by BASE can be either an
    SSA_NAME, or a non-SSA expression.  LOCATION is the source code
    location.  IS_STORE is TRUE for a store, FALSE for a load.
-   SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
+   BEFORE_P is TRUE for inserting the instrumentation code before
+   ITER, FALSE for inserting it after ITER.  SIZE_IN_BYTES is one of
+   1, 2, 4, 8, 16.
+
+   If BEFORE_P is TRUE, *ITER is arranged to still point to the
+   statement it was pointing to prior to calling this function,
+   otherwise, it points to the statement logically following it.  */
 
 static void
-build_check_stmt (tree base, gimple_stmt_iterator *iter,
-                  location_t location, bool is_store,
-		  int size_in_bytes)
+build_check_stmt (location_t location, tree base, gimple_stmt_iterator *iter,
+		  bool before_p, bool is_store, int size_in_bytes)
 {
   gimple_stmt_iterator gsi;
   basic_block then_bb, else_bb;
@@ -491,10 +569,10 @@ build_check_stmt (tree base, gimple_stmt_iterator *iter,
 
   /* Get an iterator on the point where we can add the condition
      statement for the instrumentation.  */
-  gsi = create_cond_insert_point_before_iter (iter,
-					      /*then_more_likely_p=*/false,
-					      &then_bb,
-					      &else_bb);
+  gsi = create_cond_insert_point (iter, before_p,
+				  /*then_more_likely_p=*/false,
+				  &then_bb,
+				  &else_bb);
 
   base = unshare_expr (base);
 
@@ -626,7 +704,7 @@ build_check_stmt (tree base, gimple_stmt_iterator *iter,
 
 /* If T represents a memory access, add instrumentation code before ITER.
    LOCATION is source code location.
-   IS_STORE is either 1 (for a store) or 0 (for a load).  */
+   IS_STORE is either TRUE (for a store) or FALSE (for a load).  */
 
 static void
 instrument_derefs (gimple_stmt_iterator *iter, tree t,
@@ -661,11 +739,543 @@ instrument_derefs (gimple_stmt_iterator *iter, tree t,
   int volatilep = 0, unsignedp = 0;
   get_inner_reference (t, &bitsize, &bitpos, &offset,
 		       &mode, &unsignedp, &volatilep, false);
-  if (bitpos != 0 || bitsize != size_in_bytes * BITS_PER_UNIT)
+  if (bitpos % (size_in_bytes * BITS_PER_UNIT)
+      || bitsize != size_in_bytes * BITS_PER_UNIT)
     return;
 
   base = build_fold_addr_expr (t);
-  build_check_stmt (base, iter, location, is_store, size_in_bytes);
+  build_check_stmt (location, base, iter, /*before_p=*/true,
+		    is_store, size_in_bytes);
+}
+
+/* Instrument an access to a contiguous memory region that starts at
+   the address pointed to by BASE, over a length of LEN (expressed in
+   the sizeof (*BASE) bytes).  ITER points to the instruction before
+   which the instrumentation instructions must be inserted.  LOCATION
+   is the source location that the instrumentation instructions must
+   have.  If IS_STORE is true, then the memory access is a store;
+   otherwise, it's a load.  */
+
+static void
+instrument_mem_region_access (tree base, tree len,
+			      gimple_stmt_iterator *iter,
+			      location_t location, bool is_store)
+{
+  if (integer_zerop (len))
+    return;
+
+  gimple_stmt_iterator gsi = *iter;
+
+  basic_block fallthrough_bb = NULL, then_bb = NULL;
+  if (!is_gimple_constant (len))
+    {
+      /* So, the length of the memory area to asan-protect is
+	 non-constant.  Let's guard the generated instrumentation code
+	 like:
+
+	 if (len != 0)
+	   {
+	     //asan instrumentation code goes here.
+           }
+	   // fallthrough instructions, starting with *ITER.  */
+
+      gimple g = gimple_build_cond (NE_EXPR,
+				    len,
+				    build_int_cst (TREE_TYPE (len), 0),
+				    NULL_TREE, NULL_TREE);
+      gimple_set_location (g, location);
+      insert_if_then_before_iter (g, iter, /*then_more_likely_p=*/true,
+				  &then_bb, &fallthrough_bb);
+      /* Note that fallthrough_bb starts with the statement that was
+	 pointed to by ITER.  */
+
+      /* The 'then block' of the 'if (len != 0)' condition is where
+	 we'll generate the asan instrumentation code now.  */
+      gsi = gsi_start_bb (then_bb);
+    }
+
+  /* Instrument the beginning of the memory region to be accessed,
+     and arrange for the rest of the instrumentation code to be
+     inserted in the then block *after* the current gsi.  */
+  build_check_stmt (location, base, &gsi, /*before_p=*/true, is_store, 1);
+
+  if (then_bb)
+    /* We are in the case where the length of the region is not
+       constant; so instrumentation code is being generated in the
+       'then block' of the 'if (len != 0)' condition.  Let's arrange
+       for the subsequent instrumentation statements to go in the
+       'then block'.  */
+    gsi = gsi_last_bb (then_bb);
+  else
+    *iter = gsi;
+
+  /* We want to instrument the access at the end of the memory region,
+     which is at (base + len - 1).  */
+
+  /* offset = len - 1;  */
+  len = unshare_expr (len);
+  gimple offset =
+    gimple_build_assign_with_ops (TREE_CODE (len),
+				  make_ssa_name (TREE_TYPE (len), NULL),
+				  len, NULL);
+  gimple_set_location (offset, location);
+  gsi_insert_before (&gsi, offset, GSI_NEW_STMT);
+
+  offset =
+    gimple_build_assign_with_ops (MINUS_EXPR,
+				  make_ssa_name (size_type_node, NULL),
+				  gimple_assign_lhs (offset),
+				  build_int_cst (size_type_node, 1));
+  gimple_set_location (offset, location);
+  gsi_insert_after (&gsi, offset, GSI_NEW_STMT);
+
+  /* _1 = base;  */
+  base = unshare_expr (base);
+  gimple region_end =
+    gimple_build_assign_with_ops (TREE_CODE (base),
+				  make_ssa_name (TREE_TYPE (base), NULL),
+				  base, NULL);
+  gimple_set_location (region_end, location);
+  gsi_insert_after (&gsi, region_end, GSI_NEW_STMT);
+
+  /* _2 = _1 + offset;  */
+  region_end =
+    gimple_build_assign_with_ops (POINTER_PLUS_EXPR,
+				  make_ssa_name (TREE_TYPE (base), NULL),
+				  gimple_assign_lhs (region_end), 
+				  gimple_assign_lhs (offset));
+  gimple_set_location (region_end, location);
+  gsi_insert_after (&gsi, region_end, GSI_NEW_STMT);
+
+  /* instrument access at _2;  */
+  build_check_stmt (location, gimple_assign_lhs (region_end),
+		    &gsi, /*before_p=*/false, is_store, 1);
+}
+
+/* Instrument the strlen builtin call pointed to by ITER.
+
+   This function instruments the access to the first byte of the
+   argument, right before the call.  After the call it instruments the
+   access to the last byte of the argument; it uses the result of the
+   call to deduce the offset of that last byte.  */
+
+static void
+instrument_strlen_call (gimple_stmt_iterator *iter)
+{
+  gimple call = gsi_stmt (*iter);
+  gcc_assert (is_gimple_call (call));
+
+  tree callee = gimple_call_fndecl (call);
+  gcc_assert (is_builtin_fn (callee)
+	      && DECL_BUILT_IN_CLASS (callee) == BUILT_IN_NORMAL
+	      && DECL_FUNCTION_CODE (callee) == BUILT_IN_STRLEN);
+
+  location_t location = gimple_location (call);
+  tree str_arg = gimple_call_arg (call, 0);
+
+  /* Instrument the access to the first byte of str_arg.  i.e:
+
+     _1 = str_arg; instrument (_1); */
+  gimple str_arg_ssa =
+    gimple_build_assign_with_ops (NOP_EXPR,
+				  make_ssa_name (build_pointer_type
+						 (char_type_node), NULL),
+				  str_arg, NULL);
+  gimple_set_location (str_arg_ssa, location);
+  gimple_stmt_iterator gsi = *iter;
+  gsi_insert_before (&gsi, str_arg_ssa, GSI_NEW_STMT);
+  build_check_stmt (location, gimple_assign_lhs (str_arg_ssa), &gsi,
+		    /*before_p=*/false, /*is_store=*/false, 1);
+
+  /* If we initially had an instruction like:
+
+	 int n = strlen (str)
+
+     we now want to instrument the access to str[n-1], after the
+     instruction above, if n is greater than 0.  */
+
+  /* So let's build the access to str[n-1], in a if (n > 0) {} block.  */
+
+  tree len = gimple_call_lhs (call);
+
+  /*  if (len > 0); */
+  gimple cond = gimple_build_cond (GT_EXPR, len,
+				   build_int_cst (size_type_node, 0),
+				   NULL, NULL);
+  basic_block then_bb, fallthrough_bb;
+  insert_if_then_after_iter (cond, &gsi, /*then_more_likely_p=*/true,
+			     &then_bb, &fallthrough_bb);
+  /*  fallthrough_bb starts with the statement that gsi pointed to
+      initially; that is, at the statement that comes after the call
+      to strlen.  The instrumentation code is going to be emitted in
+      then_bb.  */
+
+  gsi = gsi_start_bb (then_bb);
+
+  /* _3 = _2 - 1;  */
+  gimple stmt =
+    gimple_build_assign_with_ops (MINUS_EXPR,
+				  make_ssa_name (size_type_node, NULL),
+				  len,
+				  build_int_cst (size_type_node, 1));
+  gimple_set_location (stmt, location);
+  gsi_insert_after (&gsi, stmt, GSI_NEW_STMT);
+
+  /* really build the access to str[n-1] and instrument it now.  That
+     is, access through the pointer_plus expr: (_1 + _3).  */
+  stmt =
+    gimple_build_assign_with_ops (POINTER_PLUS_EXPR,
+				  make_ssa_name (TREE_TYPE (str_arg),
+						 NULL),
+				  gimple_assign_lhs (str_arg_ssa),
+				  gimple_assign_lhs (stmt));
+  gimple_set_location (stmt, location);
+  gsi_insert_after (&gsi, stmt, GSI_NEW_STMT);
+
+  build_check_stmt (location, gimple_assign_lhs (stmt), &gsi,
+		    /*before_p=*/false, /*is_store=*/false, 1);
+
+  /* Ensure that iter points to the statement logically following the
+     one it was initially pointing to.  */
+  *iter = gsi_start_bb (fallthrough_bb);
+}
+
+/* If the statement pointed to by the iterator ITER is a call to a
+   builtin memory access function, instrument it and return true.
+   Otherwise, return false.  */
+
+static bool
+maybe_instrument_builtin_call (gimple_stmt_iterator *iter)
+{
+  gimple call = gsi_stmt (*iter);
+  location_t location = gimple_location (call);
+
+  if (!is_gimple_call (call))
+    return false;
+
+  tree callee = gimple_call_fndecl (call);
+
+  if (!is_builtin_fn (callee)
+      || DECL_BUILT_IN_CLASS (callee) != BUILT_IN_NORMAL)
+    return false;
+
+  tree source0 = NULL_TREE, source1 = NULL_TREE,
+    dest = NULL_TREE, len = NULL_TREE;
+  bool is_store = true;
+
+  switch (DECL_FUNCTION_CODE (callee))
+    {
+      /* (s, s, n) style memops.  */
+    case BUILT_IN_BCMP:
+    case BUILT_IN_MEMCMP:
+      len = gimple_call_arg (call, 2);
+      source0 = gimple_call_arg (call, 0);
+      source1 = gimple_call_arg (call, 1);
+      break;
+
+      /* (src, dest, n) style memops.  */
+    case BUILT_IN_BCOPY:
+      len = gimple_call_arg (call, 2);
+      source0 = gimple_call_arg (call, 0);
+      dest = gimple_call_arg (call, 1);
+      break;
+
+      /* (dest, src, n) style memops.  */
+    case BUILT_IN_MEMCPY:
+    case BUILT_IN_MEMCPY_CHK:
+    case BUILT_IN_MEMMOVE:
+    case BUILT_IN_MEMMOVE_CHK:
+    case BUILT_IN_MEMPCPY:
+    case BUILT_IN_MEMPCPY_CHK:
+      dest = gimple_call_arg (call, 0);
+      source0 = gimple_call_arg (call, 1);
+      len = gimple_call_arg (call, 2);
+      break;
+
+      /* (dest, n) style memops.  */
+    case BUILT_IN_BZERO:
+      dest = gimple_call_arg (call, 0);
+      len = gimple_call_arg (call, 1);
+      break;
+
+      /* (dest, x, n) style memops.  */
+    case BUILT_IN_MEMSET:
+    case BUILT_IN_MEMSET_CHK:
+      dest = gimple_call_arg (call, 0);
+      len = gimple_call_arg (call, 2);
+      break;
+
+    case BUILT_IN_STRLEN:
+      instrument_strlen_call (iter);
+      return true;
+
+    /* And now the __atomic* and __sync builtins.
+       These are handled differently from the classical memory
+       access builtins above.  */
+
+    case BUILT_IN_ATOMIC_LOAD:
+    case BUILT_IN_ATOMIC_LOAD_1:
+    case BUILT_IN_ATOMIC_LOAD_2:
+    case BUILT_IN_ATOMIC_LOAD_4:
+    case BUILT_IN_ATOMIC_LOAD_8:
+    case BUILT_IN_ATOMIC_LOAD_16:
+      is_store = false;
+      /* fall through.  */
+
+    case BUILT_IN_SYNC_FETCH_AND_ADD_1:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_2:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_4:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_8:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_SUB_1:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_2:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_4:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_8:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_OR_1:
+    case BUILT_IN_SYNC_FETCH_AND_OR_2:
+    case BUILT_IN_SYNC_FETCH_AND_OR_4:
+    case BUILT_IN_SYNC_FETCH_AND_OR_8:
+    case BUILT_IN_SYNC_FETCH_AND_OR_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_AND_1:
+    case BUILT_IN_SYNC_FETCH_AND_AND_2:
+    case BUILT_IN_SYNC_FETCH_AND_AND_4:
+    case BUILT_IN_SYNC_FETCH_AND_AND_8:
+    case BUILT_IN_SYNC_FETCH_AND_AND_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_XOR_1:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_2:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_4:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_8:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_NAND_1:
+    case BUILT_IN_SYNC_FETCH_AND_NAND_2:
+    case BUILT_IN_SYNC_FETCH_AND_NAND_4:
+    case BUILT_IN_SYNC_FETCH_AND_NAND_8:
+
+    case BUILT_IN_SYNC_ADD_AND_FETCH_1:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_2:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_4:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_8:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_SUB_AND_FETCH_1:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_2:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_4:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_8:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_OR_AND_FETCH_1:
+    case BUILT_IN_SYNC_OR_AND_FETCH_2:
+    case BUILT_IN_SYNC_OR_AND_FETCH_4:
+    case BUILT_IN_SYNC_OR_AND_FETCH_8:
+    case BUILT_IN_SYNC_OR_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_AND_AND_FETCH_1:
+    case BUILT_IN_SYNC_AND_AND_FETCH_2:
+    case BUILT_IN_SYNC_AND_AND_FETCH_4:
+    case BUILT_IN_SYNC_AND_AND_FETCH_8:
+    case BUILT_IN_SYNC_AND_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_XOR_AND_FETCH_1:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_2:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_4:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_8:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_NAND_AND_FETCH_1:
+    case BUILT_IN_SYNC_NAND_AND_FETCH_2:
+    case BUILT_IN_SYNC_NAND_AND_FETCH_4:
+    case BUILT_IN_SYNC_NAND_AND_FETCH_8:
+
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_1:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_2:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_4:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_8:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_16:
+
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_1:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_2:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_4:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_8:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_16:
+
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_1:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_2:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_4:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_8:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_16:
+
+    case BUILT_IN_SYNC_LOCK_RELEASE_1:
+    case BUILT_IN_SYNC_LOCK_RELEASE_2:
+    case BUILT_IN_SYNC_LOCK_RELEASE_4:
+    case BUILT_IN_SYNC_LOCK_RELEASE_8:
+    case BUILT_IN_SYNC_LOCK_RELEASE_16:
+
+    case BUILT_IN_ATOMIC_TEST_AND_SET:
+    case BUILT_IN_ATOMIC_CLEAR:
+    case BUILT_IN_ATOMIC_EXCHANGE:
+    case BUILT_IN_ATOMIC_EXCHANGE_1:
+    case BUILT_IN_ATOMIC_EXCHANGE_2:
+    case BUILT_IN_ATOMIC_EXCHANGE_4:
+    case BUILT_IN_ATOMIC_EXCHANGE_8:
+    case BUILT_IN_ATOMIC_EXCHANGE_16:
+
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_1:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_2:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_4:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_8:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_16:
+
+    case BUILT_IN_ATOMIC_STORE:
+    case BUILT_IN_ATOMIC_STORE_1:
+    case BUILT_IN_ATOMIC_STORE_2:
+    case BUILT_IN_ATOMIC_STORE_4:
+    case BUILT_IN_ATOMIC_STORE_8:
+    case BUILT_IN_ATOMIC_STORE_16:
+
+    case BUILT_IN_ATOMIC_ADD_FETCH_1:
+    case BUILT_IN_ATOMIC_ADD_FETCH_2:
+    case BUILT_IN_ATOMIC_ADD_FETCH_4:
+    case BUILT_IN_ATOMIC_ADD_FETCH_8:
+    case BUILT_IN_ATOMIC_ADD_FETCH_16:
+
+    case BUILT_IN_ATOMIC_SUB_FETCH_1:
+    case BUILT_IN_ATOMIC_SUB_FETCH_2:
+    case BUILT_IN_ATOMIC_SUB_FETCH_4:
+    case BUILT_IN_ATOMIC_SUB_FETCH_8:
+    case BUILT_IN_ATOMIC_SUB_FETCH_16:
+
+    case BUILT_IN_ATOMIC_AND_FETCH_1:
+    case BUILT_IN_ATOMIC_AND_FETCH_2:
+    case BUILT_IN_ATOMIC_AND_FETCH_4:
+    case BUILT_IN_ATOMIC_AND_FETCH_8:
+    case BUILT_IN_ATOMIC_AND_FETCH_16:
+
+    case BUILT_IN_ATOMIC_NAND_FETCH_1:
+    case BUILT_IN_ATOMIC_NAND_FETCH_2:
+    case BUILT_IN_ATOMIC_NAND_FETCH_4:
+    case BUILT_IN_ATOMIC_NAND_FETCH_8:
+    case BUILT_IN_ATOMIC_NAND_FETCH_16:
+
+    case BUILT_IN_ATOMIC_XOR_FETCH_1:
+    case BUILT_IN_ATOMIC_XOR_FETCH_2:
+    case BUILT_IN_ATOMIC_XOR_FETCH_4:
+    case BUILT_IN_ATOMIC_XOR_FETCH_8:
+    case BUILT_IN_ATOMIC_XOR_FETCH_16:
+
+    case BUILT_IN_ATOMIC_OR_FETCH_1:
+    case BUILT_IN_ATOMIC_OR_FETCH_2:
+    case BUILT_IN_ATOMIC_OR_FETCH_4:
+    case BUILT_IN_ATOMIC_OR_FETCH_8:
+    case BUILT_IN_ATOMIC_OR_FETCH_16:
+
+    case BUILT_IN_ATOMIC_FETCH_ADD_1:
+    case BUILT_IN_ATOMIC_FETCH_ADD_2:
+    case BUILT_IN_ATOMIC_FETCH_ADD_4:
+    case BUILT_IN_ATOMIC_FETCH_ADD_8:
+    case BUILT_IN_ATOMIC_FETCH_ADD_16:
+
+    case BUILT_IN_ATOMIC_FETCH_SUB_1:
+    case BUILT_IN_ATOMIC_FETCH_SUB_2:
+    case BUILT_IN_ATOMIC_FETCH_SUB_4:
+    case BUILT_IN_ATOMIC_FETCH_SUB_8:
+    case BUILT_IN_ATOMIC_FETCH_SUB_16:
+
+    case BUILT_IN_ATOMIC_FETCH_AND_1:
+    case BUILT_IN_ATOMIC_FETCH_AND_2:
+    case BUILT_IN_ATOMIC_FETCH_AND_4:
+    case BUILT_IN_ATOMIC_FETCH_AND_8:
+    case BUILT_IN_ATOMIC_FETCH_AND_16:
+
+    case BUILT_IN_ATOMIC_FETCH_NAND_1:
+    case BUILT_IN_ATOMIC_FETCH_NAND_2:
+    case BUILT_IN_ATOMIC_FETCH_NAND_4:
+    case BUILT_IN_ATOMIC_FETCH_NAND_8:
+    case BUILT_IN_ATOMIC_FETCH_NAND_16:
+
+    case BUILT_IN_ATOMIC_FETCH_XOR_1:
+    case BUILT_IN_ATOMIC_FETCH_XOR_2:
+    case BUILT_IN_ATOMIC_FETCH_XOR_4:
+    case BUILT_IN_ATOMIC_FETCH_XOR_8:
+    case BUILT_IN_ATOMIC_FETCH_XOR_16:
+
+    case BUILT_IN_ATOMIC_FETCH_OR_1:
+    case BUILT_IN_ATOMIC_FETCH_OR_2:
+    case BUILT_IN_ATOMIC_FETCH_OR_4:
+    case BUILT_IN_ATOMIC_FETCH_OR_8:
+    case BUILT_IN_ATOMIC_FETCH_OR_16:
+      {
+	dest = gimple_call_arg (call, 0);
+	/* So DEST represents the address of a memory location.
+	   instrument_derefs wants the memory location, so let's
+	   dereference the address DEST before handing it to
+	   instrument_derefs.  */
+	if (TREE_CODE (dest) == ADDR_EXPR)
+	  dest = TREE_OPERAND (dest, 0);
+	else if (TREE_CODE (dest) == SSA_NAME)
+	  dest = build2 (MEM_REF, TREE_TYPE (TREE_TYPE (dest)),
+			 dest, build_int_cst (TREE_TYPE (dest), 0));
+	else
+	  gcc_unreachable ();
+
+	instrument_derefs (iter, dest, location, is_store);
+	return true;
+      }
+
+    default:
+      /* The other builtin memory access functions are not instrumented
+	 here because they either have no length parameter, or their
+	 length parameter is just a limit.  */
+      break;
+    }
+
+  if (len != NULL_TREE)
+    {
+      if (source0 != NULL_TREE)
+	instrument_mem_region_access (source0, len, iter,
+				      location, /*is_store=*/false);
+      if (source1 != NULL_TREE)
+	instrument_mem_region_access (source1, len, iter,
+				      location, /*is_store=*/false);
+      else if (dest != NULL_TREE)
+	instrument_mem_region_access (dest, len, iter,
+				      location, /*is_store=*/true);
+      return true;
+    }
+  return false;
+}
+
+/* Instrument the assignment statement pointed to by ITER if it is
+   subject to instrumentation.  */
+
+static void
+instrument_assignment (gimple_stmt_iterator *iter)
+{
+  gimple s = gsi_stmt (*iter);
+
+  gcc_assert (gimple_assign_single_p (s));
+
+  instrument_derefs (iter, gimple_assign_lhs (s),
+		     gimple_location (s), true);
+  instrument_derefs (iter, gimple_assign_rhs1 (s),
+		     gimple_location (s), false);
+}
+
+/* Instrument the function call pointed to by the iterator ITER, if it
+   is subject to instrumentation.  At the moment, the only function
+   calls that are instrumented are some built-in functions that access
+   memory.  Look at maybe_instrument_builtin_call to learn more.  */
+
+static void
+maybe_instrument_call (gimple_stmt_iterator *iter)
+{
+  maybe_instrument_builtin_call (iter);
 }
 
 /* asan: this looks too complex. Can this be done simpler? */
@@ -686,13 +1296,12 @@ transform_statements (void)
       if (bb->index >= saved_last_basic_block) continue;
       for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
         {
-          gimple s = gsi_stmt (i);
-          if (!gimple_assign_single_p (s))
-	    continue;
-          instrument_derefs (&i, gimple_assign_lhs (s),
-                             gimple_location (s), true);
-          instrument_derefs (&i, gimple_assign_rhs1 (s),
-                             gimple_location (s), false);
+	  gimple s = gsi_stmt (i);
+
+	  if (gimple_assign_single_p (s))
+	    instrument_assignment (&i);
+	  else if (is_gimple_call (s))
+	    maybe_instrument_call (&i);
         }
     }
 }
-- 
1.7.11.7


-- 
		Dodji


* Re: [PATCH 3/3] [asan] Instrument built-in memory access function calls
  2012-10-25 22:39             ` Dodji Seketeli
@ 2012-10-25 22:57               ` Jakub Jelinek
  2012-10-26 10:55                 ` Dodji Seketeli
  0 siblings, 1 reply; 25+ messages in thread
From: Jakub Jelinek @ 2012-10-25 22:57 UTC (permalink / raw)
  To: Dodji Seketeli; +Cc: GCC Patches, Diego Novillo, Xinliang David Li, Wei Mi

On Fri, Oct 26, 2012 at 12:23:41AM +0200, Dodji Seketeli wrote:
> +/* Instrument the strlen builtin call pointed to by ITER.
> +
> +   This function instruments the access to the first byte of the
> +   argument, right before the call.  After the call it instruments the
> +   access to the last byte of the argument; it uses the result of the
> +   call to deduce the offset of that last byte.  */
> +
> +static void
> +instrument_strlen_call (gimple_stmt_iterator *iter)
> +{
> +  gimple call = gsi_stmt (*iter);
> +  gcc_assert (is_gimple_call (call));
> +
> +  tree callee = gimple_call_fndecl (call);
> +  gcc_assert (is_builtin_fn (callee)
> +	      && DECL_BUILT_IN_CLASS (callee) == BUILT_IN_NORMAL
> +	      && DECL_FUNCTION_CODE (callee) == BUILT_IN_STRLEN);
> +
> +  location_t location = gimple_location (call);

I'd call the var just loc, that is far more commonly used and shorter.

> +  /* If we initially had an instruction like:
> +
> +	 int n = strlen (str)
> +
> +     we now want to instrument the access to str[n-1], after the
> +     instruction above, if n is greater than 0.  */
> +
> +  /* So let's build the access to str[n-1], in a if (n > 0) {} block.  */
> +
> +  tree len = gimple_call_lhs (call);

I'd add
  if (len == NULL_TREE)
    return;
just in case.  While strlen is a pure function, not sure if some pass isn't
just clearing the lhs as unneeded without removing it immediately.

> +  /*  if (len > 0); */
> +  gimple cond = gimple_build_cond (GT_EXPR, len,
> +				   build_int_cst (size_type_node, 0),
> +				   NULL, NULL);

But more importantly, you shouldn't guard it with len > 0, and shouldn't
test str[n-1], but str[n] (also adjust the comment above).
Because strlen must read even the '\0' byte at str[n].


> +    case BUILT_IN_ATOMIC_FETCH_OR_16:
> +      {
> +	dest = gimple_call_arg (call, 0);
> +	/* So DEST represents the address of a memory location.
> +	   instrument_derefs wants the memory location, so lets
> +	   dereference the address DEST before handing it to
> +	   instrument_derefs.  */
> +	if (TREE_CODE (dest) == ADDR_EXPR)
> +	  dest = TREE_OPERAND (dest, 0);
> +	else if (TREE_CODE (dest) == SSA_NAME)
> +	  dest = build2 (MEM_REF, TREE_TYPE (TREE_TYPE (dest)),
> +			 dest, build_int_cst (TREE_TYPE (dest), 0));
> +	else
> +	  gcc_unreachable ();
> +
> +	instrument_derefs (iter, dest, location, is_store);
> +	return true;
> +      }

{} around this is unneeded I think.

Otherwise looks good.

	Jakub

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] [asan] Instrument built-in memory access function calls
  2012-10-25 22:57               ` Jakub Jelinek
@ 2012-10-26 10:55                 ` Dodji Seketeli
  2012-10-26 11:33                   ` Jakub Jelinek
  0 siblings, 1 reply; 25+ messages in thread
From: Dodji Seketeli @ 2012-10-26 10:55 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: GCC Patches, Diego Novillo, Xinliang David Li, Wei Mi

Jakub Jelinek <jakub@redhat.com> writes:

>> +  location_t location = gimple_location (call);
>
> I'd call the var just loc, that is far more commonly used and shorter.

Done.

>
>> +  /* If we initially had an instruction like:
>> +
>> +	 int n = strlen (str)
>> +
>> +     we now want to instrument the access to str[n-1], after the
>> +     instruction above, if n is greater than 0.  */
>> +
>> +  /* So let's build the access to str[n-1], in a if (n > 0) {} block.  */
>> +
>> +  tree len = gimple_call_lhs (call);
>
> I'd add
>   if (len == NULL_TREE)
>     return;
> just in case.  While strlen is a pure function, not sure if some pass isn't
> just clearing the lhs as unneeded without removing it immediately.
>

Done.

>> +  /*  if (len > 0); */
>> +  gimple cond = gimple_build_cond (GT_EXPR, len,
>> +				   build_int_cst (size_type_node, 0),
>> +				   NULL, NULL);
>
> But more importantly, you shouldn't guard it with len > 0, and shouldn't
> test str[n-1], but str[n] (also adjust the comment above).
> Because strlen must read even the '\0' byte at str[n].

Done.

Below is the updated patch.  Thanks.

gcc/
	* asan.c (insert_if_then_before_iter, instrument_mem_region_access,
	instrument_strlen_call, maybe_instrument_builtin_call,
	maybe_instrument_call): New static functions.
	(create_cond_insert_point): Renamed
	create_cond_insert_point_before_iter into this.  Add a new
	parameter to decide whether to insert the condition before or
	after the statement iterator.
	(build_check_stmt): Adjust for the new create_cond_insert_point.
	Add a new parameter to decide whether to add the instrumentation
	code before or after the statement iterator.
	(instrument_assignment): Factorize from ...
	(transform_statements): ... here.  Use maybe_instrument_call to
	instrument builtin function calls as well.
	(instrument_derefs): Adjust for the new parameter of
	build_check_stmt.  Fix detection of bit-field access.
---
 gcc/asan.c | 612 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 583 insertions(+), 29 deletions(-)

diff --git a/gcc/asan.c b/gcc/asan.c
index 39e77e6..155e84b 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -398,9 +398,9 @@ asan_init_func (void)
 #define PROB_ALWAYS		(REG_BR_PROB_BASE)
 
 /* Split the current basic block and create a condition statement
-   insertion point right before the statement pointed to by ITER.
-   Return an iterator to the point at which the caller might safely
-   insert the condition statement.
+   insertion point right before or after the statement pointed to by
+   ITER.  Return an iterator to the point at which the caller might
+   safely insert the condition statement.
 
    THEN_BLOCK must be set to the address of an uninitialized instance
    of basic_block.  The function will then set *THEN_BLOCK to the
@@ -414,18 +414,21 @@ asan_init_func (void)
    statements starting from *ITER, and *THEN_BLOCK is a new empty
    block.
 
-   *ITER is adjusted to still point to the same statement it was
-   *pointing to initially.  */
+   *ITER is adjusted to always point to the first statement of the
+   basic block *FALLTHROUGH_BLOCK.  That statement is the same as what
+   ITER was pointing to prior to calling this function, if BEFORE_P is
+   true; otherwise, it is its following statement.  */
 
 static gimple_stmt_iterator
-create_cond_insert_point_before_iter (gimple_stmt_iterator *iter,
-				      bool then_more_likely_p,
-				      basic_block *then_block,
-				      basic_block *fallthrough_block)
+create_cond_insert_point (gimple_stmt_iterator *iter,
+			  bool before_p,
+			  bool then_more_likely_p,
+			  basic_block *then_block,
+			  basic_block *fallthrough_block)
 {
   gimple_stmt_iterator gsi = *iter;
 
-  if (!gsi_end_p (gsi))
+  if (!gsi_end_p (gsi) && before_p)
     gsi_prev (&gsi);
 
   basic_block cur_bb = gsi_bb (*iter);
@@ -466,18 +469,58 @@ create_cond_insert_point_before_iter (gimple_stmt_iterator *iter,
   return gsi_last_bb (cond_bb);
 }
 
+/* Insert an if condition followed by a 'then block' right before the
+   statement pointed to by ITER.  The fallthrough block -- which is the
+   else block of the condition as well as the destination of the
+   outcoming edge of the 'then block' -- starts with the statement
+   pointed to by ITER.
+
+   COND is the condition of the if.  
+
+   If THEN_MORE_LIKELY_P is true, the probability of the edge to the
+   'then block' is higher than the probability of the edge to the
+   fallthrough block.
+
+   Upon completion of the function, *THEN_BB is set to the newly
+   inserted 'then block' and similarly, *FALLTHROUGH_BB is set to the
+   fallthrough block.
+
+   *ITER is adjusted to still point to the same statement it was
+   pointing to initially.  */
+
+static void
+insert_if_then_before_iter (gimple cond,
+			    gimple_stmt_iterator *iter,
+			    bool then_more_likely_p,
+			    basic_block *then_bb,
+			    basic_block *fallthrough_bb)
+{
+  gimple_stmt_iterator cond_insert_point =
+    create_cond_insert_point (iter,
+			      /*before_p=*/true,
+			      then_more_likely_p,
+			      then_bb,
+			      fallthrough_bb);
+  gsi_insert_after (&cond_insert_point, cond, GSI_NEW_STMT);
+}
+
 /* Instrument the memory access instruction BASE.  Insert new
-   statements before ITER.
+   statements before or after ITER.
 
    Note that the memory access represented by BASE can be either an
    SSA_NAME, or a non-SSA expression.  LOCATION is the source code
    location.  IS_STORE is TRUE for a store, FALSE for a load.
-   SIZE_IN_BYTES is one of 1, 2, 4, 8, 16.  */
+   BEFORE_P is TRUE for inserting the instrumentation code before
+   ITER, FALSE for inserting it after ITER.  SIZE_IN_BYTES is one of
+   1, 2, 4, 8, 16.
+
+   If BEFORE_P is TRUE, *ITER is arranged to still point to the
+   statement it was pointing to prior to calling this function,
+   otherwise, it points to the statement logically following it.  */
 
 static void
-build_check_stmt (tree base, gimple_stmt_iterator *iter,
-                  location_t location, bool is_store,
-		  int size_in_bytes)
+build_check_stmt (location_t location, tree base, gimple_stmt_iterator *iter,
+		  bool before_p, bool is_store, int size_in_bytes)
 {
   gimple_stmt_iterator gsi;
   basic_block then_bb, else_bb;
@@ -491,10 +534,10 @@ build_check_stmt (tree base, gimple_stmt_iterator *iter,
 
   /* Get an iterator on the point where we can add the condition
      statement for the instrumentation.  */
-  gsi = create_cond_insert_point_before_iter (iter,
-					      /*then_more_likely_p=*/false,
-					      &then_bb,
-					      &else_bb);
+  gsi = create_cond_insert_point (iter, before_p,
+				  /*then_more_likely_p=*/false,
+				  &then_bb,
+				  &else_bb);
 
   base = unshare_expr (base);
 
@@ -626,7 +669,7 @@ build_check_stmt (tree base, gimple_stmt_iterator *iter,
 
 /* If T represents a memory access, add instrumentation code before ITER.
    LOCATION is source code location.
-   IS_STORE is either 1 (for a store) or 0 (for a load).  */
+   IS_STORE is either TRUE (for a store) or FALSE (for a load).  */
 
 static void
 instrument_derefs (gimple_stmt_iterator *iter, tree t,
@@ -661,11 +704,523 @@ instrument_derefs (gimple_stmt_iterator *iter, tree t,
   int volatilep = 0, unsignedp = 0;
   get_inner_reference (t, &bitsize, &bitpos, &offset,
 		       &mode, &unsignedp, &volatilep, false);
-  if (bitpos != 0 || bitsize != size_in_bytes * BITS_PER_UNIT)
+  if (bitpos % (size_in_bytes * BITS_PER_UNIT)
+      || bitsize != size_in_bytes * BITS_PER_UNIT)
     return;
 
   base = build_fold_addr_expr (t);
-  build_check_stmt (base, iter, location, is_store, size_in_bytes);
+  build_check_stmt (location, base, iter, /*before_p=*/true,
+		    is_store, size_in_bytes);
+}
+
+/* Instrument an access to a contiguous memory region that starts at
+   the address pointed to by BASE, over a length of LEN (expressed in
+   the sizeof (*BASE) bytes).  ITER points to the instruction before
+   which the instrumentation instructions must be inserted.  LOCATION
+   is the source location that the instrumentation instructions must
+   have.  If IS_STORE is true, then the memory access is a store;
+   otherwise, it's a load.  */
+
+static void
+instrument_mem_region_access (tree base, tree len,
+			      gimple_stmt_iterator *iter,
+			      location_t location, bool is_store)
+{
+  if (integer_zerop (len))
+    return;
+
+  gimple_stmt_iterator gsi = *iter;
+
+  basic_block fallthrough_bb = NULL, then_bb = NULL;
+  if (!is_gimple_constant (len))
+    {
+      /* So, the length of the memory area to asan-protect is
+	 non-constant.  Let's guard the generated instrumentation code
+	 like:
+
+	 if (len != 0)
+	   {
+	     //asan instrumentation code goes here.
+           }
+	   // fallthrough instructions, starting with *ITER.  */
+
+      gimple g = gimple_build_cond (NE_EXPR,
+				    len,
+				    build_int_cst (TREE_TYPE (len), 0),
+				    NULL_TREE, NULL_TREE);
+      gimple_set_location (g, location);
+      insert_if_then_before_iter (g, iter, /*then_more_likely_p=*/true,
+				  &then_bb, &fallthrough_bb);
+      /* Note that fallthrough_bb starts with the statement that was
+	 pointed to by ITER.  */
+
+      /* The 'then block' of the 'if (len != 0)' condition is where
+	 we'll generate the asan instrumentation code now.  */
+      gsi = gsi_start_bb (then_bb);
+    }
+
+  /* Instrument the beginning of the memory region to be accessed,
+     and arrange for the rest of the instrumentation code to be
+     inserted in the then block *after* the current gsi.  */
+  build_check_stmt (location, base, &gsi, /*before_p=*/true, is_store, 1);
+
+  if (then_bb)
+    /* We are in the case where the length of the region is not
+       constant; so instrumentation code is being generated in the
+       'then block' of the 'if (len != 0)' condition.  Let's arrange
+       for the subsequent instrumentation statements to go in the
+       'then block'.  */
+    gsi = gsi_last_bb (then_bb);
+  else
+    *iter = gsi;
+
+  /* We want to instrument the access at the end of the memory region,
+     which is at (base + len - 1).  */
+
+  /* offset = len - 1;  */
+  len = unshare_expr (len);
+  gimple offset =
+    gimple_build_assign_with_ops (TREE_CODE (len),
+				  make_ssa_name (TREE_TYPE (len), NULL),
+				  len, NULL);
+  gimple_set_location (offset, location);
+  gsi_insert_before (&gsi, offset, GSI_NEW_STMT);
+
+  offset =
+    gimple_build_assign_with_ops (MINUS_EXPR,
+				  make_ssa_name (size_type_node, NULL),
+				  gimple_assign_lhs (offset),
+				  build_int_cst (size_type_node, 1));
+  gimple_set_location (offset, location);
+  gsi_insert_after (&gsi, offset, GSI_NEW_STMT);
+
+  /* _1 = base;  */
+  base = unshare_expr (base);
+  gimple region_end =
+    gimple_build_assign_with_ops (TREE_CODE (base),
+				  make_ssa_name (TREE_TYPE (base), NULL),
+				  base, NULL);
+  gimple_set_location (region_end, location);
+  gsi_insert_after (&gsi, region_end, GSI_NEW_STMT);
+
+  /* _2 = _1 + offset;  */
+  region_end =
+    gimple_build_assign_with_ops (POINTER_PLUS_EXPR,
+				  make_ssa_name (TREE_TYPE (base), NULL),
+				  gimple_assign_lhs (region_end), 
+				  gimple_assign_lhs (offset));
+  gimple_set_location (region_end, location);
+  gsi_insert_after (&gsi, region_end, GSI_NEW_STMT);
+
+  /* instrument access at _2;  */
+  build_check_stmt (location, gimple_assign_lhs (region_end),
+		    &gsi, /*before_p=*/false, is_store, 1);
+}
+
+/* Instrument the strlen builtin call pointed to by ITER.
+
+   This function instruments the access to the first byte of the
+   argument, right before the call.  After the call it instruments the
+   access to the terminating '\0' byte of the argument; it uses the
+   result of the call to deduce the offset of that byte.  */
+
+static void
+instrument_strlen_call (gimple_stmt_iterator *iter)
+{
+  gimple call = gsi_stmt (*iter);
+  gcc_assert (is_gimple_call (call));
+
+  tree callee = gimple_call_fndecl (call);
+  gcc_assert (is_builtin_fn (callee)
+	      && DECL_BUILT_IN_CLASS (callee) == BUILT_IN_NORMAL
+	      && DECL_FUNCTION_CODE (callee) == BUILT_IN_STRLEN);
+
+  tree len = gimple_call_lhs (call);
+  if (len == NULL_TREE)
+    /* Some passes might clear the return value of the strlen call;
+       bail out in that case.  */
+    return;
+  gcc_assert (INTEGRAL_TYPE_P (TREE_TYPE (len)));
+
+  location_t loc = gimple_location (call);
+  tree str_arg = gimple_call_arg (call, 0);
+
+  /* Instrument the access to the first byte of str_arg, i.e.:
+
+     _1 = str_arg; instrument (_1); */
+  gimple str_arg_ssa =
+    gimple_build_assign_with_ops (NOP_EXPR,
+				  make_ssa_name (build_pointer_type
+						 (char_type_node), NULL),
+				  str_arg, NULL);
+  gimple_set_location (str_arg_ssa, loc);
+  gimple_stmt_iterator gsi = *iter;
+  gsi_insert_before (&gsi, str_arg_ssa, GSI_NEW_STMT);
+  build_check_stmt (loc, gimple_assign_lhs (str_arg_ssa), &gsi,
+		    /*before_p=*/false, /*is_store=*/false, 1);
+
+  /* If we initially had an instruction like:
+
+	 int n = strlen (str)
+
+     we now want to instrument the access to str[n], after the
+     instruction above.  */
+
+  /* So let's build the access to str[n], that is, an access through the
+     pointer_plus expr: (_1 + len).  */
+  gimple stmt =
+    gimple_build_assign_with_ops (POINTER_PLUS_EXPR,
+				  make_ssa_name (TREE_TYPE (str_arg),
+						 NULL),
+				  gimple_assign_lhs (str_arg_ssa),
+				  len);
+  gimple_set_location (stmt, loc);
+  gsi_insert_after (&gsi, stmt, GSI_NEW_STMT);
+
+  build_check_stmt (loc, gimple_assign_lhs (stmt), &gsi,
+		    /*before_p=*/false, /*is_store=*/false, 1);
+
+  /* Ensure that *ITER points to the statement logically following the
+     one it was initially pointing to.  */
+  *iter = gsi;
+}
+
+/* If the statement pointed to by the iterator ITER is a call to a
+   builtin memory access function, instrument it and return true.
+   Otherwise, return false.  */
+
+static bool
+maybe_instrument_builtin_call (gimple_stmt_iterator *iter)
+{
+  gimple call = gsi_stmt (*iter);
+  location_t loc = gimple_location (call);
+
+  if (!is_gimple_call (call))
+    return false;
+
+  tree callee = gimple_call_fndecl (call);
+
+  if (!is_builtin_fn (callee)
+      || DECL_BUILT_IN_CLASS (callee) != BUILT_IN_NORMAL)
+    return false;
+
+  tree source0 = NULL_TREE, source1 = NULL_TREE,
+    dest = NULL_TREE, len = NULL_TREE;
+  bool is_store = true;
+
+  switch (DECL_FUNCTION_CODE (callee))
+    {
+      /* (s, s, n) style memops.  */
+    case BUILT_IN_BCMP:
+    case BUILT_IN_MEMCMP:
+      len = gimple_call_arg (call, 2);
+      source0 = gimple_call_arg (call, 0);
+      source1 = gimple_call_arg (call, 1);
+      break;
+
+      /* (src, dest, n) style memops.  */
+    case BUILT_IN_BCOPY:
+      len = gimple_call_arg (call, 2);
+      source0 = gimple_call_arg (call, 0);
+      dest = gimple_call_arg (call, 1);
+      break;
+
+      /* (dest, src, n) style memops.  */
+    case BUILT_IN_MEMCPY:
+    case BUILT_IN_MEMCPY_CHK:
+    case BUILT_IN_MEMMOVE:
+    case BUILT_IN_MEMMOVE_CHK:
+    case BUILT_IN_MEMPCPY:
+    case BUILT_IN_MEMPCPY_CHK:
+      dest = gimple_call_arg (call, 0);
+      source0 = gimple_call_arg (call, 1);
+      len = gimple_call_arg (call, 2);
+      break;
+
+      /* (dest, n) style memops.  */
+    case BUILT_IN_BZERO:
+      dest = gimple_call_arg (call, 0);
+      len = gimple_call_arg (call, 1);
+      break;
+
+      /* (dest, x, n) style memops.  */
+    case BUILT_IN_MEMSET:
+    case BUILT_IN_MEMSET_CHK:
+      dest = gimple_call_arg (call, 0);
+      len = gimple_call_arg (call, 2);
+      break;
+
+    case BUILT_IN_STRLEN:
+      instrument_strlen_call (iter);
+      return true;
+
+    /* And now the __atomic* and __sync builtins.
+       These are handled differently from the classical memory access
+       builtins above.  */
+
+    case BUILT_IN_ATOMIC_LOAD:
+    case BUILT_IN_ATOMIC_LOAD_1:
+    case BUILT_IN_ATOMIC_LOAD_2:
+    case BUILT_IN_ATOMIC_LOAD_4:
+    case BUILT_IN_ATOMIC_LOAD_8:
+    case BUILT_IN_ATOMIC_LOAD_16:
+      is_store = false;
+      /* fall through.  */
+
+    case BUILT_IN_SYNC_FETCH_AND_ADD_1:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_2:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_4:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_8:
+    case BUILT_IN_SYNC_FETCH_AND_ADD_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_SUB_1:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_2:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_4:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_8:
+    case BUILT_IN_SYNC_FETCH_AND_SUB_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_OR_1:
+    case BUILT_IN_SYNC_FETCH_AND_OR_2:
+    case BUILT_IN_SYNC_FETCH_AND_OR_4:
+    case BUILT_IN_SYNC_FETCH_AND_OR_8:
+    case BUILT_IN_SYNC_FETCH_AND_OR_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_AND_1:
+    case BUILT_IN_SYNC_FETCH_AND_AND_2:
+    case BUILT_IN_SYNC_FETCH_AND_AND_4:
+    case BUILT_IN_SYNC_FETCH_AND_AND_8:
+    case BUILT_IN_SYNC_FETCH_AND_AND_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_XOR_1:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_2:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_4:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_8:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_16:
+
+    case BUILT_IN_SYNC_FETCH_AND_NAND_1:
+    case BUILT_IN_SYNC_FETCH_AND_NAND_2:
+    case BUILT_IN_SYNC_FETCH_AND_NAND_4:
+    case BUILT_IN_SYNC_FETCH_AND_NAND_8:
+
+    case BUILT_IN_SYNC_ADD_AND_FETCH_1:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_2:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_4:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_8:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_SUB_AND_FETCH_1:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_2:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_4:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_8:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_OR_AND_FETCH_1:
+    case BUILT_IN_SYNC_OR_AND_FETCH_2:
+    case BUILT_IN_SYNC_OR_AND_FETCH_4:
+    case BUILT_IN_SYNC_OR_AND_FETCH_8:
+    case BUILT_IN_SYNC_OR_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_AND_AND_FETCH_1:
+    case BUILT_IN_SYNC_AND_AND_FETCH_2:
+    case BUILT_IN_SYNC_AND_AND_FETCH_4:
+    case BUILT_IN_SYNC_AND_AND_FETCH_8:
+    case BUILT_IN_SYNC_AND_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_XOR_AND_FETCH_1:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_2:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_4:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_8:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_16:
+
+    case BUILT_IN_SYNC_NAND_AND_FETCH_1:
+    case BUILT_IN_SYNC_NAND_AND_FETCH_2:
+    case BUILT_IN_SYNC_NAND_AND_FETCH_4:
+    case BUILT_IN_SYNC_NAND_AND_FETCH_8:
+
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_1:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_2:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_4:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_8:
+    case BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_16:
+
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_1:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_2:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_4:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_8:
+    case BUILT_IN_SYNC_VAL_COMPARE_AND_SWAP_16:
+
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_1:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_2:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_4:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_8:
+    case BUILT_IN_SYNC_LOCK_TEST_AND_SET_16:
+
+    case BUILT_IN_SYNC_LOCK_RELEASE_1:
+    case BUILT_IN_SYNC_LOCK_RELEASE_2:
+    case BUILT_IN_SYNC_LOCK_RELEASE_4:
+    case BUILT_IN_SYNC_LOCK_RELEASE_8:
+    case BUILT_IN_SYNC_LOCK_RELEASE_16:
+
+    case BUILT_IN_ATOMIC_TEST_AND_SET:
+    case BUILT_IN_ATOMIC_CLEAR:
+    case BUILT_IN_ATOMIC_EXCHANGE:
+    case BUILT_IN_ATOMIC_EXCHANGE_1:
+    case BUILT_IN_ATOMIC_EXCHANGE_2:
+    case BUILT_IN_ATOMIC_EXCHANGE_4:
+    case BUILT_IN_ATOMIC_EXCHANGE_8:
+    case BUILT_IN_ATOMIC_EXCHANGE_16:
+
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_1:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_2:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_4:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_8:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_16:
+
+    case BUILT_IN_ATOMIC_STORE:
+    case BUILT_IN_ATOMIC_STORE_1:
+    case BUILT_IN_ATOMIC_STORE_2:
+    case BUILT_IN_ATOMIC_STORE_4:
+    case BUILT_IN_ATOMIC_STORE_8:
+    case BUILT_IN_ATOMIC_STORE_16:
+
+    case BUILT_IN_ATOMIC_ADD_FETCH_1:
+    case BUILT_IN_ATOMIC_ADD_FETCH_2:
+    case BUILT_IN_ATOMIC_ADD_FETCH_4:
+    case BUILT_IN_ATOMIC_ADD_FETCH_8:
+    case BUILT_IN_ATOMIC_ADD_FETCH_16:
+
+    case BUILT_IN_ATOMIC_SUB_FETCH_1:
+    case BUILT_IN_ATOMIC_SUB_FETCH_2:
+    case BUILT_IN_ATOMIC_SUB_FETCH_4:
+    case BUILT_IN_ATOMIC_SUB_FETCH_8:
+    case BUILT_IN_ATOMIC_SUB_FETCH_16:
+
+    case BUILT_IN_ATOMIC_AND_FETCH_1:
+    case BUILT_IN_ATOMIC_AND_FETCH_2:
+    case BUILT_IN_ATOMIC_AND_FETCH_4:
+    case BUILT_IN_ATOMIC_AND_FETCH_8:
+    case BUILT_IN_ATOMIC_AND_FETCH_16:
+
+    case BUILT_IN_ATOMIC_NAND_FETCH_1:
+    case BUILT_IN_ATOMIC_NAND_FETCH_2:
+    case BUILT_IN_ATOMIC_NAND_FETCH_4:
+    case BUILT_IN_ATOMIC_NAND_FETCH_8:
+    case BUILT_IN_ATOMIC_NAND_FETCH_16:
+
+    case BUILT_IN_ATOMIC_XOR_FETCH_1:
+    case BUILT_IN_ATOMIC_XOR_FETCH_2:
+    case BUILT_IN_ATOMIC_XOR_FETCH_4:
+    case BUILT_IN_ATOMIC_XOR_FETCH_8:
+    case BUILT_IN_ATOMIC_XOR_FETCH_16:
+
+    case BUILT_IN_ATOMIC_OR_FETCH_1:
+    case BUILT_IN_ATOMIC_OR_FETCH_2:
+    case BUILT_IN_ATOMIC_OR_FETCH_4:
+    case BUILT_IN_ATOMIC_OR_FETCH_8:
+    case BUILT_IN_ATOMIC_OR_FETCH_16:
+
+    case BUILT_IN_ATOMIC_FETCH_ADD_1:
+    case BUILT_IN_ATOMIC_FETCH_ADD_2:
+    case BUILT_IN_ATOMIC_FETCH_ADD_4:
+    case BUILT_IN_ATOMIC_FETCH_ADD_8:
+    case BUILT_IN_ATOMIC_FETCH_ADD_16:
+
+    case BUILT_IN_ATOMIC_FETCH_SUB_1:
+    case BUILT_IN_ATOMIC_FETCH_SUB_2:
+    case BUILT_IN_ATOMIC_FETCH_SUB_4:
+    case BUILT_IN_ATOMIC_FETCH_SUB_8:
+    case BUILT_IN_ATOMIC_FETCH_SUB_16:
+
+    case BUILT_IN_ATOMIC_FETCH_AND_1:
+    case BUILT_IN_ATOMIC_FETCH_AND_2:
+    case BUILT_IN_ATOMIC_FETCH_AND_4:
+    case BUILT_IN_ATOMIC_FETCH_AND_8:
+    case BUILT_IN_ATOMIC_FETCH_AND_16:
+
+    case BUILT_IN_ATOMIC_FETCH_NAND_1:
+    case BUILT_IN_ATOMIC_FETCH_NAND_2:
+    case BUILT_IN_ATOMIC_FETCH_NAND_4:
+    case BUILT_IN_ATOMIC_FETCH_NAND_8:
+    case BUILT_IN_ATOMIC_FETCH_NAND_16:
+
+    case BUILT_IN_ATOMIC_FETCH_XOR_1:
+    case BUILT_IN_ATOMIC_FETCH_XOR_2:
+    case BUILT_IN_ATOMIC_FETCH_XOR_4:
+    case BUILT_IN_ATOMIC_FETCH_XOR_8:
+    case BUILT_IN_ATOMIC_FETCH_XOR_16:
+
+    case BUILT_IN_ATOMIC_FETCH_OR_1:
+    case BUILT_IN_ATOMIC_FETCH_OR_2:
+    case BUILT_IN_ATOMIC_FETCH_OR_4:
+    case BUILT_IN_ATOMIC_FETCH_OR_8:
+    case BUILT_IN_ATOMIC_FETCH_OR_16:
+      {
+	dest = gimple_call_arg (call, 0);
+	/* So DEST represents the address of a memory location.
+	   instrument_derefs wants the memory location, so lets
+	   dereference the address DEST before handing it to
+	   instrument_derefs.  */
+	if (TREE_CODE (dest) == ADDR_EXPR)
+	  dest = TREE_OPERAND (dest, 0);
+	else if (TREE_CODE (dest) == SSA_NAME)
+	  dest = build2 (MEM_REF, TREE_TYPE (TREE_TYPE (dest)),
+			 dest, build_int_cst (TREE_TYPE (dest), 0));
+	else
+	  gcc_unreachable ();
+
+	instrument_derefs (iter, dest, loc, is_store);
+	return true;
+      }
+
+    default:
+      /* The other memory access builtins are not instrumented in this
+	 function because they either don't have any length parameter,
+	 or their length parameter is just a limit.  */
+      break;
+    }
+
+  if (len != NULL_TREE)
+    {
+      if (source0 != NULL_TREE)
+	instrument_mem_region_access (source0, len, iter,
+				      loc, /*is_store=*/false);
+      if (source1 != NULL_TREE)
+	instrument_mem_region_access (source1, len, iter,
+				      loc, /*is_store=*/false);
+      else if (dest != NULL_TREE)
+	instrument_mem_region_access (dest, len, iter,
+				      loc, /*is_store=*/true);
+      return true;
+    }
+  return false;
+}
+
+/* Instrument the assignment statement pointed to by ITER if it is
+   subject to instrumentation.  */
+
+static void
+instrument_assignment (gimple_stmt_iterator *iter)
+{
+  gimple s = gsi_stmt (*iter);
+
+  gcc_assert (gimple_assign_single_p (s));
+
+  instrument_derefs (iter, gimple_assign_lhs (s),
+		     gimple_location (s), true);
+  instrument_derefs (iter, gimple_assign_rhs1 (s),
+		     gimple_location (s), false);
+}
+
+/* Instrument the function call pointed to by the iterator ITER, if it
+   is subject to instrumentation.  At the moment, the only function
+   calls that are instrumented are some built-in functions that access
+   memory.  Look at maybe_instrument_builtin_call to learn more.  */
+
+static void
+maybe_instrument_call (gimple_stmt_iterator *iter)
+{
+  maybe_instrument_builtin_call (iter);
 }
 
 /* asan: this looks too complex. Can this be done simpler? */
@@ -686,13 +1241,12 @@ transform_statements (void)
       if (bb->index >= saved_last_basic_block) continue;
       for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
         {
-          gimple s = gsi_stmt (i);
-          if (!gimple_assign_single_p (s))
-	    continue;
-          instrument_derefs (&i, gimple_assign_lhs (s),
-                             gimple_location (s), true);
-          instrument_derefs (&i, gimple_assign_rhs1 (s),
-                             gimple_location (s), false);
+	  gimple s = gsi_stmt (i);
+
+	  if (gimple_assign_single_p (s))
+	    instrument_assignment (&i);
+	  else if (is_gimple_call (s))
+	    maybe_instrument_call (&i);
         }
     }
 }
-- 
		Dodji

* Re: [PATCH 3/3] [asan] Instrument built-in memory access function calls
  2012-10-26 10:55                 ` Dodji Seketeli
@ 2012-10-26 11:33                   ` Jakub Jelinek
  0 siblings, 0 replies; 25+ messages in thread
From: Jakub Jelinek @ 2012-10-26 11:33 UTC (permalink / raw)
  To: Dodji Seketeli; +Cc: GCC Patches, Diego Novillo, Xinliang David Li, Wei Mi

On Fri, Oct 26, 2012 at 12:13:16PM +0200, Dodji Seketeli wrote:
> 	* asan.c (insert_if_then_before_iter, instrument_mem_region_access,
> 	instrument_strlen_call, maybe_instrument_builtin_call,
> 	maybe_instrument_call): New static functions.
> 	(create_cond_insert_point): Renamed
> 	create_cond_insert_point_before_iter into this.  Add a new
> 	parameter to decide whether to insert the condition before or
> 	after the statement iterator.
> 	(build_check_stmt): Adjust for the new create_cond_insert_point.
> 	Add a new parameter to decide whether to add the instrumentation
> 	code before or after the statement iterator.
> 	(instrument_assignment): Factorize from ...
> 	(transform_statements): ... here.  Use maybe_instrument_call to
> 	instrument builtin function calls as well.
> 	(instrument_derefs): Adjust for the new parameter of
> 	build_check_stmt.  Fix detection of bit-field access.

Ok, thanks.

	Jakub

* Re: [PATCH 3/3] [asan] Instrument built-in memory access function calls
  2012-10-24 15:27     ` Dodji Seketeli
  2012-10-24 16:46       ` Jakub Jelinek
@ 2012-10-29 15:15       ` Hans-Peter Nilsson
  1 sibling, 0 replies; 25+ messages in thread
From: Hans-Peter Nilsson @ 2012-10-29 15:15 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: Jakub Jelinek, GCC Patches, Diego Novillo, Xinliang David Li, Wei Mi

On Wed, 24 Oct 2012, Dodji Seketeli wrote:
> Jakub Jelinek <jakub@redhat.com> writes:
> > On Tue, Oct 23, 2012 at 03:11:29PM +0200, Dodji Seketeli wrote:
> > > 	* asan.c (insert_if_then_before_iter)
> > > 	(instrument_mem_region_access)
> > > 	(maybe_instrument_builtin_call, maybe_instrument_call): New static
> >
> > Why not just write it:
> > 	* asan.c (insert_if_then_before_iter, instrument_mem_region_access,
> > 	maybe_instrument_builtin_call, maybe_instrument_call): New static
> > ?
>
> It's emacs that formats it like that automatically.  I am not sure how
> to teach it otherwise.  I have fixed this as you want by doing it "by
> hand".

JFTR, the emacs-formatted-version *is* the correct one, at least
when going by the GNU coding standard.  Maybe something about
being eager to balance parentheses as early as possible. :)
But, I'll keep formatting it like that myself until someone
updates the GCC coding standard.  Hint, like.

brgds, H-P

end of thread, other threads:[~2012-10-29 14:55 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-10-23 13:08 [PATCH 0/3][asan] Instrument memory access builtins calls Dodji Seketeli
2012-10-23 13:11 ` [PATCH 2/3] [asan] Factorize condition insertion code out of build_check_stmt Dodji Seketeli
2012-10-23 13:27   ` Jakub Jelinek
2012-10-24 15:11     ` Dodji Seketeli
2012-10-24 15:44       ` Jakub Jelinek
2012-10-23 13:11 ` [PATCH 1/3] [asan] Make build_check_stmt accept an SSA_NAME for its base Dodji Seketeli
2012-10-23 13:14   ` Jakub Jelinek
2012-10-23 13:21 ` [PATCH 3/3] [asan] Instrument built-in memory access function calls Dodji Seketeli
2012-10-23 14:12   ` Jakub Jelinek
2012-10-23 15:51   ` Xinliang David Li
2012-10-23 16:03     ` Jakub Jelinek
2012-10-23 16:13       ` Xinliang David Li
2012-10-24 15:35       ` Dodji Seketeli
2012-10-24 16:10         ` Jakub Jelinek
2012-10-24 16:15           ` Dodji Seketeli
2012-10-23 16:06   ` Jakub Jelinek
2012-10-24 15:27     ` Dodji Seketeli
2012-10-24 16:46       ` Jakub Jelinek
2012-10-25 21:46         ` Dodji Seketeli
2012-10-25 22:13           ` Jakub Jelinek
2012-10-25 22:39             ` Dodji Seketeli
2012-10-25 22:57               ` Jakub Jelinek
2012-10-26 10:55                 ` Dodji Seketeli
2012-10-26 11:33                   ` Jakub Jelinek
2012-10-29 15:15       ` Hans-Peter Nilsson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).