public inbox for gcc-patches@gcc.gnu.org
* [PATCH] Asan static optimization (draft)
@ 2014-08-08 10:26 Yury Gribov
  2014-08-08 10:37 ` Jakub Jelinek
  2014-08-14 16:53 ` Konstantin Serebryany
  0 siblings, 2 replies; 10+ messages in thread
From: Yury Gribov @ 2014-08-08 10:26 UTC (permalink / raw)
  To: GCC Patches
  Cc: Jakub Jelinek, Marek Polacek, Konstantin Serebryany,
	Dmitry Vyukov, Dodji Seketeli, Yuri Gribov, Viacheslav Garbuzov

[-- Attachment #1: Type: text/plain, Size: 2222 bytes --]

Hi all,

I have been working on an Asan global optimization pass lately. The goal 
is to remove redundant Asan checks from sanitized code. This should 
hopefully reduce Asan's speed/size overhead (which is currently ~2x). 
The patch is not yet ready for trunk (e.g. I haven't done a bootstrap, 
etc., but the Asan testsuite passed without errors) but I thought I'd 
send it for a preliminary review of the algorithm and data structures (*).

The current implementation (based on the existing sanopt pass) uses a 
simple iterative intra-procedural dataflow analysis to compute 
information about live checks. For each pointer we track the size of 
memory that has already been checked for it. During dataflow iterations, 
live checks are merged at block entries in the natural way.

I decided to keep the current (local) Asan optimizations because they 
reduce compilation time by dropping many obviously redundant checks much 
earlier in the compilation pipeline.

Current results seem encouraging: the pass was able to remove 112 
checks (out of 1768) in gcc/asan.c without a significant increase in 
sanopt pass runtime.

Before upstreaming this code, I plan to
1) develop an extensive set of tests to make sure that sanopt performs 
conservative optimization, i.e. does not remove checks too aggressively 
(I guess this is a critically important prerequisite, so any suggestions 
are welcome)
2) disable optimizations for very large functions to avoid unbearable 
compile times
3) do detailed performance and efficiency measurements for Asan-bootstrap

I also have some ideas for improving this code (and I'm certainly open 
to suggestions from the community):
1) propagating checking information through assignments and PHI nodes 
(and maybe even pointer arithmetic) should improve efficiency
2) ditto for optimization of symbolic ranges; actually this looks like a 
relatively easy addition to the current code (I just need to add a list 
of checked symbolic ranges to the check_info structure)
3) in addition to redundant check removal, we could also move duplicate 
checks from e.g. branches of an if-statement to their dominators.

-Y

(*) The patch relies on some additional changes to hash-table and CFG 
code which have not yet been upstreamed, so it won't apply to current 
trunk.

[-- Attachment #2: asan_opt-1.diff --]
[-- Type: text/x-diff, Size: 13868 bytes --]

commit 602792daa983a6304205cd8fe8ec07b9fd348c3e
Author: Yury Gribov <y.gribov@samsung.com>
Date:   Thu Jul 31 15:07:30 2014 +0400

    Global elimination of redundant Asan checks.
    
    2014-08-07  Yury Gribov  <y.gribov@samsung.com>
    
    	* asan.c (maybe_tree_to_shwi): New function.
    	(instrument_derefs): Factor out common code.
    	(instrument_mem_region_access): Likewise.
    	(asan_expand_check_ifn): Factor out common code.
    	Always perform full word checks. Avoid inserting
    	redundant byte checks.
    	(class check_info): New class.
    	(sanopt_info): New struct.
    	(remove_redundant_asan_checks): New function.
    	(pass_sanopt::execute): Call optimizer of Asan checks.
    	* params.def (PARAM_ASAN_OPTIMIZE_REDUNDANT_CHECKS): New parameter.
    	* params.h (ASAN_OPTIMIZE_REDUNDANT_CHECKS): New define.

diff --git a/gcc/asan.c b/gcc/asan.c
index 4e6f438..0795e73 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -24,6 +24,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "coretypes.h"
 #include "tree.h"
 #include "hash-table.h"
+#include "hash-map.h"
 #include "basic-block.h"
 #include "tree-ssa-alias.h"
 #include "internal-fn.h"
@@ -1253,6 +1254,15 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
   return ret;
 }
 
+/* Convert LEN to a HOST_WIDE_INT if possible.
+   Return -1 otherwise.  */
+
+HOST_WIDE_INT
+maybe_tree_to_shwi (tree len)
+{
+  return tree_fits_shwi_p (len) ? tree_to_shwi (len) : -1;
+}
+
 /* Return true if DECL, a global var, might be overridden and needs
    therefore a local alias.  */
 
@@ -1711,8 +1721,7 @@ instrument_derefs (gimple_stmt_iterator *iter, tree t,
       && offset == NULL_TREE
       && bitpos >= 0
       && DECL_SIZE (inner)
-      && tree_fits_shwi_p (DECL_SIZE (inner))
-      && bitpos + bitsize <= tree_to_shwi (DECL_SIZE (inner)))
+      && bitpos + bitsize <= maybe_tree_to_shwi (DECL_SIZE (inner)))
     {
       if (DECL_THREAD_LOCAL_P (inner))
 	return;
@@ -1775,7 +1784,7 @@ instrument_mem_region_access (tree base, tree len,
   tree end = asan_mem_ref_get_end (base, len);
   bool end_instrumented = has_mem_ref_been_instrumented (end, 1);
 
-  HOST_WIDE_INT size_in_bytes = tree_fits_shwi_p (len) ? tree_to_shwi (len) : -1;
+  HOST_WIDE_INT size_in_bytes = maybe_tree_to_shwi (len);
 
   build_check_stmt (location, base, len, size_in_bytes, iter,
 		    /*is_non_zero_len*/size_in_bytes > 0, /*before_p*/true,
@@ -2435,7 +2444,22 @@ asan_expand_check_ifn (gimple_stmt_iterator *iter, bool use_calls)
   tree len = gimple_call_arg (g, 2);
 
   HOST_WIDE_INT size_in_bytes
-    = is_scalar_access && tree_fits_shwi_p (len) ? tree_to_shwi (len) : -1;
+    = is_scalar_access ? maybe_tree_to_shwi (len) : -1;
+
+  /* Some checks might have become effectively empty.  */
+  if ((start_instrumented && end_instrumented)
+      || (start_instrumented && size_in_bytes == 1)
+      || (end_instrumented && size_in_bytes == 1))
+    {
+      gsi_remove (iter, true);
+      return true;
+    }
+
+  /* If the first byte of the word is instrumented,
+     we still prefer to perform a full check
+     instead of instrumenting just the last byte.  */
+  if (start_instrumented && size_in_bytes > 1)
+    start_instrumented = false;
 
   if (use_calls)
     {
@@ -2643,6 +2667,382 @@ gate_asan (void)
 				DECL_ATTRIBUTES (current_function_decl));
 }
 
+/* Class that holds info about sanitized memory regions
+   and performs useful operations on them.  */
+
+class check_info
+{
+public:
+  typedef HOST_WIDE_INT access_range_type;
+
+  check_info () : m (NULL) {}
+
+  check_info (/*const*/ check_info &info) : m (NULL)
+    {
+      *this = info;
+    }
+
+  /* Clear all live checks
+     (e.g. after encountering a call to free).  */
+
+  void empty () { if (m) m->empty (); }
+
+  /* Insert information about generated check.  */
+
+  void update (tree ref, access_range_type range);
+
+  /* Intersect checks coming from multiple basic blocks.  */
+
+  void intersect (/*const*/ check_info &info);
+
+  /* Union checks coming from multiple basic blocks.  */
+
+  void join (/*const*/ check_info &info);
+
+  /* Returns TRUE if object contains all checks from INFO,
+     FALSE otherwise.  */
+
+  bool contains (/*const*/ check_info &info) /*const*/;
+
+  /* Returns length of memory checked for address REF.  */
+
+  access_range_type get_range (tree ref);
+
+  check_info &operator = (/*const*/ check_info &info);
+
+  bool operator == (/*const*/ check_info &info) /*const*/
+    {
+      return contains (info) && info.contains (*this);
+    }
+
+  bool operator != (/*const*/ check_info &info) /*const*/
+    {
+      return !(*this == info);
+    }
+
+  void print (FILE *p, const char *tab) const;
+
+private:
+
+  struct check_info_map_traits : default_hashmap_traits
+  {
+    static inline hashval_t hash (const_tree ref)
+      {
+	return iterative_hash_expr (ref, 0);
+      }
+
+    static inline bool equal (const_tree &ref1, const_tree &ref2)
+      {
+	return operand_equal_p (ref1, ref2, 0);
+      }
+  };
+
+  typedef hash_map<tree, access_range_type, check_info_map_traits> map_type;
+
+  void ensure_map ()
+    {
+      if (!m)
+	m = new map_type (10);
+    }
+
+  map_type *m;
+};
+
+void
+check_info::update (tree t, access_range_type range)
+{
+  ensure_map ();
+  bool existed;
+  access_range_type &old_range = m->get_or_insert (t, &existed);
+  if (!existed || old_range < range)
+    old_range = range;
+}
+
+check_info::access_range_type
+check_info::get_range (tree ref)
+{
+  if (!m)
+    return -1;
+  access_range_type *range = m->get (ref);
+  return range ? *range : -1;
+}
+
+void
+check_info::intersect (/*const*/ check_info &info)
+{
+  if (!m)
+    return;
+  if (!info.m)
+    {
+      empty ();
+      return;
+    }
+  map_type::iterator i;
+  for (i = m->begin (); i != m->end (); ++i)
+    {
+      access_range_type *range = info.m->get (i.key ());
+      if (!range)
+	m->remove (i.key ());
+      else if (*range < *i)
+	*i = *range;
+    }
+}
+
+void
+check_info::join (/*const*/ check_info &info)
+{
+  if (!info.m)
+    return;
+  ensure_map ();
+  map_type::iterator i;
+  for (i = info.m->begin (); i != info.m->end (); ++i)
+    {
+      bool existed;
+      access_range_type &old_range = m->get_or_insert (i.key (), &existed);
+      if (!existed || *i > old_range)
+	old_range = *i;
+    }
+}
+
+bool
+check_info::contains (/*const*/ check_info &info)
+{
+  if (!info.m)
+    return true;
+  if (!m)
+    return info.m->elements () == 0;
+  map_type::iterator i;
+  for (i = info.m->begin (); i != info.m->end (); ++i)
+    {
+      if (!m->get (i.key ()))
+	return false;
+    }
+  return true;
+}
+
+check_info &
+check_info::operator = (/*const*/ check_info &info)
+{
+  if (m)
+    m->empty ();
+  join (info);
+  return *this;
+}
+
+void
+check_info::print (FILE *p, const char *tab) const
+{
+  if (!m)
+    return;
+  map_type::iterator i;
+  for (i = m->begin (); i != m->end (); ++i)
+    {
+      fprintf (p, "%s", tab);
+      print_generic_expr (p, i.key (), 0);
+      fprintf (p, " (size %d)\n", (int) *i);
+    }
+}
+
+struct sanopt_info
+{
+  /* Checks performed in this BB.  */
+  check_info local;
+  /* Checks available on entry to the BB.  */
+  check_info in;
+  /* Checks available at the end of BB.  */
+  check_info out;
+  /* Is this BB in worklist?  */
+  bool avail_in_worklist_p;
+};
+
+/* Remove redundant Asan checks in function FUN.
+
+   TODO:
+   - upward-exposed checks
+   - handle symbolic lengths (somehow).  */
+
+void
+remove_redundant_asan_checks (function *fun)
+{
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+
+  /* Initialize dataflow info.  */
+
+  alloc_aux_for_blocks (fun, sizeof (sanopt_info));
+
+  FOR_ALL_BB_FN (bb, fun)
+    {
+      sanopt_info *info = (sanopt_info *) bb->aux;
+      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+ 	  gimple g = gsi_stmt (gsi);
+	  if (is_gimple_call (g)
+	      && gimple_call_internal_p (g)
+	      && gimple_call_internal_fn (g) == IFN_ASAN_CHECK)
+	    {
+	      HOST_WIDE_INT flags = tree_to_shwi (gimple_call_arg (g, 0));
+	      tree addr = gimple_call_arg (g, 1);
+	      tree len = gimple_call_arg (g, 2);
+
+	      HOST_WIDE_INT size_in_bytes = maybe_tree_to_shwi (len);
+	      if (size_in_bytes == -1 && (flags & ASAN_CHECK_NON_ZERO_LEN))
+		size_in_bytes = 1;
+
+	      if (size_in_bytes != -1)
+		info->local.update (addr, size_in_bytes);
+	    }
+	  else if (is_gimple_call (g) && !nonfreeing_call_p (g))
+	    {
+	      info->local.empty ();
+	    }
+	}
+
+      if (dump_file && (dump_flags & TDF_DETAILS))
+	{
+	  fprintf (dump_file, "Checks generated in bb %d\n", bb->index);
+	  info->local.print (dump_file, "  ");
+	}
+    }
+
+  basic_block *worklist, *qin, *qout, *qend;
+  unsigned qlen;
+
+  /* Allocate a worklist array/queue.  Entries are only added to the
+     list if they were not already on the list.  So the size is
+     bounded by the number of basic blocks in the region.  */
+  qlen = n_basic_blocks_for_fn (fun) - NUM_FIXED_BLOCKS;
+  qin = qout = worklist =
+    XNEWVEC (basic_block, qlen);
+
+  /* Put blocks on the worklist.  */
+  FOR_EACH_BB_FN (bb, fun)
+    {
+      sanopt_info *info = (sanopt_info *) bb->aux;
+
+      /* Seed OUT with LOCAL.  */
+      info->out = info->local;
+
+      info->avail_in_worklist_p = true;
+
+      *qin++ = bb;
+    }
+
+  qin = worklist;
+  qend = &worklist[qlen];
+
+  /* Iterate until the worklist is empty.  */
+  while (qlen)
+    {
+      /* Take the first entry off the worklist.  */
+      bb = *qout++;
+      qlen--;
+
+      if (qout >= qend)
+	qout = worklist;
+
+      if (dump_file && (dump_flags & TDF_DETAILS))
+	fprintf (dump_file, "Recomputing dataflow info for bb %d\n", bb->index);
+
+      sanopt_info *info = (sanopt_info *) bb->aux;
+
+      info->avail_in_worklist_p = false;
+
+      /* Recompute IN.  */
+      if (EDGE_COUNT (bb->preds) > 0)
+	info->in = ((sanopt_info *) EDGE_PRED (bb, 0)->src->aux)->out;
+      else
+	info->in.empty ();
+      for (unsigned i = 1; i < EDGE_COUNT (bb->preds); ++i)
+	info->in.intersect (((sanopt_info *) EDGE_PRED (bb, i)->src->aux)->out);
+
+      /* Recompute OUT.  */
+      check_info tmp = info->in;
+      tmp.join (info->local);
+      if (tmp != info->out)
+	{
+	  info->out = tmp;
+	  if (dump_file && (dump_flags & TDF_DETAILS))
+	    {
+	      fprintf (dump_file, "Output checks changed for bb %d:\n", bb->index);
+	      info->out.print (dump_file, "  ");
+	    }
+
+	  /* Add successors to worklist.  */
+	  for (unsigned i = 0; i < EDGE_COUNT (bb->succs); ++i)
+	    {
+	      basic_block succ = EDGE_SUCC (bb, i)->dest;
+	      sanopt_info *succ_info = (sanopt_info *) succ->aux;
+	      if (!succ_info->avail_in_worklist_p
+		  && succ->index >= NUM_FIXED_BLOCKS)
+		{
+		  *qin++ = succ;
+		  succ_info->avail_in_worklist_p = true;
+		  qlen++;
+
+		  if (qin >= qend)
+		    qin = worklist;
+		}
+	    }
+	}
+    }
+
+  free (worklist);
+
+  /* Remove redundant checks.  */
+
+  FOR_ALL_BB_FN (bb, fun)
+    {
+      sanopt_info *info = (sanopt_info *) bb->aux;
+      check_info cur = info->in;
+      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
+	{
+ 	  gimple g = gsi_stmt (gsi);
+
+	  if (is_gimple_call (g)
+	      && gimple_call_internal_p (g)
+	      && gimple_call_internal_fn (g) == IFN_ASAN_CHECK)
+	    {
+	      HOST_WIDE_INT flags = tree_to_shwi (gimple_call_arg (g, 0));
+	      tree addr = gimple_call_arg (g, 1);
+	      tree len = gimple_call_arg (g, 2);
+
+	      HOST_WIDE_INT size_in_bytes = maybe_tree_to_shwi (len);
+	      check_info::access_range_type range = cur.get_range (addr);
+	      if (size_in_bytes == -1 || range < size_in_bytes)
+		{
+		  if (range > 0)
+		    {
+		      /* Remember that first byte is instrumented.  */
+		      flags |= ASAN_CHECK_START_INSTRUMENTED;
+		      *gimple_call_arg_ptr (g, 0)
+			= build_int_cst (integer_type_node, flags);
+		    }
+		  gsi_next (&gsi);
+		  continue;
+		}
+
+	      if (dump_file && (dump_flags & TDF_DETAILS))
+		{
+		  fprintf (dump_file, "Removing redundant Asan check in bb %d: ", bb->index);
+		  print_gimple_stmt (dump_file, gsi_stmt (gsi), 0, 0);
+		  fputc ('\n', dump_file);
+		}
+
+	      gsi_remove (&gsi, true);
+	      continue;
+	    }
+	  else if (is_gimple_call (g) && !nonfreeing_call_p (g))
+	    {
+	      cur.empty ();
+	    }
+	  gsi_next (&gsi);
+	}
+    }
+
+  free_aux_for_blocks (fun);
+}
+
 namespace {
 
 const pass_data pass_data_asan =
@@ -2751,6 +3151,10 @@ pass_sanopt::execute (function *fun)
 {
   basic_block bb;
 
+  if ((flag_sanitize & SANITIZE_ADDRESS)
+      && PARAM_ASAN_OPTIMIZE_REDUNDANT_CHECKS)
+    remove_redundant_asan_checks (fun);
+
   int asan_num_accesses = 0;
   if (flag_sanitize & SANITIZE_ADDRESS)
     {
diff --git a/gcc/params.def b/gcc/params.def
index aefdd07..a53c00f 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -1094,6 +1094,11 @@ DEFPARAM (PARAM_ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD,
          "in function becomes greater or equal to this number",
          7000, 0, INT_MAX)
 
+DEFPARAM (PARAM_ASAN_OPTIMIZE_REDUNDANT_CHECKS,
+         "asan-optimize-redundant-checks",
+         "Remove redundant Asan memory checks",
+         1, 0, 1)
+
 DEFPARAM (PARAM_UNINIT_CONTROL_DEP_ATTEMPTS,
 	  "uninit-control-dep-attempts",
 	  "Maximum number of nested calls to search for control dependencies "
diff --git a/gcc/params.h b/gcc/params.h
index d488e32..14a2a3e 100644
--- a/gcc/params.h
+++ b/gcc/params.h
@@ -234,5 +234,7 @@ extern void init_param_values (int *params);
   PARAM_VALUE (PARAM_ASAN_USE_AFTER_RETURN)
 #define ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD \
   PARAM_VALUE (PARAM_ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD)
+#define ASAN_OPTIMIZE_REDUNDANT_CHECKS \
+  PARAM_VALUE (PARAM_ASAN_OPTIMIZE_REDUNDANT_CHECKS)
 
 #endif /* ! GCC_PARAMS_H */

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Asan static optimization (draft)
  2014-08-08 10:26 [PATCH] Asan static optimization (draft) Yury Gribov
@ 2014-08-08 10:37 ` Jakub Jelinek
  2014-08-08 10:43   ` Dmitry Vyukov
  2014-08-14 16:53 ` Konstantin Serebryany
  1 sibling, 1 reply; 10+ messages in thread
From: Jakub Jelinek @ 2014-08-08 10:37 UTC (permalink / raw)
  To: Yury Gribov
  Cc: GCC Patches, Marek Polacek, Konstantin Serebryany, Dmitry Vyukov,
	Dodji Seketeli, Yuri Gribov, Viacheslav Garbuzov

On Fri, Aug 08, 2014 at 02:26:25PM +0400, Yury Gribov wrote:
> I have been working on Asan global optimization pass lately. The goal is to
> remove redundant Asan checks from sanitized code. This should hopefully
> reduce Asan's speed/size overhead (which is currently ~2x). The patch is not
> yet ready for trunk (e.g. I haven't done bootstrap, etc. but Asan testsuite
> passed without errors) but I thought I'd send it for preliminary review of
> algorithm and data structures (*).

Thanks for working on it, I've just quickly skimmed it and it looks
reasonable.

Similar optimization could be used for tsan builtins, or some of the ubsan
builtins (which is the reason why the pass is called sanopt).

> 3) in addition to redundant check removal, we could also move duplicate
> checks from e.g. branches of if-statement to their dominators.

For that you'd need to be extra careful, you would need to avoid doing that
if any path doesn't contain the check, or if there are calls that e.g. could
not return (abort/exit/an infinite loop) in between, etc.

	Jakub


* Re: [PATCH] Asan static optimization (draft)
  2014-08-08 10:37 ` Jakub Jelinek
@ 2014-08-08 10:43   ` Dmitry Vyukov
  2014-08-15  6:59     ` Yuri Gribov
  0 siblings, 1 reply; 10+ messages in thread
From: Dmitry Vyukov @ 2014-08-08 10:43 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Yury Gribov, GCC Patches, Marek Polacek, Konstantin Serebryany,
	Dodji Seketeli, Yuri Gribov, Viacheslav Garbuzov

On Fri, Aug 8, 2014 at 2:37 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Fri, Aug 08, 2014 at 02:26:25PM +0400, Yury Gribov wrote:
>> I have been working on Asan global optimization pass lately. The goal is to
>> remove redundant Asan checks from sanitized code. This should hopefully
>> reduce Asan's speed/size overhead (which is currently ~2x). The patch is not
>> yet ready for trunk (e.g. I haven't done bootstrap, etc. but Asan testsuite
>> passed without errors) but I thought I'd send it for preliminary review of
>> algorithm and data structures (*).
>
> Thanks for working on it, I've just quickly skimmed it and it looks
> reasonable.
>
> Similar optimization could be used for tsan builtins, or some of the ubsan
> builtins (which is the reason why the pass is called sanopt).


That would be great.
Note that tsan optimizations must be based on different criteria. Any
unlock/release operation must reset the set of already checked
locations. Also, tsan does not care about variables that have not
escaped to other threads; e.g. it does not care about
local_stack_array[rand()] expressions, while asan does.


>> 3) in addition to redundant check removal, we could also move duplicate
>> checks from e.g. branches of if-statement to their dominators.
>
> For that you'd need to be extra careful, you would need to avoid doing that
> if any path doesn't contain the check, or if there are calls that e.g. could
> not return (abort/exit/have infinite loop) in between, etc.
>
>         Jakub


* Re: [PATCH] Asan static optimization (draft)
  2014-08-08 10:26 [PATCH] Asan static optimization (draft) Yury Gribov
  2014-08-08 10:37 ` Jakub Jelinek
@ 2014-08-14 16:53 ` Konstantin Serebryany
  2014-08-15  6:55   ` Yuri Gribov
  1 sibling, 1 reply; 10+ messages in thread
From: Konstantin Serebryany @ 2014-08-14 16:53 UTC (permalink / raw)
  To: Yury Gribov
  Cc: GCC Patches, Jakub Jelinek, Marek Polacek, Dmitry Vyukov,
	Dodji Seketeli, Yuri Gribov, Viacheslav Garbuzov

Indeed, thanks for working on this.
We've been wanting such an optimization phase from day one, but never
got to implementing it (except for a few simple ones).
https://code.google.com/p/address-sanitizer/wiki/CompileTimeOptimizations
There have been several attempts outside of our team to do such
optimizations, but none of them made it to llvm trunk.

In order for your work to be generally useful, I'd ask several things:
- Update https://code.google.com/p/address-sanitizer/wiki/CompileTimeOptimizations
with examples that will be handled
- Create small standalone test cases in C
- Don't put the tests under GPL (otherwise we'll not be able to reuse
them in LLVM)

>> make sure that sanopt performs conservative optimization
Yes. I don't know a good solution to this problem, which is why we did
not attack it before.
Increasing asan speed will save lots of CPU time, but missing a single software
bug due to an overly aggressive optimization may cost much more.

We can do a detailed analysis of *simple* peephole-like optimizers,
but *proving* that a complex optimizer does not remove needed checks
is hardly practical.


On Fri, Aug 8, 2014 at 3:26 AM, Yury Gribov <y.gribov@samsung.com> wrote:
> Hi all,
>
> I have been working on Asan global optimization pass lately. The goal is to
> remove redundant Asan checks from sanitized code. This should hopefully
> reduce Asan's speed/size overhead (which is currently ~2x). The patch is not
> yet ready for trunk (e.g. I haven't done bootstrap, etc. but Asan testsuite
> passed without errors) but I thought I'd send it for preliminary review of
> algorithm and data structures (*).
>
> Current implementation (based on existing sanopt pass) uses a simple
> iterative intra-procedural dataflow to compute information about living
> checks. For each pointer we track the size of memory that was already
> checked for it. During dataflow iterations, living checks are merged at
> blocks start in a natural way.
>
> I decided to keep current (local) Asan optimizations because they reduce
> compilation time by dropping many obviously redundant checks much earlier in
> the compilation pipeline.
>
> Current results seem to be encouraging: the pass was able to remove 112
> checks (out of 1768) in gcc/asan.c without significant increase in sanopt
> pass runtime.
>
> Before upstreaming this code, I plan to
> 1) develop extensive set of tests to make sure that sanopt performs
> conservative optimization i.e. does not remove checks too aggressively (I
> guess this is a critically important prerequisite so any suggestions are
> welcomed)
> 2) disable optimizations for very large functions to avoid unbearable
> compile times
> 3) do detailed performance and efficiency measurements for Asan-bootstrap
>
> I also have some ideas for improving this code (and I'm certainly open to
> suggestions from community):
> 1) propagating checking information through assignments and PHI-nodes (and
> maybe even pointer arithmetic) should improve efficiency
> 2) ditto for optimization of symbolic ranges; actually this looks like a
> relatively easy addition to current code (I just need to keep a list of
> checked symbolic ranges to check_info structure)
> 3) in addition to redundant check removal, we could also move duplicate
> checks from e.g. branches of if-statement to their dominators.
>
> -Y
>
> (*) The patch relies on some additional changes in hash-table and CFG which
> have not yet been upstreamed so it won't go with current trunk.


* Re: [PATCH] Asan static optimization (draft)
  2014-08-14 16:53 ` Konstantin Serebryany
@ 2014-08-15  6:55   ` Yuri Gribov
  2014-08-15 18:34     ` Konstantin Serebryany
  0 siblings, 1 reply; 10+ messages in thread
From: Yuri Gribov @ 2014-08-15  6:55 UTC (permalink / raw)
  To: Konstantin Serebryany
  Cc: Yury Gribov, GCC Patches, Jakub Jelinek, Marek Polacek,
	Dmitry Vyukov, Dodji Seketeli, Viacheslav Garbuzov

On Thu, Aug 14, 2014 at 8:53 PM, Konstantin Serebryany
<konstantin.s.serebryany@gmail.com> wrote:
> In order for your work to be generally useful, I'd ask several things:
> - Update https://code.google.com/p/address-sanitizer/wiki/CompileTimeOptimizations
> with examples that will be handled

Done (to be honest, I only plan to do full redundancy elimination for
now; hoisting would hopefully follow later). Note that I'm still
experimenting, so there may be some changes in the actual
implementation.

> - Create small standalone test cases in C
> - Don't put the tests under GPL (otherwise we'll not be able to reuse
> them in LLVM)

I already have a bunch of tests (which I plan to extend further). How
should I submit them so that they can be reused by LLVM?

>>> make sure that sanopt performs conservative optimization
> Yes. I don't know a good solution to this problem, which is why we did
> not attack it before.
> Increasing asan speed will save lots of CPU time, but missing a single software
> bug due to an overly aggressive optimization may cost much more.

Yeah. I thought about manually inspecting the optimizations that are
performed for some large files (e.g. GCC's asan.c) and maybe doing
some random verification of Asan trophies
(http://code.google.com/p/address-sanitizer/wiki/FoundBugs). Ideally
we'd have some test generator, but making a reasonable one for C
sounds laughable. Perhaps there is some prior work on verification of
Java range check optimizers?

-Y


* Re: [PATCH] Asan static optimization (draft)
  2014-08-08 10:43   ` Dmitry Vyukov
@ 2014-08-15  6:59     ` Yuri Gribov
  0 siblings, 0 replies; 10+ messages in thread
From: Yuri Gribov @ 2014-08-15  6:59 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Jakub Jelinek, Yury Gribov, GCC Patches, Marek Polacek,
	Konstantin Serebryany, Dodji Seketeli, Viacheslav Garbuzov

On Fri, Aug 8, 2014 at 2:43 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> Similar optimization could be used for tsan builtins, or some of the ubsan
>> builtins (which is the reason why the pass is called sanopt).
>
> That would be great.
> Note that tsan optimizations must be based around a different
> criteria. Any unlock/release operations must reset set of already
> checked locations.

Right, I think the general algorithm should be quite similar; only the
KILL sets and the actual check instructions would be different.
Unfortunately I don't (yet) have any experience with either TSan or
UBSan, so I'd prefer to leave their optimization for the future. The
code should be straightforward to adapt (via templates or callbacks in
a couple of places).

-Y


* Re: [PATCH] Asan static optimization (draft)
  2014-08-15  6:55   ` Yuri Gribov
@ 2014-08-15 18:34     ` Konstantin Serebryany
  2014-08-18 20:06       ` Yuri Gribov
  0 siblings, 1 reply; 10+ messages in thread
From: Konstantin Serebryany @ 2014-08-15 18:34 UTC (permalink / raw)
  To: Yuri Gribov
  Cc: Yury Gribov, GCC Patches, Jakub Jelinek, Marek Polacek,
	Dmitry Vyukov, Dodji Seketeli, Viacheslav Garbuzov

On Thu, Aug 14, 2014 at 11:55 PM, Yuri Gribov <tetra2005@gmail.com> wrote:
> On Thu, Aug 14, 2014 at 8:53 PM, Konstantin Serebryany
> <konstantin.s.serebryany@gmail.com> wrote:
>> In order for your work to be generally useful, I'd ask several things:
>> - Update https://code.google.com/p/address-sanitizer/wiki/CompileTimeOptimizations
>> with examples that will be handled
>
> Done (to be honest I only plan to do full redundancy elimination for
> now, hoisting would hopefully follow later). Note I'm still
> experimenting so there may be some changes in actual implementation.

Thanks.
If we are running at -O0, we should not care about optimizing asan
instrumentation.
If this is -O1 or higher, then most (but not all) of your cases
*should* be optimized by the compiler before asan kicks in.
(This may be different for GCC-asan because GCC-asan runs a bit too
early, AFAICT. Maybe this *is* the problem we need to fix.)
If there is a case where the regular optimizer could optimize the code
before asan but doesn't, we should fix the general optimizer or the
phase ordering instead of enhancing the asan opt phase.
I am mainly interested in cases where the general optimizer cannot
possibly improve the code, but asan opt can eliminate redundant
checks.


>
>> - Create small standalone test cases in C
>> - Don't put the tests under GPL (otherwise we'll not be able to reuse
>> them in LLVM)
>
> I already have a bunch of tests (which I plan to extend further). How
> should I submit them s.t. they could be reused by LLVM?

Maybe just start accumulating tests on the CompileTimeOptimizations wiki
(as full functions that one can copy-paste into a .c file and build)?
Once some new optimization is implemented in a compiler X, we'll copy
the test with proper harness code (FileCheck/dejagnu/etc.) to X's
repository.

>

>>>> make sure that sanopt performs conservative optimization
>> Yes. I don't know a good solution to this problem, which is why we did
>> not attack it before.
>> Increasing asan speed will save lots of CPU time, but missing a single software
>> bug due to an overly aggressive optimization may cost much more.
>
> Yeah. I thought about manually inspecting optimizations that are
> performed for some large files (e.g. GCC's asan.c) and maybe doing
> some  random verifications of Asan trophies
> (http://code.google.com/p/address-sanitizer/wiki/FoundBugs). Ideally
> we'd have some test generator but making a reasonable one for C sounds
> laughable. Perhaps there is some prior work on verification of Java
> range checks optimizers?

There is a ton of work on range check optimizers, but none of it
fully applies to asan, since asan also checks for use-after-free and
the init-order fiasco.


>
> -Y


* Re: [PATCH] Asan static optimization (draft)
  2014-08-15 18:34     ` Konstantin Serebryany
@ 2014-08-18 20:06       ` Yuri Gribov
  2014-08-19 23:43         ` Konstantin Serebryany
  0 siblings, 1 reply; 10+ messages in thread
From: Yuri Gribov @ 2014-08-18 20:06 UTC (permalink / raw)
  To: Konstantin Serebryany
  Cc: Yury Gribov, GCC Patches, Jakub Jelinek, Marek Polacek,
	Dmitry Vyukov, Dodji Seketeli, Viacheslav Garbuzov

On Fri, Aug 15, 2014 at 10:34 PM, Konstantin Serebryany
<konstantin.s.serebryany@gmail.com> wrote:
> If this is -O1 or higher,  then most (but not all) of your cases
> *should* be optimized by the compiler before asan kicks in.

You mean hoisting memory accesses from branches?
Sure, the tests are simplified beyond all limits.
On the other hand even basic aliasing constraints
would obviously prevent optimization of pointer accesses
(but not of Asan checks!).

> (This may be different for GCC-asan because GCC-asan runs a bit too
> early, AFAICT. Maybe this *is* the problem we need to fix).

Thanks for mentioning this, I'll try to experiment
with moving Asan pass across the compilation pipeline and
report my findings.

> I am mainly interested in cases where the general optimizer can not
> possibly improve the code,
> but asan opt can eliminate redundant checks.

As I mentioned, I believe that aliasing may often restrict
the optimizer's code hoisting ability.
And we certainly need Asan opt for combining checks.

>> I already have a bunch of tests (which I plan to extend further). How
>> should I submit them s.t. they could be reused by LLVM?
>
> Maybe just start accumulating tests on the CompileTimeOptimizations wiki
> (as full functions that one can copy-paste to a .c file and build)?
> Once some new optimization is implemented in a compiler X,
> we'll copy the test with proper harness code (FileCheck/dejagnu/etc)
> to the X's repository

OK, so I'll first post the tests on the wiki and explicitly reference
their origin when submitting patches.

>> Perhaps there is some prior work on verification of Java
>> range check optimizers?
>
> There is a ton of work on range check optimizers, but none of it
> fully applies to asan, since asan also checks for use-after-free
> and init-order-fiasco.

Could you give more details? How would
use-after-free/init-order-fiasco influence verification of redundant
check elimination?

-Y


* Re: [PATCH] Asan static optimization (draft)
  2014-08-18 20:06       ` Yuri Gribov
@ 2014-08-19 23:43         ` Konstantin Serebryany
  2014-08-20  0:06           ` Konstantin Serebryany
  0 siblings, 1 reply; 10+ messages in thread
From: Konstantin Serebryany @ 2014-08-19 23:43 UTC (permalink / raw)
  To: Yuri Gribov
  Cc: Yury Gribov, GCC Patches, Jakub Jelinek, Marek Polacek,
	Dmitry Vyukov, Dodji Seketeli, Viacheslav Garbuzov

On Mon, Aug 18, 2014 at 1:05 PM, Yuri Gribov <tetra2005@gmail.com> wrote:
> On Fri, Aug 15, 2014 at 10:34 PM, Konstantin Serebryany
> <konstantin.s.serebryany@gmail.com> wrote:
>> If this is -O1 or higher,  then most (but not all) of your cases
>> *should* be optimized by the compiler before asan kicks in.
>
> You mean hoisting memory accesses from branches?
> Sure, the tests are simplified beyond all limits.
> On the other hand even basic aliasing constraints
> would obviously prevent optimization of pointer accesses
> (but not of Asan checks!).

Exactly. I am interested in full, non-simplified tests where the
regular optimizer has no chance, but an asan- (tsan-, msan-) specific
optimizer does.

>
>> (This may be different for GCC-asan because GCC-asan runs a bit too
>> early, AFAICT. Maybe this *is* the problem we need to fix).
>
> Thanks for mentioning this, I'll try to experiment
> with moving Asan pass across the compilation pipeline and
> report my findings.
>
>> I am mainly interested in cases where the general optimizer can not
>> possibly improve the code,
>> but asan opt can eliminate redundant checks.
>
> As I mentioned, I believe that aliasing may often restrict
> the optimizer's code hoisting ability.
> And we certainly need Asan opt for combining checks.
>
>>> I already have a bunch of tests (which I plan to extend further). How
>>> should I submit them s.t. they could be reused by LLVM?
>>
>> Maybe just start accumulating tests on the CompileTimeOptimizations wiki
>> (as full functions that one can copy-paste to a .c file and build)?
>> Once some new optimization is implemented in a compiler X,
>> we'll copy the test with proper harness code (FileCheck/dejagnu/etc)
>> to the X's repository
>
> OK, so I'll first post the tests on the wiki and explicitly reference
> their origin when submitting patches.
>
>>> Perhaps there is some prior work on verification of Java
>>> range check optimizers?
>>
>> There is a ton of work on range check optimizers, but none of it
>> fully applies to asan, since asan also checks for use-after-free
>> and init-order-fiasco.
>
> Could you give more details? How would
> use-after-free/init-order-fiasco influence verification of redundant
> check elimination?


Consider this:

extern void foo(int *);
int *a = new int [10];
foo(a);
... = a[5];

Here, a[5] cannot go out of bounds, so regular Java-oriented
algorithms will mark this access as safe.
However, foo() may delete[] 'a', and then we'd have a use-after-free.

--kcc


* Re: [PATCH] Asan static optimization (draft)
  2014-08-19 23:43         ` Konstantin Serebryany
@ 2014-08-20  0:06           ` Konstantin Serebryany
  0 siblings, 0 replies; 10+ messages in thread
From: Konstantin Serebryany @ 2014-08-20  0:06 UTC (permalink / raw)
  To: Yuri Gribov
  Cc: Yury Gribov, GCC Patches, Jakub Jelinek, Marek Polacek,
	Dmitry Vyukov, Dodji Seketeli, Viacheslav Garbuzov

BTW, here is one particular kind of static analysis which I am very
much interested in:
https://code.google.com/p/address-sanitizer/wiki/CompileTimeOptimizations#Escape_analysis_for_use-after-return

Here we can prove that 'a' does not escape foo(), and thus we can avoid
putting it on the fake stack.

int bar();
int foo(unsigned i) {
  int a[3] = {bar(), bar(), bar()};
  return a[i % 3];
}

--kcc

On Tue, Aug 19, 2014 at 4:42 PM, Konstantin Serebryany
<konstantin.s.serebryany@gmail.com> wrote:
> On Mon, Aug 18, 2014 at 1:05 PM, Yuri Gribov <tetra2005@gmail.com> wrote:
>> On Fri, Aug 15, 2014 at 10:34 PM, Konstantin Serebryany
>> <konstantin.s.serebryany@gmail.com> wrote:
>>> If this is -O1 or higher,  then most (but not all) of your cases
>>> *should* be optimized by the compiler before asan kicks in.
>>
>> You mean hoisting memory accesses from branches?
>> Sure, the tests are simplified beyond all limits.
>> On the other hand even basic aliasing constraints
>> would obviously prevent optimization of pointer accesses
>> (but not of Asan checks!).
>
> Exactly. I am interested in full, non-simplified tests where the
> regular optimizer has no chance, but an asan- (tsan-, msan-) specific
> optimizer does.
>
>>
>>> (This may be different for GCC-asan because GCC-asan runs a bit too
>>> early, AFAICT. Maybe this *is* the problem we need to fix).
>>
>> Thanks for mentioning this, I'll try to experiment
>> with moving Asan pass across the compilation pipeline and
>> report my findings.
>>
>>> I am mainly interested in cases where the general optimizer can not
>>> possibly improve the code,
>>> but asan opt can eliminate redundant checks.
>>
>> As I mentioned, I believe that aliasing may often restrict
>> the optimizer's code hoisting ability.
>> And we certainly need Asan opt for combining checks.
>>
>>>> I already have a bunch of tests (which I plan to extend further). How
>>>> should I submit them s.t. they could be reused by LLVM?
>>>
>>> Maybe just start accumulating tests on the CompileTimeOptimizations wiki
>>> (as full functions that one can copy-paste to a .c file and build)?
>>> Once some new optimization is implemented in a compiler X,
>>> we'll copy the test with proper harness code (FileCheck/dejagnu/etc)
>>> to the X's repository
>>
>> OK, so I'll first post the tests on the wiki and explicitly reference
>> their origin when submitting patches.
>>
>>>> Perhaps there is some prior work on verification of Java
>>>> range check optimizers?
>>>
>>> There is a ton of work on range check optimizers, but none of it
>>> fully applies to asan, since asan also checks for use-after-free
>>> and init-order-fiasco.
>>
>> Could you give more details? How would
>> use-after-free/init-order-fiasco influence verification of redundant
>> check elimination?
>
>
> Consider this:
>
> extern void foo(int *);
> int *a = new int [10];
> foo(a);
> ... = a[5];
>
> Here, a[5] cannot go out of bounds, so regular Java-oriented
> algorithms will mark this access as safe.
> However, foo() may delete[] 'a', and then we'd have a use-after-free.
>
> --kcc


end of thread, other threads:[~2014-08-20  0:06 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-08 10:26 [PATCH] Asan static optimization (draft) Yury Gribov
2014-08-08 10:37 ` Jakub Jelinek
2014-08-08 10:43   ` Dmitry Vyukov
2014-08-15  6:59     ` Yuri Gribov
2014-08-14 16:53 ` Konstantin Serebryany
2014-08-15  6:55   ` Yuri Gribov
2014-08-15 18:34     ` Konstantin Serebryany
2014-08-18 20:06       ` Yuri Gribov
2014-08-19 23:43         ` Konstantin Serebryany
2014-08-20  0:06           ` Konstantin Serebryany
