public inbox for gcc-cvs@sourceware.org
* [gcc r12-8632] analyzer: handle repeated accesses after init of unknown size [PR105285]
@ 2022-07-27 21:55 David Malcolm
From: David Malcolm @ 2022-07-27 21:55 UTC (permalink / raw)
  To: gcc-cvs

https://gcc.gnu.org/g:05530fcea07a9ee5c7501867f3f11f0fbc504a06

commit r12-8632-g05530fcea07a9ee5c7501867f3f11f0fbc504a06
Author: David Malcolm <dmalcolm@redhat.com>
Date:   Wed Jul 27 17:38:53 2022 -0400

    analyzer: handle repeated accesses after init of unknown size [PR105285]
    
    (cherry-picked from r13-7-g00c4405cd7f6a144d0a439e4d848d246920e6ff3)
    
    PR analyzer/105285 reports a false positive from
    -Wanalyzer-null-dereference on git.git's reftable/reader.c.
    
    A reduced version of the problem can be seen in test_1a of
    gcc.dg/analyzer/symbolic-12.c in the following:
    
    void test_1a (void *p, unsigned next_off)
    {
      struct st_1 *r = p;
    
      external_fn();
    
      if (next_off >= r->size)
        return;
    
      if (next_off >= r->size)
        /* We should have already returned if this is the case.  */
        __analyzer_dump_path (); /* { dg-bogus "path" } */
    }
    
    where the analyzer erroneously considers this path, in which
    (next_off >= r->size) is first false and then true:
    
    symbolic-12.c: In function ‘test_1a’:
    symbolic-12.c:22:5: note: path
       22 |     __analyzer_dump_path (); /* { dg-bogus "path" } */
          |     ^~~~~~~~~~~~~~~~~~~~~~~
      ‘test_1a’: events 1-5
        |
        |   17 |   if (next_off >= r->size)
        |      |      ^
        |      |      |
        |      |      (1) following ‘false’ branch...
        |......
        |   20 |   if (next_off >= r->size)
        |      |      ~            ~~~~~~~
        |      |      |             |
        |      |      |             (2) ...to here
        |      |      (3) following ‘true’ branch...
        |   21 |     /* We should have already returned if this is the case.  */
        |   22 |     __analyzer_dump_path (); /* { dg-bogus "path" } */
        |      |     ~~~~~~~~~~~~~~~~~~~~~~~
        |      |     |
        |      |     (4) ...to here
        |      |     (5) here
        |
    
    The root cause is that, at the call to the external function, the
    analyzer considers the cluster for *p to have been touched and binds it
    to a conjured_svalue.  Because p is void *, no particular size is known
    for the write, so the cluster is bound using a symbolic key covering the
    base region.  Later, the accesses to r->size are handled by
    binding_cluster::get_any_binding, but binding_cluster::get_binding fails
    to find a match for the concrete field lookup because the key for the
    binding is symbolic, and control reaches this code:
    
    /* If this cluster has been touched by a symbolic write, then the content
       of any subregion not currently specifically bound is "UNKNOWN".  */
    if (m_touched)
      {
        region_model_manager *rmm_mgr = mgr->get_svalue_manager ();
        return rmm_mgr->get_or_create_unknown_svalue (reg->get_type ());
      }
    
    Hence each access to r->size is an unknown svalue, and thus the
    condition (next_off >= r->size) isn't tracked, leading to the path with
    contradictory conditions being treated as satisfiable.
    
    In the original reproducer in git's reftable/reader.c, the call to the
    external fn is:
      reftable_record_type(rec)
    which is considered to possibly write to *rec.  Here *rec is *tab, where
    tab is the void * argument to reftable_reader_seek_void, so after the
    call to reftable_record_type some arbitrary amount of *rec could have
    been written to.
    
    This patch fixes things by detecting the "this cluster has been 'filled'
    with a conjured value of unknown size" case, and handling
    get_any_binding on it by returning a sub_svalue of the conjured_svalue.
    Repeated accesses to r->size then give the same symbolic value, so the
    constraint manager rejects the bogus execution path, fixing the false
    positive.
    
    gcc/analyzer/ChangeLog:
            PR analyzer/105285
            * store.cc (binding_cluster::get_any_binding): Handle accessing
            sub_svalues of clusters where the base region has a symbolic
            binding.
    
    gcc/testsuite/ChangeLog:
            PR analyzer/105285
            * gcc.dg/analyzer/symbolic-12.c: New test.
    
    Signed-off-by: David Malcolm <dmalcolm@redhat.com>
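
The reasoning in the commit message above can be illustrated with a small
standalone model.  This is only a sketch under stated assumptions: ToyValue,
ToyCluster, provably_equal and contradictory_path_feasible are hypothetical
names invented for illustration and are not part of GCC; the real analyzer
works with svalue, binding_cluster and the constraint manager.  The sketch
contrasts returning an unknown value on every read with returning a stable
sub-value derived from the conjured value and the field being read.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Toy stand-in for an analyzer symbolic value (hypothetical; not GCC's svalue).
struct ToyValue
{
  bool unknown; // true: nothing can be concluded about this value
  int id;       // identity of a known symbolic value (unused when unknown)
};

// Two reads are provably equal only if both are known and share an identity.
static bool
provably_equal (const ToyValue &a, const ToyValue &b)
{
  return !a.unknown && !b.unknown && a.id == b.id;
}

// Toy stand-in for a cluster whose base region was filled by an external call.
class ToyCluster
{
public:
  explicit ToyCluster (int conjured_id) : m_conjured_id (conjured_id) {}

  // Pre-patch behaviour: every read of an unbound field is UNKNOWN.
  ToyValue read_unknown (const std::string &) const
  {
    return ToyValue {true, 0};
  }

  // Post-patch behaviour: derive the value from (conjured value, field), so
  // the same field always maps to the same symbolic value.
  ToyValue read_sub_value (const std::string &field)
  {
    auto key = std::make_pair (m_conjured_id, field);
    auto it = m_sub_values.find (key);
    if (it == m_sub_values.end ())
      it = m_sub_values.emplace (key, m_next_id++).first;
    return ToyValue {false, it->second};
  }

private:
  int m_conjured_id;
  int m_next_id = 1;
  std::map<std::pair<int, std::string>, int> m_sub_values;
};

// Models test_1a: is the path "first check false, second check true" feasible?
// It can be rejected only when both reads of r->size are provably equal.
static bool
contradictory_path_feasible (bool with_fix)
{
  ToyCluster cluster (/*conjured_id=*/42);
  ToyValue first = with_fix ? cluster.read_sub_value ("size")
                            : cluster.read_unknown ("size");
  ToyValue second = with_fix ? cluster.read_sub_value ("size")
                             : cluster.read_unknown ("size");
  return !provably_equal (first, second);
}
```

In this model, contradictory_path_feasible (false) returns true, mirroring the
false positive, while contradictory_path_feasible (true) returns false because
both reads of "size" resolve to the same symbolic value.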

Diff:
---
 gcc/analyzer/store.cc                       |  12 ++++
 gcc/testsuite/gcc.dg/analyzer/symbolic-12.c | 106 ++++++++++++++++++++++++++++
 2 files changed, 118 insertions(+)

diff --git a/gcc/analyzer/store.cc b/gcc/analyzer/store.cc
index 35f66a4b6fc..f5f8fe061f5 100644
--- a/gcc/analyzer/store.cc
+++ b/gcc/analyzer/store.cc
@@ -1519,6 +1519,18 @@ binding_cluster::get_any_binding (store_manager *mgr,
       = get_binding_recursive (mgr, reg))
     return direct_sval;
 
+  /* If we had a write to a cluster of unknown size, we might
+     have a self-binding of the whole base region with an svalue,
+     where the base region is symbolic.
+     Handle such cases by returning sub_svalue instances.  */
+  if (const svalue *cluster_sval = maybe_get_simple_value (mgr))
+    {
+      /* Extract child svalue from parent svalue.  */
+      region_model_manager *rmm_mgr = mgr->get_svalue_manager ();
+      return rmm_mgr->get_or_create_sub_svalue (reg->get_type (),
+						cluster_sval, reg);
+    }
+
   /* If this cluster has been touched by a symbolic write, then the content
      of any subregion not currently specifically bound is "UNKNOWN".  */
   if (m_touched)
diff --git a/gcc/testsuite/gcc.dg/analyzer/symbolic-12.c b/gcc/testsuite/gcc.dg/analyzer/symbolic-12.c
new file mode 100644
index 00000000000..d7c50de9f27
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/symbolic-12.c
@@ -0,0 +1,106 @@
+#include "analyzer-decls.h"
+
+void external_fn(void);
+
+struct st_1
+{
+  char *name;
+  unsigned size;
+};
+
+void test_1a (void *p, unsigned next_off)
+{
+  struct st_1 *r = p;
+
+  external_fn();
+
+  if (next_off >= r->size)
+    return;
+
+  if (next_off >= r->size)
+    /* We should have already returned if this is the case.  */
+    __analyzer_dump_path (); /* { dg-bogus "path" } */
+}
+
+void test_1b (void *p, unsigned next_off)
+{
+  struct st_1 *r = p;
+
+  if (next_off >= r->size)
+    return;
+
+  if (next_off >= r->size)
+    /* We should have already returned if this is the case.  */
+    __analyzer_dump_path (); /* { dg-bogus "path" } */
+}
+
+void test_1c (struct st_1 *r, unsigned next_off)
+{
+  if (next_off >= r->size)
+    return;
+
+  if (next_off >= r->size)
+    /* We should have already returned if this is the case.  */
+    __analyzer_dump_path (); /* { dg-bogus "path" } */
+}
+
+void test_1d (struct st_1 *r, unsigned next_off)
+{
+  external_fn();
+
+  if (next_off >= r->size)
+    return;
+
+  if (next_off >= r->size)
+    /* We should have already returned if this is the case.  */
+    __analyzer_dump_path (); /* { dg-bogus "path" } */
+}
+
+void test_1e (void *p, unsigned next_off)
+{
+  struct st_1 *r = p;
+
+  while (1)
+    {
+      external_fn();
+
+      if (next_off >= r->size)
+	return;
+
+      __analyzer_dump_path (); /* { dg-message "path" } */
+    }
+}
+
+struct st_2
+{
+  char *name;
+  unsigned arr[10];
+};
+
+void test_2a (void *p, unsigned next_off)
+{
+  struct st_2 *r = p;
+
+  external_fn();
+
+  if (next_off >= r->arr[5])
+    return;
+
+  if (next_off >= r->arr[5])
+    /* We should have already returned if this is the case.  */
+    __analyzer_dump_path (); /* { dg-bogus "path" } */
+}
+
+void test_2b (void *p, unsigned next_off, int idx)
+{
+  struct st_2 *r = p;
+
+  external_fn();
+
+  if (next_off >= r->arr[idx])
+    return;
+
+  if (next_off >= r->arr[idx])
+    /* We should have already returned if this is the case.  */
+    __analyzer_dump_path (); /* { dg-bogus "path" } */
+}

