public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Uninit warnings due to optimizing short-circuit conditionals
@ 2022-02-14 15:57 David Malcolm
  2022-02-14 16:26 ` Jeff Law
  2022-02-14 16:57 ` Mark Wielaard
  0 siblings, 2 replies; 11+ messages in thread
From: David Malcolm @ 2022-02-14 15:57 UTC (permalink / raw)
  To: gcc; +Cc: Mark Wielaard, David Malcolm

[CCing Mark in the hopes of insight from the valgrind side of things]

There is a false positive from -Wanalyzer-use-of-uninitialized-value on
gcc.dg/analyzer/pr102692.c here:

  ‘fix_overlays_before’: events 1-3
    |
    |   75 |   while (tail
    |      |          ~~~~
    |   76 |          && (tem = make_lisp_ptr (tail, 5),
    |      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    |      |          |
    |      |          (1) following ‘false’ branch (when ‘tail’ is NULL)...
    |   77 |              (end = marker_position (XOVERLAY (tem)->end)) >= pos))
    |      |              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    |......
    |   82 |   if (!tail || end < prev || !tail->next)
    |      |       ~~~~~    ~~~~~~~~~~
    |      |       |            |
    |      |       |            (3) use of uninitialized value ‘end’ here
    |      |       (2) ...to here
    |

The issue is that inner || of the conditionals have been folded within the
frontend from a chain of control flow:

   5   │   if (tail == 0B) goto <D.1986>; else goto <D.1988>;
   6   │   <D.1988>:
   7   │   if (end < prev) goto <D.1986>; else goto <D.1989>;
   8   │   <D.1989>:
   9   │   _1 = tail->next;
  10   │   if (_1 == 0B) goto <D.1986>; else goto <D.1987>;
  11   │   <D.1986>:

to an OR expr (and then to a bitwise-or by the gimplifier):

   5   │   _1 = tail == 0B;
   6   │   _2 = end < prev;
   7   │   _3 = _1 | _2;
   8   │   if (_3 != 0) goto <D.1986>; else goto <D.1988>;
   9   │   <D.1988>:
  10   │   _4 = tail->next;
  11   │   if (_4 == 0B) goto <D.1986>; else goto <D.1987>;

This happens for sufficiently simple conditionals in fold_truth_andor.
In particular, the (end < prev) is short-circuited without optimization,
but is evaluated with optimization, leading to the false positive.

Given how early this folding occurs, it seems the simplest fix is to
try to detect places where this optimization appears to have happened,
and suppress uninit warnings within the statement that would have
been short-circuited (and thus e.g. ignoring them when evaluating _2
above for the case where _1 is known to be true at the (_1 | _2) , and
thus _2 being redundant).

Attached is a patch that implements this.

There are some more details in the patch, but I'm wondering if this is a
known problem, and how e.g. valgrind copes with such code.  My patch
feels like something of a hack, but I'm not sure of any other way around
it given that the conditional is folded directly within the frontend.

I've successfully bootstrapped & regrtested the patch on
x86_64-pc-linux-gnu; it fixes the false positive.

Thoughts?  Thanks for reading
Dave

analyzer: fix uninit false +ve due to optimized conditionals [PR102692]

gcc/analyzer/ChangeLog:
	PR analyzer/102692
	* exploded-graph.h (impl_region_model_context::get_stmt): New.
	* region-model.cc: Include "gimple-ssa.h", "tree-phinodes.h",
	"tree-ssa-operands.h", and "ssa-iterators.h".
	(within_short_circuited_stmt_p): New.
	(region_model::check_for_poison): Don't warn about uninit values
	if within_short_circuited_stmt_p.
	* region-model.h (region_model_context::get_stmt): New vfunc.
	(noop_region_model_context::get_stmt): New.

gcc/testsuite/ChangeLog:
	PR analyzer/102692
	* gcc.dg/analyzer/pr102692-2.c: New test.
	* gcc.dg/analyzer/pr102692.c: Remove xfail.  Remove -O2 from
	options and move to...
	* gcc.dg/analyzer/torture/pr102692.c: ...here.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
---
 gcc/analyzer/exploded-graph.h                 |   2 +
 gcc/analyzer/region-model.cc                  | 112 ++++++++++++++++++
 gcc/analyzer/region-model.h                   |   5 +
 gcc/testsuite/gcc.dg/analyzer/pr102692-2.c    |  22 ++++
 .../gcc.dg/analyzer/{ => torture}/pr102692.c  |   4 +-
 5 files changed, 143 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr102692-2.c
 rename gcc/testsuite/gcc.dg/analyzer/{ => torture}/pr102692.c (94%)

diff --git a/gcc/analyzer/exploded-graph.h b/gcc/analyzer/exploded-graph.h
index 1854193c65b..1f52725dc98 100644
--- a/gcc/analyzer/exploded-graph.h
+++ b/gcc/analyzer/exploded-graph.h
@@ -90,6 +90,8 @@ class impl_region_model_context : public region_model_context
 		       const state_machine **out_sm,
 		       unsigned *out_sm_idx) FINAL OVERRIDE;
 
+  const gimple *get_stmt () const OVERRIDE { return m_stmt; }
+
   exploded_graph *m_eg;
   log_user m_logger;
   exploded_node *m_enode_for_diag;
diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index e659cf03a86..8a509c44178 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -68,6 +68,11 @@ along with GCC; see the file COPYING3.  If not see
 #include "stor-layout.h"
 #include "attribs.h"
 #include "tree-object-size.h"
+#include "gimple-ssa.h"
+#include "tree-phinodes.h"
+#include "tree-ssa-operands.h"
+#include "ssa-iterators.h"
+#include "gimple-pretty-print.h"
 
 #if ENABLE_ANALYZER
 
@@ -829,6 +834,108 @@ region_model::get_gassign_result (const gassign *assign,
     }
 }
 
+/* Workaround for discarding certain false positives from
+   -Wanalyzer-use-of-uninitialized-value
+   of the form:
+     ((A OR-IF B) OR-IF C)
+   and:
+     ((A AND-IF B) AND-IF C)
+   where evaluating B is redundant, but could involve simple accesses of
+   uninitialized locals.
+
+   When optimization is turned on the FE can immediately fold compound
+   conditionals.  Specifically, c_parser_condition parses this condition:
+     ((A OR-IF B) OR-IF C)
+   and calls c_fully_fold on the condition.
+   Within c_fully_fold, fold_truth_andor is called, which bails when
+   optimization is off, but if any optimization is turned on can convert the
+     ((A OR-IF B) OR-IF C)
+   into:
+     ((A OR B) OR_IF C)
+   for sufficiently simple B
+   i.e. the inner OR-IF becomes an OR.
+   At gimplification time the inner OR becomes BIT_IOR_EXPR (in gimplify_expr),
+   giving this for the inner condition:
+      tmp = A | B;
+      if (tmp)
+   thus effectively synthesizing a redundant access of B when optimization
+   is turned on, when compared to:
+      if (A) goto L1; else goto L4;
+  L1: if (B) goto L2; else goto L4;
+  L2: if (C) goto L3; else goto L4;
+   for the unoptimized case.
+
+   Return true if CTXT appears to be  handling such a short-circuitable stmt,
+   such as the def-stmt for B for the:
+      tmp = A | B;
+   case above, for the case where A is true and thus B would have been
+   short-circuited without optimization, using MODEL for the value of A.  */
+
+static bool
+within_short_circuited_stmt_p (const region_model *model,
+			       region_model_context *ctxt)
+{
+  gcc_assert (ctxt);
+  const gimple *curr_stmt = ctxt->get_stmt ();
+  if (curr_stmt == NULL)
+    return false;
+
+  /* We must have an assignment to a temporary of _Bool type.  */
+  const gassign *assign_stmt = dyn_cast <const gassign *> (curr_stmt);
+  if (!assign_stmt)
+    return false;
+  tree lhs = gimple_assign_lhs (assign_stmt);
+  if (TREE_TYPE (lhs) != boolean_type_node)
+    return false;
+  if (TREE_CODE (lhs) != SSA_NAME)
+    return false;
+  if (SSA_NAME_VAR (lhs) != NULL_TREE)
+    return false;
+
+  /* The temporary bool must be used exactly once: as the second arg of
+     a BIT_IOR_EXPR or BIT_AND_EXPR.  */
+  use_operand_p use_op;
+  gimple *use_stmt;
+  if (!single_imm_use (lhs, &use_op, &use_stmt))
+    return false;
+  const gassign *use_assign = dyn_cast <const gassign *> (use_stmt);
+  if (!use_assign)
+    return false;
+  enum tree_code op = gimple_assign_rhs_code (use_assign);
+  if (!(op == BIT_IOR_EXPR ||op == BIT_AND_EXPR))
+    return false;
+  if (!(gimple_assign_rhs1 (use_assign) != lhs
+	&& gimple_assign_rhs2 (use_assign) == lhs))
+    return false;
+
+  /* The first arg of the bitwise stmt must have a known value in MODEL
+     that implies that the value of the second arg doesn't matter, i.e.
+     1 for bitwise or, 0 for bitwise and.  */
+  tree other_arg = gimple_assign_rhs1 (use_assign);
+  /* Use a NULL ctxt here to avoid generating warnings.  */
+  const svalue *other_arg_sval = model->get_rvalue (other_arg, NULL);
+  tree other_arg_cst = other_arg_sval->maybe_get_constant ();
+  if (!other_arg_cst)
+    return false;
+  switch (op)
+    {
+    default:
+      gcc_unreachable ();
+    case BIT_IOR_EXPR:
+      if (zerop (other_arg_cst))
+	return false;
+      break;
+    case BIT_AND_EXPR:
+      if (!zerop (other_arg_cst))
+	return false;
+      break;
+    }
+
+  /* All tests passed.  We appear to be in a stmt that generates a boolean
+     temporary with a value that won't matter.  */
+  return true;
+}
+
 /* Check for SVAL being poisoned, adding a warning to CTXT.
    Return SVAL, or, if a warning is added, another value, to avoid
    repeatedly complaining about the same poisoned value in followup code.  */
@@ -852,6 +959,11 @@ region_model::check_for_poison (const svalue *sval,
 	  && is_empty_type (sval->get_type ()))
 	return sval;
 
+      /* Special case to avoid certain false positives.  */
+      if (pkind == POISON_KIND_UNINIT
+	  && within_short_circuited_stmt_p (this, ctxt))
+	  return sval;
+
       /* If we have an SSA name for a temporary, we don't want to print
 	 '<unknown>'.
 	 Poisoned values are shared by type, and so we can't reconstruct
diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h
index 46cf37e6b26..c2c89a20d50 100644
--- a/gcc/analyzer/region-model.h
+++ b/gcc/analyzer/region-model.h
@@ -930,6 +930,9 @@ class region_model_context
   virtual bool get_taint_map (sm_state_map **out_smap,
 			      const state_machine **out_sm,
 			      unsigned *out_sm_idx) = 0;
+
+  /* Get the current statement, if any.  */
+  virtual const gimple *get_stmt () const = 0;
 };
 
 /* A "do nothing" subclass of region_model_context.  */
@@ -980,6 +983,8 @@ public:
   {
     return false;
   }
+
+  const gimple *get_stmt () const OVERRIDE { return NULL; }
 };
 
 /* A subclass of region_model_context for determining if operations fail
diff --git a/gcc/testsuite/gcc.dg/analyzer/pr102692-2.c b/gcc/testsuite/gcc.dg/analyzer/pr102692-2.c
new file mode 100644
index 00000000000..c72fde21744
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/pr102692-2.c
@@ -0,0 +1,22 @@
+/* { dg-additional-options "-O1" } */
+
+struct Lisp_Overlay
+{
+  struct Lisp_Overlay *next;
+};
+
+void
+test_1 (struct Lisp_Overlay *tail, long prev)
+{
+  long end;
+  if (!tail || end < prev || !tail->next) /* { dg-warning "use of uninitialized value 'end'" } */
+    return;
+}
+
+void
+test_2 (struct Lisp_Overlay *tail, long prev)
+{
+  long end;
+  if (tail && end < prev && !tail->next) /* { dg-warning "use of uninitialized value 'end'" } */
+    return;
+}
diff --git a/gcc/testsuite/gcc.dg/analyzer/pr102692.c b/gcc/testsuite/gcc.dg/analyzer/torture/pr102692.c
similarity index 94%
rename from gcc/testsuite/gcc.dg/analyzer/pr102692.c
rename to gcc/testsuite/gcc.dg/analyzer/torture/pr102692.c
index c8993c82980..a6c6bc47896 100644
--- a/gcc/testsuite/gcc.dg/analyzer/pr102692.c
+++ b/gcc/testsuite/gcc.dg/analyzer/torture/pr102692.c
@@ -1,4 +1,4 @@
-/* { dg-additional-options "-O2 -Wno-analyzer-too-complex" } */
+/* { dg-additional-options "-Wno-analyzer-too-complex" } */
 /* TODO: remove the need for -Wno-analyzer-too-complex.  */
 
 struct lisp;
@@ -73,7 +73,7 @@ fix_overlays_before (struct buffer *bp, long prev, long pos)
       parent = tail;
       tail = tail->next;
     }
-  if (!tail || end < prev || !tail->next) /* { dg-bogus "use of uninitialized value 'end'" "uninit" { xfail *-*-* } } */
+  if (!tail || end < prev || !tail->next) /* { dg-bogus "use of uninitialized value 'end'" "uninit" } */
     /* { dg-bogus "dereference of NULL 'tail'" "null deref" { target *-*-* } .-1 } */
     return;
   right_pair = parent;
-- 
2.26.3


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Uninit warnings due to optimizing short-circuit conditionals
  2022-02-14 15:57 Uninit warnings due to optimizing short-circuit conditionals David Malcolm
@ 2022-02-14 16:26 ` Jeff Law
  2022-02-14 17:10   ` David Malcolm
  2022-02-14 16:57 ` Mark Wielaard
  1 sibling, 1 reply; 11+ messages in thread
From: Jeff Law @ 2022-02-14 16:26 UTC (permalink / raw)
  To: David Malcolm, gcc; +Cc: Mark Wielaard



On 2/14/2022 8:57 AM, David Malcolm via Gcc wrote:
> [CCing Mark in the hopes of insight from the valgrind side of things]
>
> There is a false positive from -Wanalyzer-use-of-uninitialized-value on
> gcc.dg/analyzer/pr102692.c here:
>
>    ‘fix_overlays_before’: events 1-3
>      |
>      |   75 |   while (tail
>      |      |          ~~~~
>      |   76 |          && (tem = make_lisp_ptr (tail, 5),
>      |      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>      |      |          |
>      |      |          (1) following ‘false’ branch (when ‘tail’ is NULL)...
>      |   77 |              (end = marker_position (XOVERLAY (tem)->end)) >= pos))
>      |      |              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>      |......
>      |   82 |   if (!tail || end < prev || !tail->next)
>      |      |       ~~~~~    ~~~~~~~~~~
>      |      |       |            |
>      |      |       |            (3) use of uninitialized value ‘end’ here
>      |      |       (2) ...to here
>      |
>
> The issue is that inner || of the conditionals have been folded within the
> frontend from a chain of control flow:
>
>     5   │   if (tail == 0B) goto <D.1986>; else goto <D.1988>;
>     6   │   <D.1988>:
>     7   │   if (end < prev) goto <D.1986>; else goto <D.1989>;
>     8   │   <D.1989>:
>     9   │   _1 = tail->next;
>    10   │   if (_1 == 0B) goto <D.1986>; else goto <D.1987>;
>    11   │   <D.1986>:
>
> to an OR expr (and then to a bitwise-or by the gimplifier):
>
>     5   │   _1 = tail == 0B;
>     6   │   _2 = end < prev;
>     7   │   _3 = _1 | _2;
>     8   │   if (_3 != 0) goto <D.1986>; else goto <D.1988>;
>     9   │   <D.1988>:
>    10   │   _4 = tail->next;
>    11   │   if (_4 == 0B) goto <D.1986>; else goto <D.1987>;
>
> This happens for sufficiently simple conditionals in fold_truth_andor.
> In particular, the (end < prev) is short-circuited without optimization,
> but is evaluated with optimization, leading to the false positive.
>
> Given how early this folding occurs, it seems the simplest fix is to
> try to detect places where this optimization appears to have happened,
> and suppress uninit warnings within the statement that would have
> been short-circuited (and thus e.g. ignoring them when evaluating _2
> above for the case where _1 is known to be true at the (_1 | _2) , and
> thus _2 being redundant).
>
> Attached is a patch that implements this.
>
> There are some more details in the patch, but I'm wondering if this is a
> known problem, and how e.g. valgrind copes with such code.  My patch
> feels like something of a hack, but I'm not sure of any other way around
> it given that the conditional is folded directly within the frontend.
Presumably when "tail ==0", "end" is initialized somewhere?  If so, yes, 
this is a known issue.  There's a BZ about it somewhere (I don' t have 
the # handy, but it's probably on the Wuninitialized tracker).


Jeff

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Uninit warnings due to optimizing short-circuit conditionals
  2022-02-14 15:57 Uninit warnings due to optimizing short-circuit conditionals David Malcolm
  2022-02-14 16:26 ` Jeff Law
@ 2022-02-14 16:57 ` Mark Wielaard
  2022-02-14 17:20   ` David Malcolm
  1 sibling, 1 reply; 11+ messages in thread
From: Mark Wielaard @ 2022-02-14 16:57 UTC (permalink / raw)
  To: David Malcolm, gcc; +Cc: Julian Seward

Hi David,

On Mon, 2022-02-14 at 10:57 -0500, David Malcolm wrote:
> [CCing Mark in the hopes of insight from the valgrind side of things]

Adding Julian to CC so he can correct me if I say something silly.

> There is a false positive from -Wanalyzer-use-of-uninitialized-value on
> gcc.dg/analyzer/pr102692.c here:
> 
>   ‘fix_overlays_before’: events 1-3
>     |
>     |   75 |   while (tail
>     |      |          ~~~~
>     |   76 |          && (tem = make_lisp_ptr (tail, 5),
>     |      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>     |      |          |
>     |      |          (1) following ‘false’ branch (when ‘tail’ is NULL)...
>     |   77 |              (end = marker_position (XOVERLAY (tem)->end)) >= pos))
>     |      |              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>     |......
>     |   82 |   if (!tail || end < prev || !tail->next)
>     |      |       ~~~~~    ~~~~~~~~~~
>     |      |       |            |
>     |      |       |            (3) use of uninitialized value ‘end’ here
>     |      |       (2) ...to here
>     |
> 
> The issue is that inner || of the conditionals have been folded within the
> frontend from a chain of control flow:
> 
>    5   │   if (tail == 0B) goto <D.1986>; else goto <D.1988>;
>    6   │   <D.1988>:
>    7   │   if (end < prev) goto <D.1986>; else goto <D.1989>;
>    8   │   <D.1989>:
>    9   │   _1 = tail->next;
>   10   │   if (_1 == 0B) goto <D.1986>; else goto <D.1987>;
>   11   │   <D.1986>:
> 
> to an OR expr (and then to a bitwise-or by the gimplifier):
> 
>    5   │   _1 = tail == 0B;
>    6   │   _2 = end < prev;
>    7   │   _3 = _1 | _2;
>    8   │   if (_3 != 0) goto <D.1986>; else goto <D.1988>;
>    9   │   <D.1988>:
>   10   │   _4 = tail->next;
>   11   │   if (_4 == 0B) goto <D.1986>; else goto <D.1987>;
> 
> This happens for sufficiently simple conditionals in fold_truth_andor.
> In particular, the (end < prev) is short-circuited without optimization,
> but is evaluated with optimization, leading to the false positive.
> 
> Given how early this folding occurs, it seems the simplest fix is to
> try to detect places where this optimization appears to have happened,
> and suppress uninit warnings within the statement that would have
> been short-circuited (and thus e.g. ignoring them when evaluating _2
> above for the case where _1 is known to be true at the (_1 | _2) , and
> thus _2 being redundant).
> 
> Attached is a patch that implements this.
> 
> There are some more details in the patch, but I'm wondering if this is a
> known problem, and how e.g. valgrind copes with such code.  My patch
> feels like something of a hack, but I'm not sure of any other way around
> it given that the conditional is folded directly within the frontend.

As far as I know this is what valgrind memcheck also does with an
bitwise or. It knows that _3 is defined and true if either _1 or _2 is
defined and true. Or more generically that the result bits of a bitwise
or are defined for those bits that are both defined or where one is
defined and has the value 1.

Cheers,

Mark

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Uninit warnings due to optimizing short-circuit conditionals
  2022-02-14 16:26 ` Jeff Law
@ 2022-02-14 17:10   ` David Malcolm
  0 siblings, 0 replies; 11+ messages in thread
From: David Malcolm @ 2022-02-14 17:10 UTC (permalink / raw)
  To: Jeff Law, gcc; +Cc: Mark Wielaard

On Mon, 2022-02-14 at 09:26 -0700, Jeff Law wrote:
> 
> 
> On 2/14/2022 8:57 AM, David Malcolm via Gcc wrote:
> > [CCing Mark in the hopes of insight from the valgrind side of things]
> > 
> > There is a false positive from -Wanalyzer-use-of-uninitialized-value
> > on
> > gcc.dg/analyzer/pr102692.c here:
> > 
> >    ‘fix_overlays_before’: events 1-3
> >      |
> >      |   75 |   while (tail
> >      |      |          ~~~~
> >      |   76 |          && (tem = make_lisp_ptr (tail, 5),
> >      |      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >      |      |          |
> >      |      |          (1) following ‘false’ branch (when ‘tail’ is
> > NULL)...
> >      |   77 |              (end = marker_position (XOVERLAY (tem)-
> > >end)) >= pos))
> >      |      |             
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >      |......
> >      |   82 |   if (!tail || end < prev || !tail->next)
> >      |      |       ~~~~~    ~~~~~~~~~~
> >      |      |       |            |
> >      |      |       |            (3) use of uninitialized value ‘end’
> > here
> >      |      |       (2) ...to here
> >      |
> > 
> > The issue is that inner || of the conditionals have been folded
> > within the
> > frontend from a chain of control flow:
> > 
> >     5   │   if (tail == 0B) goto <D.1986>; else goto <D.1988>;
> >     6   │   <D.1988>:
> >     7   │   if (end < prev) goto <D.1986>; else goto <D.1989>;
> >     8   │   <D.1989>:
> >     9   │   _1 = tail->next;
> >    10   │   if (_1 == 0B) goto <D.1986>; else goto <D.1987>;
> >    11   │   <D.1986>:
> > 
> > to an OR expr (and then to a bitwise-or by the gimplifier):
> > 
> >     5   │   _1 = tail == 0B;
> >     6   │   _2 = end < prev;
> >     7   │   _3 = _1 | _2;
> >     8   │   if (_3 != 0) goto <D.1986>; else goto <D.1988>;
> >     9   │   <D.1988>:
> >    10   │   _4 = tail->next;
> >    11   │   if (_4 == 0B) goto <D.1986>; else goto <D.1987>;
> > 
> > This happens for sufficiently simple conditionals in
> > fold_truth_andor.
> > In particular, the (end < prev) is short-circuited without
> > optimization,
> > but is evaluated with optimization, leading to the false positive.
> > 
> > Given how early this folding occurs, it seems the simplest fix is to
> > try to detect places where this optimization appears to have
> > happened,
> > and suppress uninit warnings within the statement that would have
> > been short-circuited (and thus e.g. ignoring them when evaluating _2
> > above for the case where _1 is known to be true at the (_1 | _2) ,
> > and
> > thus _2 being redundant).
> > 
> > Attached is a patch that implements this.
> > 
> > There are some more details in the patch, but I'm wondering if this
> > is a
> > known problem, and how e.g. valgrind copes with such code.  My patch
> > feels like something of a hack, but I'm not sure of any other way
> > around
> > it given that the conditional is folded directly within the frontend.
> Presumably when "tail ==0", "end" is initialized somewhere?  

I don't think it does, but it shouldn't be a problem in the code as
written, due to short-circuiting (as I understand things) - but the
short-circuiting is being removed by the optimizer.

"end" only gets initialized at line 71 of the source below, for the
case where the initial value of "tail" is non-NULL:

  63   │ void
  64   │ fix_overlays_before (struct buffer *bp, long prev, long pos)
  65   │ {
  66   │   struct Lisp_Overlay *tail = bp->overlays_before, *parent = 0, *right_pair;
  67   │   struct lisp *tem;
  68   │   long end;
  69   │   while (tail
  70   │      && (tem = make_lisp_ptr (tail, 5),
  71   │          (end = marker_position (XOVERLAY (tem)->end)) >= pos))
  72   │     {
  73   │       parent = tail;
  74   │       tail = tail->next;
  75   │     }
  76   │   if (!tail || end < prev || !tail->next) /* { dg-bogus "use of uninitialized value 'end'" "uninit" } */
  77   │     /* { dg-bogus "dereference of NULL 'tail'" "null deref" { target *-*-* } .-1 } */
  78   │     return;

At -O2 we have this at pr102692.c.022t.ssa:

  59   │   <bb 2> :
  60   │   tail_23 = bp_22(D)->overlays_before;
  61   │   parent_24 = 0B;
  62   │   goto <bb 4>; [INV]
  63   │ 
  64   │   <bb 3> :
  65   │   parent_32 = tail_11;
  66   │   tail_33 = tail_11->next;
  67   │ 
  68   │   <bb 4> :
  69   │   # tail_11 = PHI <tail_23(2), tail_33(3)>
  70   │   # parent_13 = PHI <parent_24(2), parent_32(3)>
  71   │   # end_15 = PHI <end_25(D)(2), end_30(3)>
  72   │   if (tail_11 != 0B)
  73   │     goto <bb 5>; [INV]
  74   │   else
  75   │     goto <bb 6>; [INV]
  76   │ 
  77   │   <bb 5> :
  78   │   tem_27 = make_lisp_ptr (tail_11, 5);
  79   │   _1 = XOVERLAY (tem_27);
  80   │   _2 = _1->end;
  81   │   end_30 = marker_position (_2);
  82   │   if (end_30 >= pos_31(D))
  83   │     goto <bb 3>; [INV]
  84   │   else
  85   │     goto <bb 6>; [INV]
  86   │ 
  87   │   <bb 6> :
  88   │   # end_16 = PHI <end_15(4), end_30(5)>
  89   │   _3 = tail_11 == 0B;
  90   │   _4 = end_16 < prev_34(D);
  91   │   _5 = _3 | _4;
  92   │   if (_5 != 0)
  93   │     goto <bb 8>; [INV]
  94   │   else
  95   │     goto <bb 7>; [INV]
  96   │ 
  97   │   <bb 7> :
  98   │   _6 = tail_11->next;
  99   │   if (_6 == 0B)
 100   │     goto <bb 8>; [INV]
 101   │   else
 102   │     goto <bb 9>; [INV]
 103   │ 


This line of the source:
  76   │   if (!tail || end < prev || !tail->next) 
has become BBs 6 and 7.

Consider the path:
  ENTRY -> BB 2 -> BB 4 -> BB 6 (initial tail being NULL)

If initially tail_23 is NULL, then:
* at BB 2 -> BB 4 we have tail_11 = tail_23 and end_15 = end_25(D)
* at BB 4 -> BB 6 we have end_16 = end_15 = end_25(D)

and so we have:

  89   │   _3 = tail_11 == 0B;

with tail_11 being NULL, and thus we know _3 is true on this path, and
then:

  90   │   _4 = end_16 < prev_34(D);

where end_16 is end_25(D), the "initial" value of "end" - but "end"
isn't initialized.

So we have: _4 = UNINITIALIZED < VALID;  (the 2nd arg, prev is a param)
which is an evaluation using an uninitialized value.

However _4 only gets used by:

  91   │   _5 = _3 | _4;

all boolean, with _3 known to be true.

Assuming I'm reading all this correctly, this seems like overzealous
folding to me, and my special-casing of it - the folding turns a
TRUTH_OR_EXPR into a TRUTH_OR, given that the arguments
are simple_operand_p_2, which seems to apply for the simplest lookups
of local variables, without side-effects.

tree.def says:

  /* ANDIF and ORIF allow the second operand not to be computed if the
     value of the expression is determined from the first operand.  AND,
     OR, and XOR always compute the second operand whether its value is
     needed or not (for side effects).  The operand may have

and presumably the "allow the 2nd operand not to be computed" is not
the same as "require the 2nd operand not to be computed".

Does the above reasoning (and my special-case workaround) sound correct
to you?

Thanks!
Dave



> If so, yes, 
> this is a known issue.  There's a BZ about it somewhere (I don' t have 
> the # handy, but it's probably on the Wuninitialized tracker).
> 
> 
> Jeff
> 



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Uninit warnings due to optimizing short-circuit conditionals
  2022-02-14 16:57 ` Mark Wielaard
@ 2022-02-14 17:20   ` David Malcolm
  2022-02-14 17:37     ` Mark Wielaard
  0 siblings, 1 reply; 11+ messages in thread
From: David Malcolm @ 2022-02-14 17:20 UTC (permalink / raw)
  To: Mark Wielaard, gcc; +Cc: Julian Seward

On Mon, 2022-02-14 at 17:57 +0100, Mark Wielaard wrote:
> Hi David,
> 
> On Mon, 2022-02-14 at 10:57 -0500, David Malcolm wrote:
> > [CCing Mark in the hopes of insight from the valgrind side of
> > things]
> 
> Adding Julian to CC so he can correct me if I say something silly.
> 
> > There is a false positive from -Wanalyzer-use-of-uninitialized-
> > value on
> > gcc.dg/analyzer/pr102692.c here:
> > 
> >   ‘fix_overlays_before’: events 1-3
> >     |
> >     |   75 |   while (tail
> >     |      |          ~~~~
> >     |   76 |          && (tem = make_lisp_ptr (tail, 5),
> >     |      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >     |      |          |
> >     |      |          (1) following ‘false’ branch (when ‘tail’ is
> > NULL)...
> >     |   77 |              (end = marker_position (XOVERLAY (tem)-
> > >end)) >= pos))
> >     |      |             
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >     |......
> >     |   82 |   if (!tail || end < prev || !tail->next)
> >     |      |       ~~~~~    ~~~~~~~~~~
> >     |      |       |            |
> >     |      |       |            (3) use of uninitialized value
> > ‘end’ here
> >     |      |       (2) ...to here
> >     |
> > 
> > The issue is that inner || of the conditionals have been folded
> > within the
> > frontend from a chain of control flow:
> > 
> >    5   │   if (tail == 0B) goto <D.1986>; else goto <D.1988>;
> >    6   │   <D.1988>:
> >    7   │   if (end < prev) goto <D.1986>; else goto <D.1989>;
> >    8   │   <D.1989>:
> >    9   │   _1 = tail->next;
> >   10   │   if (_1 == 0B) goto <D.1986>; else goto <D.1987>;
> >   11   │   <D.1986>:
> > 
> > to an OR expr (and then to a bitwise-or by the gimplifier):
> > 
> >    5   │   _1 = tail == 0B;
> >    6   │   _2 = end < prev;
> >    7   │   _3 = _1 | _2;
> >    8   │   if (_3 != 0) goto <D.1986>; else goto <D.1988>;
> >    9   │   <D.1988>:
> >   10   │   _4 = tail->next;
> >   11   │   if (_4 == 0B) goto <D.1986>; else goto <D.1987>;
> > 
> > This happens for sufficiently simple conditionals in
> > fold_truth_andor.
> > In particular, the (end < prev) is short-circuited without
> > optimization,
> > but is evaluated with optimization, leading to the false positive.
> > 
> > Given how early this folding occurs, it seems the simplest fix is
> > to
> > try to detect places where this optimization appears to have
> > happened,
> > and suppress uninit warnings within the statement that would have
> > been short-circuited (and thus e.g. ignoring them when evaluating
> > _2
> > above for the case where _1 is known to be true at the (_1 | _2) ,
> > and
> > thus _2 being redundant).
> > 
> > Attached is a patch that implements this.
> > 
> > There are some more details in the patch, but I'm wondering if this
> > is a
> > known problem, and how e.g. valgrind copes with such code.  My
> > patch
> > feels like something of a hack, but I'm not sure of any other way
> > around
> > it given that the conditional is folded directly within the
> > frontend.
> 
> As far as I know this is what valgrind memcheck also does with an
> bitwise or. It knows that _3 is defined and true if either _1 or _2
> is
> defined and true. Or more generically that the result bits of a
> bitwise
> or are defined for those bits that are both defined or where one is
> defined and has the value 1.

Aha - thanks.  I think the distinction here is that:

* GCC's -fanalyzer complains about uninitialized values immediately
when it sees one being fetched for use in any expression (replacing the
value with a safe one to avoid further complaints), without considering
how they are going to be used in the expression, whereas

* it sounds like valgrind keeps track of uninitialized bits, propagates
the "uninit-ness" of the bits, and complains at certain times when
uninitialized bits are used in certain ways.

Dave




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Uninit warnings due to optimizing short-circuit conditionals
  2022-02-14 17:20   ` David Malcolm
@ 2022-02-14 17:37     ` Mark Wielaard
  2022-02-15  7:25       ` Richard Biener
  0 siblings, 1 reply; 11+ messages in thread
From: Mark Wielaard @ 2022-02-14 17:37 UTC (permalink / raw)
  To: David Malcolm, gcc; +Cc: Julian Seward

On Mon, 2022-02-14 at 12:20 -0500, David Malcolm wrote:
> On Mon, 2022-02-14 at 17:57 +0100, Mark Wielaard wrote:
> > On Mon, 2022-02-14 at 10:57 -0500, David Malcolm wrote:
> > > [CCing Mark in the hopes of insight from the valgrind side of
> > > things]
> > 
> > Adding Julian to CC so he can correct me if I say something silly.
> > 
> > > There is a false positive from -Wanalyzer-use-of-uninitialized-
> > > value on
> > > gcc.dg/analyzer/pr102692.c here:
> > > 
> > >   ‘fix_overlays_before’: events 1-3
> > >     |
> > >     |   75 |   while (tail
> > >     |      |          ~~~~
> > >     |   76 |          && (tem = make_lisp_ptr (tail, 5),
> > >     |      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > >     |      |          |
> > >     |      |          (1) following ‘false’ branch (when ‘tail’ is
> > > NULL)...
> > >     |   77 |              (end = marker_position (XOVERLAY (tem)-
> > > > end)) >= pos))
> > > 
> > >     |      |             
> > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > >     |......
> > >     |   82 |   if (!tail || end < prev || !tail->next)
> > >     |      |       ~~~~~    ~~~~~~~~~~
> > >     |      |       |            |
> > >     |      |       |            (3) use of uninitialized value
> > > ‘end’ here
> > >     |      |       (2) ...to here
> > >     |
> > > 
> > > The issue is that inner || of the conditionals have been folded
> > > within the
> > > frontend from a chain of control flow:
> > > 
> > >    5   │   if (tail == 0B) goto <D.1986>; else goto <D.1988>;
> > >    6   │   <D.1988>:
> > >    7   │   if (end < prev) goto <D.1986>; else goto <D.1989>;
> > >    8   │   <D.1989>:
> > >    9   │   _1 = tail->next;
> > >   10   │   if (_1 == 0B) goto <D.1986>; else goto <D.1987>;
> > >   11   │   <D.1986>:
> > > 
> > > to an OR expr (and then to a bitwise-or by the gimplifier):
> > > 
> > >    5   │   _1 = tail == 0B;
> > >    6   │   _2 = end < prev;
> > >    7   │   _3 = _1 | _2;
> > >    8   │   if (_3 != 0) goto <D.1986>; else goto <D.1988>;
> > >    9   │   <D.1988>:
> > >   10   │   _4 = tail->next;
> > >   11   │   if (_4 == 0B) goto <D.1986>; else goto <D.1987>;
> > > 
> > > This happens for sufficiently simple conditionals in
> > > fold_truth_andor.
> > > In particular, the (end < prev) is short-circuited without
> > > optimization,
> > > but is evaluated with optimization, leading to the false positive.
> > > 
> > > Given how early this folding occurs, it seems the simplest fix is
> > > to
> > > try to detect places where this optimization appears to have
> > > happened,
> > > and suppress uninit warnings within the statement that would have
> > > been short-circuited (and thus e.g. ignoring them when evaluating
> > > _2
> > > above for the case where _1 is known to be true at the (_1 | _2) ,
> > > and
> > > thus _2 being redundant).
> > > 
> > > Attached is a patch that implements this.
> > > 
> > > There are some more details in the patch, but I'm wondering if this
> > > is a
> > > known problem, and how e.g. valgrind copes with such code.  My
> > > patch
> > > feels like something of a hack, but I'm not sure of any other way
> > > around
> > > it given that the conditional is folded directly within the
> > > frontend.
> > 
> > As far as I know this is what valgrind memcheck also does with an
> > bitwise or. It knows that _3 is defined and true if either _1 or _2
> > is
> > defined and true. Or more generically that the result bits of a
> > bitwise
> > or are defined for those bits that are both defined or where one is
> > defined and has the value 1.
> 
> Aha - thanks.  I think the distinction here is that:
> 
> * GCC's -fanalyzer complains about uninitialized values immediately
> when it sees one being fetched for use in any expression (replacing the
> value with a safe one to avoid further complaints), without considering
> how they are going to be used in the expression, whereas
> 
> * it sounds like valgrind keeps track of uninitialized bits, propagates
> the "uninit-ness" of the bits, and complains at certain times when
> uninitialized bits are used in certain ways.

Yes. valgrind keeps track of uninitialized bits and propagates them
around till "use". Where use is anything that might alter the
observable behavior of the program. Which is control flow transfers,
conditional moves, addresses used in memory accesses, and data passed
to system calls.

This paper describes some of the memcheck tricks:
https://valgrind.org/docs/memcheck2005.pdf

Cheers,

Mark

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Uninit warnings due to optimizing short-circuit conditionals
  2022-02-14 17:37     ` Mark Wielaard
@ 2022-02-15  7:25       ` Richard Biener
  2022-02-15 12:29         ` Mark Wielaard
  0 siblings, 1 reply; 11+ messages in thread
From: Richard Biener @ 2022-02-15  7:25 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: David Malcolm, GCC Development, Julian Seward

On Mon, Feb 14, 2022 at 6:38 PM Mark Wielaard <mark@klomp.org> wrote:
>
> On Mon, 2022-02-14 at 12:20 -0500, David Malcolm wrote:
> > On Mon, 2022-02-14 at 17:57 +0100, Mark Wielaard wrote:
> > > On Mon, 2022-02-14 at 10:57 -0500, David Malcolm wrote:
> > > > [CCing Mark in the hopes of insight from the valgrind side of
> > > > things]
> > >
> > > Adding Julian to CC so he can correct me if I say something silly.
> > >
> > > > There is a false positive from -Wanalyzer-use-of-uninitialized-
> > > > value on
> > > > gcc.dg/analyzer/pr102692.c here:
> > > >
> > > >   ‘fix_overlays_before’: events 1-3
> > > >     |
> > > >     |   75 |   while (tail
> > > >     |      |          ~~~~
> > > >     |   76 |          && (tem = make_lisp_ptr (tail, 5),
> > > >     |      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > >     |      |          |
> > > >     |      |          (1) following ‘false’ branch (when ‘tail’ is
> > > > NULL)...
> > > >     |   77 |              (end = marker_position (XOVERLAY (tem)-
> > > > > end)) >= pos))
> > > >
> > > >     |      |
> > > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > >     |......
> > > >     |   82 |   if (!tail || end < prev || !tail->next)
> > > >     |      |       ~~~~~    ~~~~~~~~~~
> > > >     |      |       |            |
> > > >     |      |       |            (3) use of uninitialized value
> > > > ‘end’ here
> > > >     |      |       (2) ...to here
> > > >     |
> > > >
> > > > The issue is that inner || of the conditionals have been folded
> > > > within the
> > > > frontend from a chain of control flow:
> > > >
> > > >    5   │   if (tail == 0B) goto <D.1986>; else goto <D.1988>;
> > > >    6   │   <D.1988>:
> > > >    7   │   if (end < prev) goto <D.1986>; else goto <D.1989>;
> > > >    8   │   <D.1989>:
> > > >    9   │   _1 = tail->next;
> > > >   10   │   if (_1 == 0B) goto <D.1986>; else goto <D.1987>;
> > > >   11   │   <D.1986>:
> > > >
> > > > to an OR expr (and then to a bitwise-or by the gimplifier):
> > > >
> > > >    5   │   _1 = tail == 0B;
> > > >    6   │   _2 = end < prev;
> > > >    7   │   _3 = _1 | _2;
> > > >    8   │   if (_3 != 0) goto <D.1986>; else goto <D.1988>;
> > > >    9   │   <D.1988>:
> > > >   10   │   _4 = tail->next;
> > > >   11   │   if (_4 == 0B) goto <D.1986>; else goto <D.1987>;
> > > >
> > > > This happens for sufficiently simple conditionals in
> > > > fold_truth_andor.
> > > > In particular, the (end < prev) is short-circuited without
> > > > optimization,
> > > > but is evaluated with optimization, leading to the false positive.
> > > >
> > > > Given how early this folding occurs, it seems the simplest fix is
> > > > to
> > > > try to detect places where this optimization appears to have
> > > > happened,
> > > > and suppress uninit warnings within the statement that would have
> > > > been short-circuited (and thus e.g. ignoring them when evaluating
> > > > _2
> > > > above for the case where _1 is known to be true at the (_1 | _2) ,
> > > > and
> > > > thus _2 being redundant).
> > > >
> > > > Attached is a patch that implements this.
> > > >
> > > > There are some more details in the patch, but I'm wondering if this
> > > > is a
> > > > known problem, and how e.g. valgrind copes with such code.  My
> > > > patch
> > > > feels like something of a hack, but I'm not sure of any other way
> > > > around
> > > > it given that the conditional is folded directly within the
> > > > frontend.
> > >
> > > As far as I know this is what valgrind memcheck also does with an
> > > bitwise or. It knows that _3 is defined and true if either _1 or _2
> > > is
> > > defined and true. Or more generically that the result bits of a
> > > bitwise
> > > or are defined for those bits that are both defined or where one is
> > > defined and has the value 1.
> >
> > Aha - thanks.  I think the distinction here is that:
> >
> > * GCC's -fanalyzer complains about uninitialized values immediately
> > when it sees one being fetched for use in any expression (replacing the
> > value with a safe one to avoid further complaints), without considering
> > how they are going to be used in the expression, whereas
> >
> > * it sounds like valgrind keeps track of uninitialized bits, propagates
> > the "uninit-ness" of the bits, and complains at certain times when
> > uninitialized bits are used in certain ways.
>
> Yes. valgrind keeps track of uninitialized bits and propagates them
> around till "use". Where use is anything that might alter the
> observable behavior of the program. Which is control flow transfers,
> conditional moves, addresses used in memory accesses, and data passed
> to system calls.
>
> This paper describes some of the memcheck tricks:
> https://valgrind.org/docs/memcheck2005.pdf

That probably means bugs like https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63311
could be resolved as fixed (in valgrind).

But indeed GCCs own uninit diagnostics fall foul of this as well.  I suppose
one could track uses and if they are used in bitwise operations could
demote them to "maybe" even though they appear unconditionally.  Now,
we cannot distinguish

  if (a || b)

from

  if (b || a)

after combining the ifs of course.

Another possibility is to try to compute a kind of must-init analysis before
removing short-cutting transforms which practically means removing the
transform from GENERIC folding.

Richard.

> Cheers,
>
> Mark

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Uninit warnings due to optimizing short-circuit conditionals
  2022-02-15  7:25       ` Richard Biener
@ 2022-02-15 12:29         ` Mark Wielaard
  2022-02-15 13:00           ` Julian Seward
  0 siblings, 1 reply; 11+ messages in thread
From: Mark Wielaard @ 2022-02-15 12:29 UTC (permalink / raw)
  To: Richard Biener; +Cc: David Malcolm, GCC Development, Julian Seward

Hi Richard,

On Tue, 2022-02-15 at 08:25 +0100, Richard Biener wrote:
> On Mon, Feb 14, 2022 at 6:38 PM Mark Wielaard <mark@klomp.org> wrote:
> > Yes. valgrind keeps track of uninitialized bits and propagates them
> > around till "use". Where use is anything that might alter the
> > observable behavior of the program. Which is control flow
> > transfers, conditional moves, addresses used in memory accesses,
> > and data passed to system calls.
> > 
> > This paper describes some of the memcheck tricks:
> > https://valgrind.org/docs/memcheck2005.pdf
> 
> That probably means bugs like 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63311
> could be resolved as fixed (in valgrind).

I just tried the testcase from that bug and it still replicates with
gcc 11.2.1 and valgrind 3.18.1. And as far as I can see it really
cannot be fixed in valgrind since gcc really generates a conditional
jump based on an uninit variable in this case.

It does look a bit like what Julian described in:

  Memcheck Reloaded
  dealing with compiler-generated branches on undefined values
https://archive.fosdem.org/2020/schedule/event/debugging_memcheck_reloaded/

Which should be able to recover/reconstruct the original control flow.
In cases like:

int result
bool ok = compute_something(&result)
if (ok && result == 42) { ... }

where gcc turns that last line upside down:

if (result == 42 && ok) { ... }

But it doesn't work in this case. Probably because this is a slightly
more complex case involving 3 distinct variables instead of 2.

Cheers,

Mark

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Uninit warnings due to optimizing short-circuit conditionals
  2022-02-15 12:29         ` Mark Wielaard
@ 2022-02-15 13:00           ` Julian Seward
  2022-02-15 13:28             ` Richard Biener
  0 siblings, 1 reply; 11+ messages in thread
From: Julian Seward @ 2022-02-15 13:00 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: Richard Biener, David Malcolm, GCC Development

Sorry for the delayed response.  I've been paging this all back in.

I first saw this problem when memcheck-ing Firefox as compiled by Clang, some
years back.  Not long after GCC was also at it.  The transformation in
question is (at the C level):

A && B  ==>  B && A   if it can be proved that A
                      is always false whenever B is undefined
                      and (I assume) that B is provably exception-free

where && means the standard lazy left-first C logical-AND.  I believe this
might have become more prevalent due to ever-more aggressive inlining (at
least for Firefox), which presented the compilers with greater opportunities
to make the required proofs.

After wondering what to do about this viz-a-viz Memcheck, I realised after a
while that, because Memcheck does know how to do exact definedness propagation
through bitwise and/or, I could "fix" it by recovering the underlying &&
expression through analysis of local fragments of the control flow graph.  So
it now looks for

    if !A goto X
    if !B goto X
    <then-clause>
    X:

and generates IR for analysis as if it had seen
  "if (A bitwise-& B) <then-clause>".

Note that the bitwise vs logical-AND distinction isn't really correct; what
we're really talking about is non-lazy vs lazy AND.  The number of bits
involved in the representation is irrelevant.

A couple of other notes:

* Valgrind only deals with 2-argument &&s; that is all that seemed necessary.
  In principle though the transformation generalises to any number of terms
  &&-ed together, not just 2.

* For reasons I don't really remember now, I didn't need to deal with the
  equivalent OR case.  It's all the same at the machine code level.  (Waves
  hands and mumbles something about De Morgan ..)

I'm sure all the above info is in the slides of the Fosdem talk that Mark
mentioned.  I don't think the above contributes anything new.

On Tue, Feb 15, 2022 at 1:29 PM Mark Wielaard <mark@klomp.org> wrote:
>
> Hi Richard,
>
> On Tue, 2022-02-15 at 08:25 +0100, Richard Biener wrote:
> > On Mon, Feb 14, 2022 at 6:38 PM Mark Wielaard <mark@klomp.org> wrote:
> > > Yes. valgrind keeps track of uninitialized bits and propagates them
> > > around till "use". Where use is anything that might alter the
> > > observable behavior of the program. Which is control flow
> > > transfers, conditional moves, addresses used in memory accesses,
> > > and data passed to system calls.
> > >
> > > This paper describes some of the memcheck tricks:
> > > https://valgrind.org/docs/memcheck2005.pdf
> >
> > That probably means bugs like
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63311
> > could be resolved as fixed (in valgrind).
>
> I just tried the testcase from that bug and it still replicates with
> gcc 11.2.1 and valgrind 3.18.1. And as far as I can see it really
> cannot be fixed in valgrind since gcc really generates a conditional
> jump based on an uninit variable in this case.
>
> It does look a bit like what Julian described in:
>
>   Memcheck Reloaded
>   dealing with compiler-generated branches on undefined values
> https://archive.fosdem.org/2020/schedule/event/debugging_memcheck_reloaded/
>
> Which should be able to recover/reconstruct the original control flow.
> In cases like:
>
> int result
> bool ok = compute_something(&result)
> if (ok && result == 42) { ... }
>
> where gcc turns that last line upside down:
>
> if (result == 42 && ok) { ... }
>
> But it doesn't work in this case. Probably because this is a slightly
> more complex case involving 3 distinct variables instead of 2.
>
> Cheers,
>
> Mark

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Uninit warnings due to optimizing short-circuit conditionals
  2022-02-15 13:00           ` Julian Seward
@ 2022-02-15 13:28             ` Richard Biener
  2022-02-15 21:40               ` David Malcolm
  0 siblings, 1 reply; 11+ messages in thread
From: Richard Biener @ 2022-02-15 13:28 UTC (permalink / raw)
  To: Julian Seward; +Cc: Mark Wielaard, David Malcolm, GCC Development

On Tue, Feb 15, 2022 at 2:00 PM Julian Seward <sewardj42@gmail.com> wrote:
>
> Sorry for the delayed response.  I've been paging this all back in.
>
> I first saw this problem when memcheck-ing Firefox as compiled by Clang, some
> years back.  Not long after GCC was also at it.  The transformation in
> question is (at the C level):
>
> A && B  ==>  B && A   if it can be proved that A
>                       is always false whenever B is undefined
>                       and (I assume) that B is provably exception-free
>
> where && means the standard lazy left-first C logical-AND.  I believe this
> might have become more prevalent due to ever-more aggressive inlining (at
> least for Firefox), which presented the compilers with greater opportunities
> to make the required proofs.

Note GCC doesn't try to prove this, instead it reasons that when
B is undefined it takes an indeterminate value and if A is _not_ always
false then the program would have invoked undefined behavior, so we
can disregard this possibility and assume B is not undefined.  So
either B is not undefined and everything is OK, or B is undefined but
then A must be always false.

Note that when A is always false we may have transformed a valid
program (does not access B) into a program invoking undefined behavior
(in C language terms).  We don't treat undefined uses as "very" undefined
behavior but I do remember we've shot ourselves in the foot with this
transform - in this case we'd have to make the use of B determinate
somehow, something we cannot yet do.  So we'd want a transform
like

 A && B ==> OK(B) && A

where 'OK' sanitizes B in case it is undefined.  The error we can run into
is that two of the uninit Bs can be equated to two different values,
breaking the B == B invariant (technically also OK, but not if it was us
that introduced the undefinedness in the first place).

Richard.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Uninit warnings due to optimizing short-circuit conditionals
  2022-02-15 13:28             ` Richard Biener
@ 2022-02-15 21:40               ` David Malcolm
  0 siblings, 0 replies; 11+ messages in thread
From: David Malcolm @ 2022-02-15 21:40 UTC (permalink / raw)
  To: Richard Biener, Julian Seward; +Cc: Mark Wielaard, GCC Development

On Tue, 2022-02-15 at 14:28 +0100, Richard Biener wrote:
> On Tue, Feb 15, 2022 at 2:00 PM Julian Seward <sewardj42@gmail.com>
> wrote:
> > 
> > Sorry for the delayed response.  I've been paging this all back in.
> > 
> > I first saw this problem when memcheck-ing Firefox as compiled by
> > Clang, some
> > years back.  Not long after GCC was also at it.  The transformation
> > in
> > question is (at the C level):
> > 
> > A && B  ==>  B && A   if it can be proved that A
> >                       is always false whenever B is undefined
> >                       and (I assume) that B is provably exception-
> > free
> > 
> > where && means the standard lazy left-first C logical-AND.  I
> > believe this
> > might have become more prevalent due to ever-more aggressive
> > inlining (at
> > least for Firefox), which presented the compilers with greater
> > opportunities
> > to make the required proofs.
> 
> Note GCC doesn't try to prove this, instead it reasons that when
> B is undefined it takes an indeterminate value and if A is _not_
> always
> false then the program would have invoked undefined behavior, so we
> can disregard this possibility and assume B is not undefined.  So
> either B is not undefined and everything is OK, or B is undefined but
> then A must be always false.
> 
> Note that when A is always false we may have transformed a valid
> program (does not access B) into a program invoking undefined
> behavior
> (in C language terms).  We don't treat undefined uses as "very"
> undefined
> behavior but I do remember we've shot ourselves in the foot with this
> transform - in this case we'd have to make the use of B determinate
> somehow, something we cannot yet do.  So we'd want a transform
> like
> 
>  A && B ==> OK(B) && A
> 
> where 'OK' sanitizes B in case it is undefined.  The error we can run
> into
> is that two of the uninit Bs can be equated to two different values,
> breaking the B == B invariant (technically also OK, but not if it was
> us
> that introduced the undefinedness in the first place).
> 
> Richard.

Thanks everyone for the various insights.

I've gone ahead and committed my workaround for the -fanalyzer uninit
false positive to trunk (as r12-7251-
g1e2fe6715a949f80c1204ae244baad3cd80ffaf0).

Dave


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-02-15 21:40 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-14 15:57 Uninit warnings due to optimizing short-circuit conditionals David Malcolm
2022-02-14 16:26 ` Jeff Law
2022-02-14 17:10   ` David Malcolm
2022-02-14 16:57 ` Mark Wielaard
2022-02-14 17:20   ` David Malcolm
2022-02-14 17:37     ` Mark Wielaard
2022-02-15  7:25       ` Richard Biener
2022-02-15 12:29         ` Mark Wielaard
2022-02-15 13:00           ` Julian Seward
2022-02-15 13:28             ` Richard Biener
2022-02-15 21:40               ` David Malcolm

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).