public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
From: David Malcolm <dmalcolm@redhat.com>
To: Jeff Law <jeffreyalaw@gmail.com>, gcc@gcc.gnu.org
Cc: Mark Wielaard <mjw@redhat.com>
Subject: Re: Uninit warnings due to optimizing short-circuit conditionals
Date: Mon, 14 Feb 2022 12:10:25 -0500	[thread overview]
Message-ID: <dac179a13b4db12b601cecd1ad8b38524adc949f.camel@redhat.com> (raw)
In-Reply-To: <601e7f97-bd52-bc66-7a1a-c26c18c5a794@gmail.com>

On Mon, 2022-02-14 at 09:26 -0700, Jeff Law wrote:
> 
> 
> On 2/14/2022 8:57 AM, David Malcolm via Gcc wrote:
> > [CCing Mark in the hopes of insight from the valgrind side of things]
> > 
> > There is a false positive from -Wanalyzer-use-of-uninitialized-value
> > on
> > gcc.dg/analyzer/pr102692.c here:
> > 
> >    ‘fix_overlays_before’: events 1-3
> >      |
> >      |   75 |   while (tail
> >      |      |          ~~~~
> >      |   76 |          && (tem = make_lisp_ptr (tail, 5),
> >      |      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >      |      |          |
> >      |      |          (1) following ‘false’ branch (when ‘tail’ is
> > NULL)...
> >      |   77 |              (end = marker_position (XOVERLAY (tem)-
> > >end)) >= pos))
> >      |      |             
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >      |......
> >      |   82 |   if (!tail || end < prev || !tail->next)
> >      |      |       ~~~~~    ~~~~~~~~~~
> >      |      |       |            |
> >      |      |       |            (3) use of uninitialized value ‘end’
> > here
> >      |      |       (2) ...to here
> >      |
> > 
> > The issue is that inner || of the conditionals have been folded
> > within the
> > frontend from a chain of control flow:
> > 
> >     5   │   if (tail == 0B) goto <D.1986>; else goto <D.1988>;
> >     6   │   <D.1988>:
> >     7   │   if (end < prev) goto <D.1986>; else goto <D.1989>;
> >     8   │   <D.1989>:
> >     9   │   _1 = tail->next;
> >    10   │   if (_1 == 0B) goto <D.1986>; else goto <D.1987>;
> >    11   │   <D.1986>:
> > 
> > to an OR expr (and then to a bitwise-or by the gimplifier):
> > 
> >     5   │   _1 = tail == 0B;
> >     6   │   _2 = end < prev;
> >     7   │   _3 = _1 | _2;
> >     8   │   if (_3 != 0) goto <D.1986>; else goto <D.1988>;
> >     9   │   <D.1988>:
> >    10   │   _4 = tail->next;
> >    11   │   if (_4 == 0B) goto <D.1986>; else goto <D.1987>;
> > 
> > This happens for sufficiently simple conditionals in
> > fold_truth_andor.
> > In particular, the (end < prev) is short-circuited without
> > optimization,
> > but is evaluated with optimization, leading to the false positive.
> > 
> > Given how early this folding occurs, it seems the simplest fix is to
> > try to detect places where this optimization appears to have
> > happened,
> > and suppress uninit warnings within the statement that would have
> > been short-circuited (and thus e.g. ignoring them when evaluating _2
> > above for the case where _1 is known to be true at the (_1 | _2) ,
> > and
> > thus _2 being redundant).
> > 
> > Attached is a patch that implements this.
> > 
> > There are some more details in the patch, but I'm wondering if this
> > is a
> > known problem, and how e.g. valgrind copes with such code.  My patch
> > feels like something of a hack, but I'm not sure of any other way
> > around
> > it given that the conditional is folded directly within the frontend.
> Presumably when "tail ==0", "end" is initialized somewhere?  

I don't think it does, but it shouldn't be a problem in the code as
written, due to short-circuiting (as I understand things) - but the
short-circuiting is being removed by the optimizer.

"end" only gets initialized at line 71 of the source below, for the
case where the initial value of "tail" is non-NULL:

  63   │ void
  64   │ fix_overlays_before (struct buffer *bp, long prev, long pos)
  65   │ {
  66   │   struct Lisp_Overlay *tail = bp->overlays_before, *parent = 0, *right_pair;
  67   │   struct lisp *tem;
  68   │   long end;
  69   │   while (tail
  70   │      && (tem = make_lisp_ptr (tail, 5),
  71   │          (end = marker_position (XOVERLAY (tem)->end)) >= pos))
  72   │     {
  73   │       parent = tail;
  74   │       tail = tail->next;
  75   │     }
  76   │   if (!tail || end < prev || !tail->next) /* { dg-bogus "use of uninitialized value 'end'" "uninit" } */
  77   │     /* { dg-bogus "dereference of NULL 'tail'" "null deref" { target *-*-* } .-1 } */
  78   │     return;

At -O2 we have this at pr102692.c.022t.ssa:

  59   │   <bb 2> :
  60   │   tail_23 = bp_22(D)->overlays_before;
  61   │   parent_24 = 0B;
  62   │   goto <bb 4>; [INV]
  63   │ 
  64   │   <bb 3> :
  65   │   parent_32 = tail_11;
  66   │   tail_33 = tail_11->next;
  67   │ 
  68   │   <bb 4> :
  69   │   # tail_11 = PHI <tail_23(2), tail_33(3)>
  70   │   # parent_13 = PHI <parent_24(2), parent_32(3)>
  71   │   # end_15 = PHI <end_25(D)(2), end_30(3)>
  72   │   if (tail_11 != 0B)
  73   │     goto <bb 5>; [INV]
  74   │   else
  75   │     goto <bb 6>; [INV]
  76   │ 
  77   │   <bb 5> :
  78   │   tem_27 = make_lisp_ptr (tail_11, 5);
  79   │   _1 = XOVERLAY (tem_27);
  80   │   _2 = _1->end;
  81   │   end_30 = marker_position (_2);
  82   │   if (end_30 >= pos_31(D))
  83   │     goto <bb 3>; [INV]
  84   │   else
  85   │     goto <bb 6>; [INV]
  86   │ 
  87   │   <bb 6> :
  88   │   # end_16 = PHI <end_15(4), end_30(5)>
  89   │   _3 = tail_11 == 0B;
  90   │   _4 = end_16 < prev_34(D);
  91   │   _5 = _3 | _4;
  92   │   if (_5 != 0)
  93   │     goto <bb 8>; [INV]
  94   │   else
  95   │     goto <bb 7>; [INV]
  96   │ 
  97   │   <bb 7> :
  98   │   _6 = tail_11->next;
  99   │   if (_6 == 0B)
 100   │     goto <bb 8>; [INV]
 101   │   else
 102   │     goto <bb 9>; [INV]
 103   │ 


This line of the source:
  76   │   if (!tail || end < prev || !tail->next) 
has become BBs 6 and 7.

Consider the path:
  ENTRY -> BB 2 -> BB 4 -> BB 6 (initial tail being NULL)

If initially tail_23 is NULL, then:
* at BB 2 -> BB 4 we have tail_11 = tail_23 and end_15 = end_25(D)
* at BB 4 -> BB 6 we have end_16 = end_15 = end_25(D)

and so we have:

  89   │   _3 = tail_11 == 0B;

with tail_11 being NULL, and thus we know _3 is true on this path, and
then:

  90   │   _4 = end_16 < prev_34(D);

where end_16 is end_25(D), the "initial" value of "end" - but "end"
isn't initialized.

So we have: _4 = UNINITIALIZED < VALID;  (the 2nd arg, prev is a param)
which is an evaluation using an uninitialized value.

However _4 only gets used by:

  91   │   _5 = _3 | _4;

all boolean, with _3 known to be true.

Assuming I'm reading all this correctly, this seems like overzealous
folding to me, and my special-casing of it - the folding turns a
TRUTH_OR_EXPR into a TRUTH_OR, given that the arguments
are simple_operand_p_2, which seems to apply for the simplest lookups
of local variables, without side-effects.

tree.def says:

  /* ANDIF and ORIF allow the second operand not to be computed if the
     value of the expression is determined from the first operand.  AND,
     OR, and XOR always compute the second operand whether its value is
     needed or not (for side effects).  The operand may have

and presumably the "allow the 2nd operand not to be computed" is not
the same as "require the 2nd operand not to be computed".

Does the above reasoning (and my special-case workaround) sound correct
to you?

Thanks!
Dave



> If so, yes, 
> this is a known issue.  There's a BZ about it somewhere (I don' t have 
> the # handy, but it's probably on the Wuninitialized tracker).
> 
> 
> Jeff
> 



  reply	other threads:[~2022-02-14 17:10 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-14 15:57 David Malcolm
2022-02-14 16:26 ` Jeff Law
2022-02-14 17:10   ` David Malcolm [this message]
2022-02-14 16:57 ` Mark Wielaard
2022-02-14 17:20   ` David Malcolm
2022-02-14 17:37     ` Mark Wielaard
2022-02-15  7:25       ` Richard Biener
2022-02-15 12:29         ` Mark Wielaard
2022-02-15 13:00           ` Julian Seward
2022-02-15 13:28             ` Richard Biener
2022-02-15 21:40               ` David Malcolm

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=dac179a13b4db12b601cecd1ad8b38524adc949f.camel@redhat.com \
    --to=dmalcolm@redhat.com \
    --cc=gcc@gcc.gnu.org \
    --cc=jeffreyalaw@gmail.com \
    --cc=mjw@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).