public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
From: Julian Seward <sewardj42@gmail.com>
To: Mark Wielaard <mark@klomp.org>
Cc: Richard Biener <richard.guenther@gmail.com>,
	David Malcolm <dmalcolm@redhat.com>,
	GCC Development <gcc@gcc.gnu.org>
Subject: Re: Uninit warnings due to optimizing short-circuit conditionals
Date: Tue, 15 Feb 2022 14:00:47 +0100	[thread overview]
Message-ID: <CAPQtpYuKWMgYJua7dLrCsph-wMLgLZDYjuoLuD-fVZA5r2Hm=Q@mail.gmail.com> (raw)
In-Reply-To: <a0b6fd4233d1cb5f52986a20e38d8aaf8276629b.camel@klomp.org>

Sorry for the delayed response.  I've been paging this all back in.

I first saw this problem when memcheck-ing Firefox as compiled by Clang, some
years back.  Not long after GCC was also at it.  The transformation in
question is (at the C level):

A && B  ==>  B && A   if it can be proved that A
                      is always false whenever B is undefined
                      and (I assume) that B is provably exception-free

where && means the standard lazy left-first C logical-AND.  I believe this
might have become more prevalent due to ever-more aggressive inlining (at
least for Firefox), which presented the compilers with greater opportunities
to make the required proofs.

After wondering what to do about this viz-a-viz Memcheck, I realised after a
while that, because Memcheck does know how to do exact definedness propagation
through bitwise and/or, I could "fix" it by recovering the underlying &&
expression through analysis of local fragments of the control flow graph.  So
it now looks for

    if !A goto X
    if !B goto X
    <then-clause>
    X:

and generates IR for analysis as if it had seen
  "if (A bitwise-& B) <then-clause>".

Note that the bitwise vs logical-AND distinction isn't really correct; what
we're really talking about is non-lazy vs lazy AND.  The number of bits
involved in the representation is irrelevant.

A couple of other notes:

* Valgrind only deals with 2-argument &&s; that is all that seemed necessary.
  In principle though the transformation generalises to any number of terms
  &&-ed together, not just 2.

* For reasons I don't really remember now, I didn't need to deal with the
  equivalent OR case.  It's all the same at the machine code level.  (Waves
  hands and mumbles something about De Morgan ..)

I'm sure all the above info is in the slides of the Fosdem talk that Mark
mentioned.  I don't think the above contributes anything new.

On Tue, Feb 15, 2022 at 1:29 PM Mark Wielaard <mark@klomp.org> wrote:
>
> Hi Richard,
>
> On Tue, 2022-02-15 at 08:25 +0100, Richard Biener wrote:
> > On Mon, Feb 14, 2022 at 6:38 PM Mark Wielaard <mark@klomp.org> wrote:
> > > Yes. valgrind keeps track of uninitialized bits and propagates them
> > > around till "use". Where use is anything that might alter the
> > > observable behavior of the program. Which is control flow
> > > transfers, conditional moves, addresses used in memory accesses,
> > > and data passed to system calls.
> > >
> > > This paper describes some of the memcheck tricks:
> > > https://valgrind.org/docs/memcheck2005.pdf
> >
> > That probably means bugs like
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63311
> > could be resolved as fixed (in valgrind).
>
> I just tried the testcase from that bug and it still replicates with
> gcc 11.2.1 and valgrind 3.18.1. And as far as I can see it really
> cannot be fixed in valgrind since gcc really generates a conditional
> jump based on an uninit variable in this case.
>
> It does look a bit like what Julian described in:
>
>   Memcheck Reloaded
>   dealing with compiler-generated branches on undefined values
> https://archive.fosdem.org/2020/schedule/event/debugging_memcheck_reloaded/
>
> Which should be able to recover/reconstruct the original control flow.
> In cases like:
>
> int result
> bool ok = compute_something(&result)
> if (ok && result == 42) { ... }
>
> where gcc turns that last line upside down:
>
> if (result == 42 && ok) { ... }
>
> But it doesn't work in this case. Probably because this is a slightly
> more complex case involving 3 distinct variables instead of 2.
>
> Cheers,
>
> Mark

  reply	other threads:[~2022-02-15 13:01 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-14 15:57 David Malcolm
2022-02-14 16:26 ` Jeff Law
2022-02-14 17:10   ` David Malcolm
2022-02-14 16:57 ` Mark Wielaard
2022-02-14 17:20   ` David Malcolm
2022-02-14 17:37     ` Mark Wielaard
2022-02-15  7:25       ` Richard Biener
2022-02-15 12:29         ` Mark Wielaard
2022-02-15 13:00           ` Julian Seward [this message]
2022-02-15 13:28             ` Richard Biener
2022-02-15 21:40               ` David Malcolm

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAPQtpYuKWMgYJua7dLrCsph-wMLgLZDYjuoLuD-fVZA5r2Hm=Q@mail.gmail.com' \
    --to=sewardj42@gmail.com \
    --cc=dmalcolm@redhat.com \
    --cc=gcc@gcc.gnu.org \
    --cc=mark@klomp.org \
    --cc=richard.guenther@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).