From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <richard.guenther@gmail.com>
Received: from mail-ed1-x52c.google.com (mail-ed1-x52c.google.com
 [IPv6:2a00:1450:4864:20::52c])
 by sourceware.org (Postfix) with ESMTPS id 7672E3858C78
 for <gcc@gcc.gnu.org>; Tue, 15 Feb 2022 07:25:28 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 7672E3858C78
Received: by mail-ed1-x52c.google.com with SMTP id w2so11937514edc.8
 for <gcc@gcc.gnu.org>; Mon, 14 Feb 2022 23:25:28 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112;
 h=x-gm-message-state:mime-version:references:in-reply-to:from:date
 :message-id:subject:to:cc:content-transfer-encoding;
 bh=XBKlyEnkXmDyBOvHp8vkOPNp2siGDbXCtpuHvXsgm5A=;
 b=hqQwvkqMVqvqTLYj9cSmhI1c1g9X6jIAvusbpPaWV6GNDeyu+wJrRhy8k7UZMOwI/E
 XES7NuUpCPL6EDXISA+deY+gHd00vrXzW4xCFRSLV2VKjLf9cj5elirJxiRmrCjpBFCN
 0oCQfEyzX2jM9ssPzuU8t9gZoiAFm1ooARJ4fFMkTFeZPWy9/uESYfwlDxQXz/iPsRO4
 yvJeh+kI/AA6a76N/OqF8pCxVNkiZSccB5vi7CyB7CqcLMldz2HsidiajlklsYWm3R6X
 MArm7Ka3PbvlRM60XvJKSe7PfeUkK2jh+ubGr1jrm8MkVlM5WJpp9tkrPBMPMN9h4ZDe
 XWtw==
X-Gm-Message-State: AOAM533WrdzeeURCfiU5U7nh4Q3wyR1vGCyGYtdqucQFe1J3hxo2Hxs8
 kO+T+/pvhnona4SFgfI4VoXWbhq0emdElo6Ycl4=
X-Google-Smtp-Source: ABdhPJyn2AEgxh7USnVl3ICn95EUr3dizp/fNRDYEkBNG4dUR6zD0b9PKFJK6KJRhDHHt0tLeIhddwxUHER7AQ2aWRM=
X-Received: by 2002:aa7:d858:: with SMTP id f24mr2602214eds.156.1644909927076; 
 Mon, 14 Feb 2022 23:25:27 -0800 (PST)
MIME-Version: 1.0
References: <20220214155757.861877-1-dmalcolm@redhat.com>
 <71de3204e639eed5052ca9e6416334aba6b2d1c7.camel@klomp.org>
 <ab8b3b762fcabc2827e3b2f82cff6a11c9cd2ee3.camel@redhat.com>
 <3bfbfbf02e2d17d45b4a91e5ea5f855e0a62e5f5.camel@klomp.org>
In-Reply-To: <3bfbfbf02e2d17d45b4a91e5ea5f855e0a62e5f5.camel@klomp.org>
From: Richard Biener <richard.guenther@gmail.com>
Date: Tue, 15 Feb 2022 08:25:16 +0100
Message-ID: <CAFiYyc2f5ZCf3_=ix1YYc5vRjZCikJMPAkComM35hnY3-DQJNA@mail.gmail.com>
Subject: Re: Uninit warnings due to optimizing short-circuit conditionals
To: Mark Wielaard <mark@klomp.org>
Cc: David Malcolm <dmalcolm@redhat.com>, GCC Development <gcc@gcc.gnu.org>, 
 Julian Seward <sewardj42@gmail.com>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Spam-Status: No, score=-2.3 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, KAM_SHORT,
 RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on
 server2.sourceware.org
X-BeenThere: gcc@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc mailing list <gcc.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc>,
 <mailto:gcc-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc/>
List-Post: <mailto:gcc@gcc.gnu.org>
List-Help: <mailto:gcc-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc>,
 <mailto:gcc-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Tue, 15 Feb 2022 07:25:30 -0000

On Mon, Feb 14, 2022 at 6:38 PM Mark Wielaard <mark@klomp.org> wrote:
>
> On Mon, 2022-02-14 at 12:20 -0500, David Malcolm wrote:
> > On Mon, 2022-02-14 at 17:57 +0100, Mark Wielaard wrote:
> > > On Mon, 2022-02-14 at 10:57 -0500, David Malcolm wrote:
> > > > [CCing Mark in the hopes of insight from the valgrind side of
> > > > things]
> > >
> > > Adding Julian to CC so he can correct me if I say something silly.
> > >
> > > > There is a false positive from -Wanalyzer-use-of-uninitialized-
> > > > value on
> > > > gcc.dg/analyzer/pr102692.c here:
> > > >
> > > >   =E2=80=98fix_overlays_before=E2=80=99: events 1-3
> > > >     |
> > > >     |   75 |   while (tail
> > > >     |      |          ~~~~
> > > >     |   76 |          && (tem =3D make_lisp_ptr (tail, 5),
> > > >     |      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > >     |      |          |
> > > >     |      |          (1) following =E2=80=98false=E2=80=99 branch =
(when =E2=80=98tail=E2=80=99 is
> > > > NULL)...
> > > >     |   77 |              (end =3D marker_position (XOVERLAY (tem)-
> > > > > end)) >=3D pos))
> > > >
> > > >     |      |
> > > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > >     |......
> > > >     |   82 |   if (!tail || end < prev || !tail->next)
> > > >     |      |       ~~~~~    ~~~~~~~~~~
> > > >     |      |       |            |
> > > >     |      |       |            (3) use of uninitialized value
> > > > =E2=80=98end=E2=80=99 here
> > > >     |      |       (2) ...to here
> > > >     |
> > > >
> > > > The issue is that inner || of the conditionals have been folded
> > > > within the
> > > > frontend from a chain of control flow:
> > > >
> > > >    5   =E2=94=82   if (tail =3D=3D 0B) goto <D.1986>; else goto <D.=
1988>;
> > > >    6   =E2=94=82   <D.1988>:
> > > >    7   =E2=94=82   if (end < prev) goto <D.1986>; else goto <D.1989=
>;
> > > >    8   =E2=94=82   <D.1989>:
> > > >    9   =E2=94=82   _1 =3D tail->next;
> > > >   10   =E2=94=82   if (_1 =3D=3D 0B) goto <D.1986>; else goto <D.19=
87>;
> > > >   11   =E2=94=82   <D.1986>:
> > > >
> > > > to an OR expr (and then to a bitwise-or by the gimplifier):
> > > >
> > > >    5   =E2=94=82   _1 =3D tail =3D=3D 0B;
> > > >    6   =E2=94=82   _2 =3D end < prev;
> > > >    7   =E2=94=82   _3 =3D _1 | _2;
> > > >    8   =E2=94=82   if (_3 !=3D 0) goto <D.1986>; else goto <D.1988>=
;
> > > >    9   =E2=94=82   <D.1988>:
> > > >   10   =E2=94=82   _4 =3D tail->next;
> > > >   11   =E2=94=82   if (_4 =3D=3D 0B) goto <D.1986>; else goto <D.19=
87>;
> > > >
> > > > This happens for sufficiently simple conditionals in
> > > > fold_truth_andor.
> > > > In particular, the (end < prev) is short-circuited without
> > > > optimization,
> > > > but is evaluated with optimization, leading to the false positive.
> > > >
> > > > Given how early this folding occurs, it seems the simplest fix is
> > > > to
> > > > try to detect places where this optimization appears to have
> > > > happened,
> > > > and suppress uninit warnings within the statement that would have
> > > > been short-circuited (and thus e.g. ignoring them when evaluating
> > > > _2
> > > > above for the case where _1 is known to be true at the (_1 | _2) ,
> > > > and
> > > > thus _2 being redundant).
> > > >
> > > > Attached is a patch that implements this.
> > > >
> > > > There are some more details in the patch, but I'm wondering if this
> > > > is a
> > > > known problem, and how e.g. valgrind copes with such code.  My
> > > > patch
> > > > feels like something of a hack, but I'm not sure of any other way
> > > > around
> > > > it given that the conditional is folded directly within the
> > > > frontend.
> > >
> > > As far as I know this is what valgrind memcheck also does with an
> > > bitwise or. It knows that _3 is defined and true if either _1 or _2
> > > is
> > > defined and true. Or more generically that the result bits of a
> > > bitwise
> > > or are defined for those bits that are both defined or where one is
> > > defined and has the value 1.
> >
> > Aha - thanks.  I think the distinction here is that:
> >
> > * GCC's -fanalyzer complains about uninitialized values immediately
> > when it sees one being fetched for use in any expression (replacing the
> > value with a safe one to avoid further complaints), without considering
> > how they are going to be used in the expression, whereas
> >
> > * it sounds like valgrind keeps track of uninitialized bits, propagates
> > the "uninit-ness" of the bits, and complains at certain times when
> > uninitialized bits are used in certain ways.
>
> Yes. valgrind keeps track of uninitialized bits and propagates them
> around till "use". Where use is anything that might alter the
> observable behavior of the program. Which is control flow transfers,
> conditional moves, addresses used in memory accesses, and data passed
> to system calls.
>
> This paper describes some of the memcheck tricks:
> https://valgrind.org/docs/memcheck2005.pdf

That probably means bugs like https://gcc.gnu.org/bugzilla/show_bug.cgi?id=
=3D63311
could be resolved as fixed (in valgrind).

But indeed GCCs own uninit diagnostics fall foul of this as well.  I suppos=
e
one could track uses and if they are used in bitwise operations could
demote them to "maybe" even though they appear unconditionally.  Now,
we cannot distinguish

  if (a || b)

from

  if (b || a)

after combining the ifs of course.

Another possibility is to try to compute a kind of must-init analysis befor=
e
removing short-cutting transforms which practically means removing the
transform from GENERIC folding.

Richard.

> Cheers,
>
> Mark