From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ed1-x52c.google.com (mail-ed1-x52c.google.com [IPv6:2a00:1450:4864:20::52c]) by sourceware.org (Postfix) with ESMTPS id 7672E3858C78 for ; Tue, 15 Feb 2022 07:25:28 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 7672E3858C78 Received: by mail-ed1-x52c.google.com with SMTP id w2so11937514edc.8 for ; Mon, 14 Feb 2022 23:25:28 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=XBKlyEnkXmDyBOvHp8vkOPNp2siGDbXCtpuHvXsgm5A=; b=hqQwvkqMVqvqTLYj9cSmhI1c1g9X6jIAvusbpPaWV6GNDeyu+wJrRhy8k7UZMOwI/E XES7NuUpCPL6EDXISA+deY+gHd00vrXzW4xCFRSLV2VKjLf9cj5elirJxiRmrCjpBFCN 0oCQfEyzX2jM9ssPzuU8t9gZoiAFm1ooARJ4fFMkTFeZPWy9/uESYfwlDxQXz/iPsRO4 yvJeh+kI/AA6a76N/OqF8pCxVNkiZSccB5vi7CyB7CqcLMldz2HsidiajlklsYWm3R6X MArm7Ka3PbvlRM60XvJKSe7PfeUkK2jh+ubGr1jrm8MkVlM5WJpp9tkrPBMPMN9h4ZDe XWtw== X-Gm-Message-State: AOAM533WrdzeeURCfiU5U7nh4Q3wyR1vGCyGYtdqucQFe1J3hxo2Hxs8 kO+T+/pvhnona4SFgfI4VoXWbhq0emdElo6Ycl4= X-Google-Smtp-Source: ABdhPJyn2AEgxh7USnVl3ICn95EUr3dizp/fNRDYEkBNG4dUR6zD0b9PKFJK6KJRhDHHt0tLeIhddwxUHER7AQ2aWRM= X-Received: by 2002:aa7:d858:: with SMTP id f24mr2602214eds.156.1644909927076; Mon, 14 Feb 2022 23:25:27 -0800 (PST) MIME-Version: 1.0 References: <20220214155757.861877-1-dmalcolm@redhat.com> <71de3204e639eed5052ca9e6416334aba6b2d1c7.camel@klomp.org> <3bfbfbf02e2d17d45b4a91e5ea5f855e0a62e5f5.camel@klomp.org> In-Reply-To: <3bfbfbf02e2d17d45b4a91e5ea5f855e0a62e5f5.camel@klomp.org> From: Richard Biener Date: Tue, 15 Feb 2022 08:25:16 +0100 Message-ID: Subject: Re: Uninit warnings due to optimizing short-circuit conditionals To: Mark Wielaard Cc: David Malcolm , GCC Development , Julian Seward Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-2.3 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 15 Feb 2022 07:25:30 -0000 On Mon, Feb 14, 2022 at 6:38 PM Mark Wielaard wrote: > > On Mon, 2022-02-14 at 12:20 -0500, David Malcolm wrote: > > On Mon, 2022-02-14 at 17:57 +0100, Mark Wielaard wrote: > > > On Mon, 2022-02-14 at 10:57 -0500, David Malcolm wrote: > > > > [CCing Mark in the hopes of insight from the valgrind side of > > > > things] > > > > > > Adding Julian to CC so he can correct me if I say something silly. > > > > > > > There is a false positive from -Wanalyzer-use-of-uninitialized- > > > > value on > > > > gcc.dg/analyzer/pr102692.c here: > > > > > > > > =E2=80=98fix_overlays_before=E2=80=99: events 1-3 > > > > | > > > > | 75 | while (tail > > > > | | ~~~~ > > > > | 76 | && (tem =3D make_lisp_ptr (tail, 5), > > > > | | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > > > | | | > > > > | | (1) following =E2=80=98false=E2=80=99 branch = (when =E2=80=98tail=E2=80=99 is > > > > NULL)... > > > > | 77 | (end =3D marker_position (XOVERLAY (tem)- > > > > > end)) >=3D pos)) > > > > > > > > | | > > > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > > > |...... > > > > | 82 | if (!tail || end < prev || !tail->next) > > > > | | ~~~~~ ~~~~~~~~~~ > > > > | | | | > > > > | | | (3) use of uninitialized value > > > > =E2=80=98end=E2=80=99 here > > > > | | (2) ...to here > > > > | > > > > > > > > The issue is that inner || of the conditionals have been folded > > > > within the > > > > frontend from a chain of control flow: > > > > > > > > 5 =E2=94=82 if (tail =3D=3D 0B) goto ; else goto ; > > > > 6 =E2=94=82 : > > > > 7 =E2=94=82 if (end < prev) goto ; else goto ; > > > > 8 =E2=94=82 : > > > > 9 =E2=94=82 _1 =3D tail->next; > > > > 10 =E2=94=82 if (_1 =3D=3D 0B) goto ; else goto ; > > > > 11 =E2=94=82 : > > > > > > > > to an OR expr (and then to a bitwise-or by the gimplifier): > > > > > > > > 5 =E2=94=82 _1 =3D tail =3D=3D 0B; > > > > 6 =E2=94=82 _2 =3D end < prev; > > > > 7 =E2=94=82 _3 =3D _1 | _2; > > > > 8 =E2=94=82 if (_3 !=3D 0) goto ; else goto = ; > > > > 9 =E2=94=82 : > > > > 10 =E2=94=82 _4 =3D tail->next; > > > > 11 =E2=94=82 if (_4 =3D=3D 0B) goto ; else goto ; > > > > > > > > This happens for sufficiently simple conditionals in > > > > fold_truth_andor. > > > > In particular, the (end < prev) is short-circuited without > > > > optimization, > > > > but is evaluated with optimization, leading to the false positive. > > > > > > > > Given how early this folding occurs, it seems the simplest fix is > > > > to > > > > try to detect places where this optimization appears to have > > > > happened, > > > > and suppress uninit warnings within the statement that would have > > > > been short-circuited (and thus e.g. ignoring them when evaluating > > > > _2 > > > > above for the case where _1 is known to be true at the (_1 | _2) , > > > > and > > > > thus _2 being redundant). > > > > > > > > Attached is a patch that implements this. > > > > > > > > There are some more details in the patch, but I'm wondering if this > > > > is a > > > > known problem, and how e.g. valgrind copes with such code. My > > > > patch > > > > feels like something of a hack, but I'm not sure of any other way > > > > around > > > > it given that the conditional is folded directly within the > > > > frontend. > > > > > > As far as I know this is what valgrind memcheck also does with an > > > bitwise or. It knows that _3 is defined and true if either _1 or _2 > > > is > > > defined and true. Or more generically that the result bits of a > > > bitwise > > > or are defined for those bits that are both defined or where one is > > > defined and has the value 1. > > > > Aha - thanks. I think the distinction here is that: > > > > * GCC's -fanalyzer complains about uninitialized values immediately > > when it sees one being fetched for use in any expression (replacing the > > value with a safe one to avoid further complaints), without considering > > how they are going to be used in the expression, whereas > > > > * it sounds like valgrind keeps track of uninitialized bits, propagates > > the "uninit-ness" of the bits, and complains at certain times when > > uninitialized bits are used in certain ways. > > Yes. valgrind keeps track of uninitialized bits and propagates them > around till "use". Where use is anything that might alter the > observable behavior of the program. Which is control flow transfers, > conditional moves, addresses used in memory accesses, and data passed > to system calls. > > This paper describes some of the memcheck tricks: > https://valgrind.org/docs/memcheck2005.pdf That probably means bugs like https://gcc.gnu.org/bugzilla/show_bug.cgi?id= =3D63311 could be resolved as fixed (in valgrind). But indeed GCCs own uninit diagnostics fall foul of this as well. I suppos= e one could track uses and if they are used in bitwise operations could demote them to "maybe" even though they appear unconditionally. Now, we cannot distinguish if (a || b) from if (b || a) after combining the ifs of course. Another possibility is to try to compute a kind of must-init analysis befor= e removing short-cutting transforms which practically means removing the transform from GENERIC folding. Richard. > Cheers, > > Mark