From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id 29E9E3858C27 for ; Mon, 14 Feb 2022 17:10:30 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 29E9E3858C27 Received: from mail-qk1-f198.google.com (mail-qk1-f198.google.com [209.85.222.198]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-617--rROpiTLOr6A13X0nzFEzg-1; Mon, 14 Feb 2022 12:10:28 -0500 X-MC-Unique: -rROpiTLOr6A13X0nzFEzg-1 Received: by mail-qk1-f198.google.com with SMTP id g26-20020a05620a13da00b0047e14765148so9554936qkl.1 for ; Mon, 14 Feb 2022 09:10:28 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:subject:from:to:cc:date:in-reply-to :references:user-agent:mime-version:content-transfer-encoding; bh=B/wwiugru+l0bJq5gW3zv7raqBQPq610rQSKzlFPhFQ=; b=j/lrfjI3zKR/125wp6cVJRk9/ZATIkOZl/rDvjvZ7o473mR+R16ZeIkOIHfWRLdmUL Z9aqd3PsPPxN+mP/6u8HreKA1z/9I0SJntm/AvnK2JQzwG4eZp5arjRQiLYFLYBsrkYQ oS0GfmYGz80IJ9EQEG5MbXqz6DPwIPBJhaSdNU9IioNjglz0yBLEyOmbTaych0nLbjpA 5bPJWSip+F4lUoJdtQf7MxcWa1yEwxNTif8ong3SWiBg4vY88wrHztrzOQbpOssIeclv voDFUIUa8Cqglwj8iyuAPy7gxRk58jtsSBNXSHCR3QLyoolsiqp2I1DCtLRy2qsv3Htw icgw== X-Gm-Message-State: AOAM531LDrXbOO8ovdkGb8XaLQIlUQXcyIgh0SqxPXCpdMXbU/KNliW5 8j8UjwDVsvGuMeyIelVr3Fz3xgkekVqdu/4RCF7MSAO7N3qBqS48aQoxQmhVfB3Z6T5p1DECmbk ApAWZBiQ= X-Received: by 2002:a37:34c:: with SMTP id 73mr383798qkd.290.1644858627708; Mon, 14 Feb 2022 09:10:27 -0800 (PST) X-Google-Smtp-Source: ABdhPJxcMDYC/Z0EwXf8uionLhxQ5CfAOBCKSinki68/DzBPraFaOHyS06fp0UawwpiF/751EYZy1w== X-Received: by 2002:a37:34c:: with SMTP id 73mr383769qkd.290.1644858627372; Mon, 14 Feb 2022 09:10:27 -0800 (PST) Received: from t14s.localdomain (c-73-69-212-193.hsd1.nh.comcast.net. [73.69.212.193]) by smtp.gmail.com with ESMTPSA id c2sm11999492qkp.0.2022.02.14.09.10.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 14 Feb 2022 09:10:26 -0800 (PST) Message-ID: Subject: Re: Uninit warnings due to optimizing short-circuit conditionals From: David Malcolm To: Jeff Law , gcc@gcc.gnu.org Cc: Mark Wielaard Date: Mon, 14 Feb 2022 12:10:25 -0500 In-Reply-To: <601e7f97-bd52-bc66-7a1a-c26c18c5a794@gmail.com> References: <20220214155757.861877-1-dmalcolm@redhat.com> <601e7f97-bd52-bc66-7a1a-c26c18c5a794@gmail.com> User-Agent: Evolution 3.38.4 (3.38.4-1.fc33) MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00, BODY_8BITS, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_DNSWL_LOW, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 14 Feb 2022 17:10:32 -0000 On Mon, 2022-02-14 at 09:26 -0700, Jeff Law wrote: > > > On 2/14/2022 8:57 AM, David Malcolm via Gcc wrote: > > [CCing Mark in the hopes of insight from the valgrind side of things] > > > > There is a false positive from -Wanalyzer-use-of-uninitialized-value > > on > > gcc.dg/analyzer/pr102692.c here: > > > >    ‘fix_overlays_before’: events 1-3 > >      | > >      |   75 |   while (tail > >      |      |          ~~~~ > >      |   76 |          && (tem = make_lisp_ptr (tail, 5), > >      |      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > >      |      |          | > >      |      |          (1) following ‘false’ branch (when ‘tail’ is > > NULL)... > >      |   77 |              (end = marker_position (XOVERLAY (tem)- > > >end)) >= pos)) > >      |      |              > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > >      |...... > >      |   82 |   if (!tail || end < prev || !tail->next) > >      |      |       ~~~~~    ~~~~~~~~~~ > >      |      |       |            | > >      |      |       |            (3) use of uninitialized value ‘end’ > > here > >      |      |       (2) ...to here > >      | > > > > The issue is that inner || of the conditionals have been folded > > within the > > frontend from a chain of control flow: > > > >     5   │   if (tail == 0B) goto ; else goto ; > >     6   │   : > >     7   │   if (end < prev) goto ; else goto ; > >     8   │   : > >     9   │   _1 = tail->next; > >    10   │   if (_1 == 0B) goto ; else goto ; > >    11   │   : > > > > to an OR expr (and then to a bitwise-or by the gimplifier): > > > >     5   │   _1 = tail == 0B; > >     6   │   _2 = end < prev; > >     7   │   _3 = _1 | _2; > >     8   │   if (_3 != 0) goto ; else goto ; > >     9   │   : > >    10   │   _4 = tail->next; > >    11   │   if (_4 == 0B) goto ; else goto ; > > > > This happens for sufficiently simple conditionals in > > fold_truth_andor. > > In particular, the (end < prev) is short-circuited without > > optimization, > > but is evaluated with optimization, leading to the false positive. > > > > Given how early this folding occurs, it seems the simplest fix is to > > try to detect places where this optimization appears to have > > happened, > > and suppress uninit warnings within the statement that would have > > been short-circuited (and thus e.g. ignoring them when evaluating _2 > > above for the case where _1 is known to be true at the (_1 | _2) , > > and > > thus _2 being redundant). > > > > Attached is a patch that implements this. > > > > There are some more details in the patch, but I'm wondering if this > > is a > > known problem, and how e.g. valgrind copes with such code.  My patch > > feels like something of a hack, but I'm not sure of any other way > > around > > it given that the conditional is folded directly within the frontend. > Presumably when "tail ==0", "end" is initialized somewhere?   I don't think it does, but it shouldn't be a problem in the code as written, due to short-circuiting (as I understand things) - but the short-circuiting is being removed by the optimizer. "end" only gets initialized at line 71 of the source below, for the case where the initial value of "tail" is non-NULL: 63 │ void 64 │ fix_overlays_before (struct buffer *bp, long prev, long pos) 65 │ { 66 │ struct Lisp_Overlay *tail = bp->overlays_before, *parent = 0, *right_pair; 67 │ struct lisp *tem; 68 │ long end; 69 │ while (tail 70 │ && (tem = make_lisp_ptr (tail, 5), 71 │ (end = marker_position (XOVERLAY (tem)->end)) >= pos)) 72 │ { 73 │ parent = tail; 74 │ tail = tail->next; 75 │ } 76 │ if (!tail || end < prev || !tail->next) /* { dg-bogus "use of uninitialized value 'end'" "uninit" } */ 77 │ /* { dg-bogus "dereference of NULL 'tail'" "null deref" { target *-*-* } .-1 } */ 78 │ return; At -O2 we have this at pr102692.c.022t.ssa: 59 │ : 60 │ tail_23 = bp_22(D)->overlays_before; 61 │ parent_24 = 0B; 62 │ goto ; [INV] 63 │ 64 │ : 65 │ parent_32 = tail_11; 66 │ tail_33 = tail_11->next; 67 │ 68 │ : 69 │ # tail_11 = PHI 70 │ # parent_13 = PHI 71 │ # end_15 = PHI 72 │ if (tail_11 != 0B) 73 │ goto ; [INV] 74 │ else 75 │ goto ; [INV] 76 │ 77 │ : 78 │ tem_27 = make_lisp_ptr (tail_11, 5); 79 │ _1 = XOVERLAY (tem_27); 80 │ _2 = _1->end; 81 │ end_30 = marker_position (_2); 82 │ if (end_30 >= pos_31(D)) 83 │ goto ; [INV] 84 │ else 85 │ goto ; [INV] 86 │ 87 │ : 88 │ # end_16 = PHI 89 │ _3 = tail_11 == 0B; 90 │ _4 = end_16 < prev_34(D); 91 │ _5 = _3 | _4; 92 │ if (_5 != 0) 93 │ goto ; [INV] 94 │ else 95 │ goto ; [INV] 96 │ 97 │ : 98 │ _6 = tail_11->next; 99 │ if (_6 == 0B) 100 │ goto ; [INV] 101 │ else 102 │ goto ; [INV] 103 │ This line of the source: 76 │ if (!tail || end < prev || !tail->next) has become BBs 6 and 7. Consider the path: ENTRY -> BB 2 -> BB 4 -> BB 6 (initial tail being NULL) If initially tail_23 is NULL, then: * at BB 2 -> BB 4 we have tail_11 = tail_23 and end_15 = end_25(D) * at BB 4 -> BB 6 we have end_16 = end_15 = end_25(D) and so we have: 89 │ _3 = tail_11 == 0B; with tail_11 being NULL, and thus we know _3 is true on this path, and then: 90 │ _4 = end_16 < prev_34(D); where end_16 is end_25(D), the "initial" value of "end" - but "end" isn't initialized. So we have: _4 = UNINITIALIZED < VALID; (the 2nd arg, prev is a param) which is an evaluation using an uninitialized value. However _4 only gets used by: 91 │ _5 = _3 | _4; all boolean, with _3 known to be true. Assuming I'm reading all this correctly, this seems like overzealous folding to me, and my special-casing of it - the folding turns a TRUTH_OR_EXPR into a TRUTH_OR, given that the arguments are simple_operand_p_2, which seems to apply for the simplest lookups of local variables, without side-effects. tree.def says: /* ANDIF and ORIF allow the second operand not to be computed if the value of the expression is determined from the first operand. AND, OR, and XOR always compute the second operand whether its value is needed or not (for side effects). The operand may have and presumably the "allow the 2nd operand not to be computed" is not the same as "require the 2nd operand not to be computed". Does the above reasoning (and my special-case workaround) sound correct to you? Thanks! Dave > If so, yes, > this is a known issue.  There's a BZ about it somewhere (I don' t have > the # handy, but it's probably on the Wuninitialized tracker). > > > Jeff >