From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by sourceware.org (Postfix) with ESMTPS id 6B867385B516 for ; Thu, 23 Mar 2023 08:06:36 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 6B867385B516 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=irisa.fr Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=irisa.fr Authentication-Results: mail2-relais-roc.national.inria.fr; dkim=none (message not signed) header.i=none X-Ironport-Dmarc-Check-Result: validskip X-IronPort-AV: E=Sophos;i="5.98,283,1673910000"; d="scan'208";a="98661362" Received: from ptb-5cg22835fs.irisa.fr (HELO [131.254.21.198]) ([131.254.21.198]) by mail2-relais-roc.national.inria.fr with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 23 Mar 2023 09:06:33 +0100 Message-ID: <0dc78fd3-d06f-61bb-993a-e2b1f587dac4@irisa.fr> Date: Thu, 23 Mar 2023 09:06:33 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.8.0 Subject: Re: [Static Analyzer] Loop handling - False positive for malloc-sm To: David Malcolm , gcc@gcc.gnu.org References: <34efc6e0-5bd8-879c-0288-154ba28f5f05@irisa.fr> <3b77234afb96947c9694d375b43b3096cbd45467.camel@redhat.com> <805abf28-3991-df57-51b5-d1e1f4f398b6@irisa.fr> From: Pierrick Philippe Content-Language: en-US In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,KAM_DMARC_STATUS,NICE_REPLY_A,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 22/03/2023 19:19, David Malcolm wrote: > On Tue, 2023-03-21 at 09:21 +0100, Pierrick Philippe wrote: [stripping] >> In fact, this could be done directly by the analyzer, and only >> calling >> state machine APIs for loop handling which still has not reached >> such a fixed point in their program state for the analyzed loop, with >> a >> maximum number of execution fixed by the analyzer to limit execution >> time. >> >> Does what I'm saying make sense? > I think so, though I'm not sure how it would work in practice. > Consider e.g. > > for (int i = 0; i < n; i++) > head = prepend_node (head, i); > > which builds a chain of N dynamically-allocated nodes in a linked list. Well, that would be a case where the loop's analysis will depend of the state machine. If we consider the malloc-sm, it would allow it to track as different pointers each allocated pointers, until the limit of symbolic execution imposed by the analyzer is reached, are the svalue of N if it is a known integer at the current analysis point. For other, such as a the file-sm, it would only be needed to symbolically execute it once, assuming prepend_node() is not opening any files. So this state machine would not have to be executed more than once on the loop at this program point by the analyzer. I think the different steps for such a different cases of loop analysis, is somehow using the second point of the RFE you shared above. The "algorithm" I came with when thinking about it looks like this. Of course, I'm definitely not an expert on the analyzer, so it is possibly not feasible. * Detect loop, and try to get the termination constraint (possibly reduced if possible). * Iterate on the loops' node N:     * If N is the loop's first node:         * Check if the actual program state is in a sufficient state to satisfy the loop's termination constraint,             If so, stop analyzing the loop         * Otherwise, check if the maximum number of symbolic execution fixed by the analyzer is reached,             If so, stop analyzing the loop         * Otherwise, keep going     * Call every sm still impacting their program state map on node N This should work for loops iterating on integer, for other kind of loops, it might be trickier though. >> In terms of implementation, loop detection can be done by looking for >> strongly connected components (SCCs) >> in a function graph having more than one node. >> I don't know if this is how it is already done within the analyzer or >> not? > It isn't yet done in the analyzer, but as noted above there is code in > GCC that already does that (in cfgloop.{h,cc}). I definitely have to look at this files. Thank you for your time, Pierrick