From: David Malcolm
To: "yilu@bupt.edu.cn", gcc
Subject: Re: Re: GSoC: some questions about the static analyzer pass
Date: Mon, 02 Mar 2020 01:59:00 -0000

On Sun, 2020-03-01 at 23:48 +0800, yilu@bupt.edu.cn wrote:
> Hello, David!
>
> I'm so excited to receive a reply from you directly. My name is
> Beenle, an undergraduate student in Beijing, China.

Hello Beenle.

> I've read the analyzer internal documentation, but I'm not sure I
> understood it well. The following are my thoughts about the analyzer,
> and I'm not sure they are right or feasible. Would it be possible for
> you to give me some advice?

I'll try.

> For the detection part, the core is the state machine. But since it
> works on the gimple-SSA representation, it's more complex.
> Using the graphs and the state machine, we detect the problem.
> Since I know little about gimple and SSA, I wonder whether I should
> first read chapters 12 ("GIMPLE") and 13 ("Analysis and Optimization
> of GIMPLE tuples") of the internals manual, so that I can understand
> the approach of the analyzer well and maybe produce some new ideas?

Yes: it's worth reading through the GCC internals manual - in
particular the sections about gimple-SSA.

You should definitely also read "7 Testsuites": I tend to follow
Test-Driven Development with the analyzer: I create some simple test
files that demonstrate bugs that the analyzer ought to detect, and turn
them into DejaGnu tests.
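
As a rough sketch of what such a test file can look like (the function
names are made up for illustration, and the dg-warning regex assumes
the analyzer's double-free message reads "double-'free' of 'ptr'",
which may differ between GCC versions):

```c
/* Hypothetical analyzer test file, in the style of a DejaGnu test.
   The dg-* comments are directives read by the test harness; to the
   C compiler they are ordinary comments.  */
/* { dg-do compile } */
/* { dg-options "-fanalyzer" } */

#include <stdlib.h>

/* Deliberately buggy: 'ptr' is freed twice, so the analyzer ought to
   warn at the second call.  The message text in the directive below is
   an assumption, not a guaranteed wording.  */
void test_double_free (void *ptr)
{
  free (ptr);
  free (ptr); /* { dg-warning "double-'free' of 'ptr'" } */
}

/* Correct code: allocate, check, free once.  No directive here, so no
   warning is expected on these lines.  */
int test_alloc_and_free (size_t n)
{
  void *p = malloc (n);
  if (!p)
    return -1;
  free (p);
  return 0;
}
```

The idea is that each dg-warning line has to be matched by a warning on
that line, and any warning without a matching directive counts as a
failure, so the same file checks both the bug that should be reported
and the correct code that should stay silent.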

> And the reason why we build the supergraph is that the gimple-SSA
> representation gives us the callgraph and CFGs of the program.
> And there is a problem derived from the graph: since we have to
> present the graph to the user (is there any other reason?), we have
> to simplify the graph. So how to prune and merge is an important
> part.

There are several "graphs":
- the supergraph combines the CFGs and the callgraph
- the exploded_graph combines the supergraph with state (and call
  strings)
- the graph within a region_model at a node within the exploded_graph,
  expressing pointers to regions.

Pruning within region_model instances, and merging them is indeed very
important, to avoid the analysis "exploding".  I've been working on a
rewrite of how region_models are pruned, which I hope to land in
"master" in the next week.

> The last is how to represent the problem to the users. How to find
> the simplest feasible way is important too.

Yes.  diagnostic-manager.cc generates events, and then tries to purge
them (a kind of peephole optimization, but optimizing for readability
to the end-user, rather than optimizing code).

> For the GSoC, I would like to extend the analyzer for C++ and
> optimize the pruning and merging algorithm. I'm also interested in
> optimizing the region model, but I'm not sure if time permits. This
> is just my first thought, and I'm eager for your suggestions, such as
> on the time limitation and the urgent needs of the project.

All of these ideas sound very worthwhile.

Perhaps it's worth focusing on C++ support.  For example:
- extending the malloc/free checking to cover new/delete and
  new[]/delete[] (and mismatches between them)
- exception-handling
- properly handling constructors and destructors
- virtual functions
- ...etc

The above could easily take the whole summer, and would be
self-contained.

> It's still a bit difficult for me to read the code after reading the
> internal documentation.
> Is there any way that I can understand it more easily?

FWIW I wrote some notes on GCC for newcomers here:
https://dmalcolm.fedorapeople.org/gcc/newbies-guide/index.html

I would recommend building the code from source, and trying to step
through the analyzer.

Use "--disable-bootstrap" when building from source: it's much quicker.
Build it, and change to the "gcc" subdirectory (containing the built
"cc1" and "xgcc" binaries).  I wrote some instructions on debugging
here:
https://dmalcolm.fedorapeople.org/gcc/newbies-guide/debugging.html

The entrypoint to the analyzer is ana::run_checkers, so try putting a
breakpoint there and stepping through it.

Also, -fdump-analyzer and -fdump-analyzer-stderr may prove useful for
seeing what's going on.

> And when I try to test the analyzer, I get a 404 when I try to open
> https://gcc.gnu.org/git/gcc even without the firewall. How can I
> solve this problem?

I don't know where you got that URL; I get a 404 for it as well.  I
hope you're able to access the links I gave above.  Some of it is on
github, if that's accessible from China.

Try the instructions here for obtaining a git clone of the latest
sources:
https://gcc.gnu.org/git.html

> Thank you very much for your patience! I'm looking forward to your
> reply.
>
> Best wishes!
>
> Beenle Yi

I hope the above was helpful
David

> >
> > From: David Malcolm
> > Date: 2020-02-28 10:01
> > To: yilu@bupt.edu.cn; gcc
> > Subject: Re: GSoC: some questions about the static analyzer pass
> > On Fri, 2020-02-28 at 09:31 +0800, yilu@bupt.edu.cn wrote:
> > > Hello everybody! I've learned about the projects in Google
> > > Summer of Code, and I'm very interested in the static analyzer
> > > pass. I wonder where I can read the code of the static analyzer
> > > pass and how to contact the mentor, David Malcolm. Can anyone
> > > help me, please? Thank you so much.
> >
> > Hi!
> >
> > The code is in the gcc/analyzer subdirectory of the GCC source
> > tree:
> > https://gcc.gnu.org/git/?p=gcc.git;a=tree;f=gcc/analyzer
> >
> > Internal documentation is here:
> > https://gcc.gnu.org/onlinedocs/gccint/Static-Analyzer.html
> >
> > Some other notes can be seen at:
> > https://gcc.gnu.org/wiki/DavidMalcolm/StaticAnalyzer
> >
> > Hope this is helpful
> > David