* Re: Re: GSoC: some questions about the static analyzer pass
From: David Malcolm @ 2020-03-02 1:59 UTC
To: yilu, gcc
On Sun, 2020-03-01 at 23:48 +0800, yilu@bupt.edu.cn wrote:
> Hello, David!
>
> I'm so excited to receive a reply from you directly. My name is
> Beenle, and I'm an undergraduate student in Beijing, China.
Hello Beenle.
> I've read the analyzer internal documentation, but I'm not sure I
> understood it well. The following are my thoughts about the analyzer;
> I’m not sure they are right or feasible. Would it be possible for you
> to give me some advice?
I'll try.
> For the detection part, the core is the state machine. But because it
> works on the gimple-SSA representation, it’s more complex. Problems
> are detected from the graphs and the state machine. Since I know
> little about GIMPLE and SSA, should I first read chapter 12 "GIMPLE"
> and chapter 13 "Analysis and Optimization of GIMPLE tuples" of the
> internals manual, so that I can understand the analyzer's approach
> well and maybe come up with some new ideas?
Yes: it's worth reading through the GCC internals manual - in
particular the sections about gimple-SSA.
You should definitely also read "7 Testsuites". I tend to follow
Test-Driven Development with the analyzer: I create some simple test
files that demonstrate bugs that the analyzer ought to detect, and turn
them into DejaGnu tests.
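As an illustration of the kind of test file described above, here is a
minimal sketch. It assumes the dg-warning comment syntax used by the
DejaGnu-based GCC testsuite. The function names are made up for this
example, and the message text inside dg-warning is an assumption that
would need checking against what the analyzer actually prints. The code
is written so that it is valid as both C and C++.

```cpp
#include <stdlib.h>

/* Hypothetical test case: the analyzer ought to complain about the
   second free.  The text inside dg-warning is an assumed regexp that
   the testsuite would match against the emitted warning.  This
   function is only meant to be compiled, never called.  */
void test_double_free (void *ptr)
{
  free (ptr);
  free (ptr); /* { dg-warning "double-'free' of 'ptr'" } */
}

/* Hypothetical counterpart that should compile without any warning.
   Returns 1 if the allocation succeeded, 0 otherwise.  */
int test_single_free (void)
{
  int *p = (int *) malloc (sizeof (int));
  if (!p)
    return 0;
  *p = 42;
  free (p);
  return 1;
}
```

The idea is that each line expected to trigger a diagnostic carries its
own directive, so a missing warning or an unexpected extra one shows up
as a test failure.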
> The reason we build the supergraph is that the gimple-SSA
> representation gives us the callgraph and CFGs of the program.
> There is a problem that arises from the graph: since we have to
> present the graph to the user (is there any other reason?), we have
> to simplify it. So how to prune and merge is an important
> part.
There are several "graphs":
- the supergraph combines the CFGs and the callgraph
- the exploded_graph combines the supergraph with state (and call
  strings)
- the graph within a region_model at a node within the exploded_graph,
  expressing pointers to regions.
Pruning within region_model instances, and merging them, is indeed very
important, to avoid the analysis "exploding".
I've been working on a rewrite of how region_models are pruned, which I
hope to land in "master" in the next week.
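To make the "exploding" concern concrete, here is a toy sketch of
pairing program points with state. It is not the analyzer's
implementation: the names PtrState, Edge, ExplodedNode, explore and
diamond_cfg are invented for illustration, the state is a single enum,
and call strings and region_models are left out entirely.

```cpp
#include <queue>
#include <set>
#include <utility>
#include <vector>

// Toy state for a single pointer (hypothetical, not GCC's state machine).
enum class PtrState { Start, Freed };

// A supergraph-like edge: destination node, and whether it frees the pointer.
struct Edge { int dest; bool frees_ptr; };

// An exploded node pairs a program point with the state at that point.
using ExplodedNode = std::pair<int, PtrState>;

// Worklist exploration: a (point, state) pair is only enqueued the first
// time it is seen, so identical states at the same point are shared.
std::set<ExplodedNode> explore (const std::vector<std::vector<Edge>> &succs,
                                int entry)
{
  std::set<ExplodedNode> seen;
  std::queue<ExplodedNode> worklist;
  ExplodedNode start (entry, PtrState::Start);
  seen.insert (start);
  worklist.push (start);
  while (!worklist.empty ())
    {
      ExplodedNode cur = worklist.front ();
      worklist.pop ();
      for (const Edge &e : succs[cur.first])
        {
          PtrState next_state = e.frees_ptr ? PtrState::Freed : cur.second;
          ExplodedNode next (e.dest, next_state);
          if (seen.insert (next).second)
            worklist.push (next);
        }
    }
  return seen;
}

// Diamond-shaped CFG: 0 -> 1 (frees the pointer), 0 -> 2, 1 -> 3, 2 -> 3.
std::vector<std::vector<Edge>> diamond_cfg ()
{
  std::vector<std::vector<Edge>> succs (4);
  succs[0] = { Edge{1, true}, Edge{2, false} };
  succs[1] = { Edge{3, false} };
  succs[2] = { Edge{3, false} };
  return succs;
}
```

On the diamond built by diamond_cfg, node 3 is reached in two different
states, so explore returns five exploded nodes for four supergraph
nodes. With richer state, the number of distinct states per point is
what grows, which is why pruning and merging matter.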
> The last part is how to present the problem to users. Finding the
> simplest feasible way is important too.
Yes. diagnostic-manager.cc generates events, and then tries to purge
them (a kind of peephole optimization, but optimizing for readability
to the end-user, rather than optimizing code).
> For GSoC, I would like to extend the analyzer for C++ and
> optimize the pruning and merging algorithm. I’m also interested in
> optimizing the region model, but I'm not sure whether time permits.
> These are just my first thoughts, and I’m eager for your suggestions,
> for example about the time constraints and the project's most urgent
> needs.
All of these ideas sound very worthwhile.
Perhaps it's worth focusing on C++ support. For example:
- extending the malloc/free checking to cover new/delete and
  new[]/delete[] (and mismatches between them)
- exception-handling
- properly handling constructors and destructors
- virtual functions
- ...etc
The above could easily take the whole summer, and would be self-
contained.
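To illustrate the first item in the list above, here is a small sketch
of inputs that a new/delete checker might be tested against. The
function names are hypothetical and nothing here comes from the GCC
sources. The two mismatching functions have undefined behavior if
called, so they are only meant to be compiled and analyzed.

```cpp
#include <cstdlib>

// Matched pairs: nothing to report.  Returns the sum of the values.
int matched_pairs ()
{
  int *one = new int (1);
  int *many = new int[3] {1, 2, 3};
  int sum = *one + many[0] + many[1] + many[2];
  delete one;     // scalar new, scalar delete
  delete[] many;  // array new, array delete
  return sum;
}

// Mismatch: array new released with scalar delete.
void array_new_scalar_delete ()
{
  int *many = new int[3];
  delete many; // a checker might warn here
}

// Mismatch: memory from malloc released with delete.
void malloc_then_delete ()
{
  int *p = static_cast<int *> (std::malloc (sizeof (int)));
  delete p; // a checker might warn here
}
```

One way to think about this is that the allocation state would need to
record which family of allocator produced the pointer, so that the
deallocation can be compared against it.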
> It's still a bit difficult for me to read the code, even after
> reading the internal documentation. Is there any way to make it
> easier to understand?
FWIW I wrote some notes on GCC for newcomers here:
https://dmalcolm.fedorapeople.org/gcc/newbies-guide/index.html
I would recommend building the code from source, and trying to step
through the analyzer.
Use "--disable-bootstrap" when building from source: it's much quicker.
Build it, and change to the "gcc" subdirectory (containing the built
"cc1" and "xgcc" binaries).
I wrote some instructions on debugging here:
https://dmalcolm.fedorapeople.org/gcc/newbies-guide/debugging.html
The entrypoint to the analyzer is ana::run_checkers, so try putting a
breakpoint there and stepping through it.
Also, -fdump-analyzer and -fdump-analyzer-stderr may prove useful for
seeing what's going on.
> When I tried to test the analyzer, I got a 404 when opening
> https://gcc.gnu.org/git/gcc even without the firewall. How can I
> solve this problem?
I don't know where you got that URL; I get a 404 for it as well.
I hope you're able to access the links I gave above. Some of it is on
GitHub, if that's accessible from China.
Try the instructions here for obtaining a git clone of the latest
sources:
https://gcc.gnu.org/git.html
> Thank you very much for your patience! I’m looking forward to your
> reply.
>
> Best wishes!
>
> Beenle Yi
I hope the above was helpful
David
> >
> > From: David Malcolm
> > Date: 2020-02-28 10:01
> > To: yilu@bupt.edu.cn; gcc
> > Subject: Re: GSoC: some questions about the static analyzer pass
> > On Fri, 2020-02-28 at 09:31 +0800, yilu@bupt.edu.cn wrote:
> > > Hello everybody! I've learned about the projects in Google Summer
> > > of Code, and I'm very interested in the static analyzer pass. I
> > > wonder where I can read the code of the static analyzer pass and
> > > how to contact the mentor, David Malcolm. Can anyone help me,
> > > please? Thank you so much.
> >
> > Hi!
> >
> > The code is in the gcc/analyzer subdirectory of the GCC source
> > tree:
> > https://gcc.gnu.org/git/?p=gcc.git;a=tree;f=gcc/analyzer
> >
> > Internal documentation is here:
> > https://gcc.gnu.org/onlinedocs/gccint/Static-Analyzer.html
> >
> > Some other notes can be seen at:
> > https://gcc.gnu.org/wiki/DavidMalcolm/StaticAnalyzer
> >
> >
> > Hope this is helpful
> > David
> >
> >
> >