public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
From: David Malcolm <dmalcolm@redhat.com>
To: Shengyu Huang <kumom.huang@gmail.com>
Cc: GCC Development <gcc@gcc.gnu.org>
Subject: Re: [GSoC][Static Analyzer] First proposal draft and a few more questions/requests
Date: Sun, 02 Apr 2023 18:53:59 -0400	[thread overview]
Message-ID: <8c9a526ae6ce6868d21da887ee5cecd5c5fa28ac.camel@redhat.com> (raw)
In-Reply-To: <7B619AA7-C42D-4486-8017-72BB54297005@gmail.com>

On Sat, 2023-04-01 at 16:19 +0200, Shengyu Huang wrote:
> Hi Dave,
> 
> > > 
> > > I has looked into compiling those files with the patch some time
> > > ago;
> > > looking at my notes, one issue was with this on-stack buffer:
> > >    char extra[1024];
> > > declared outside the loop.  Inside the loop, it gets modified in
> > > various ways:
> > >    extra[0] = '\0';
> > > and
> > >    if (fread(extra, 1, extsize, fpZip) == extsize) {
> > > where the latter means "extra" becomes tainted.
> > > 
> > > However "extra" is barely used, and is effectively reset each
> > > time
> > > through the loop - but the analyzer doesn't figure that out.  So
> > > the
> > > loop analysis explodes, as it tries to keep track of the
> > > possibility
> > > that "extra" is still tainted from previous iteration(s), despite
> > > the
> > > fact that it's going to be clobbered before it ever gets used.
> > > 
> > > So one fix might be to extend the state-purging code so that it
> > > somehow
> > > "sees" that "extra" gets clobbered before it gets used, and thus
> > > we can
> > > purge the tainted state from it.
> > 
> > Thanks for your notes. I think we may be talking about the same
> > thing? If you look at the updated proposal (I have changed it quite
> > a lot since I first sent it out), you’ll see there is one relevant
> > paper for state merging (although it is slightly different from
> > state purging, I think the goal and general methodology is
> > similar): https://dslab.epfl.ch/pubs/stateMerging.pdf 
> > 
> > I was trying to say if some similar situation happened for other
> > types of checkers, I expected state explosion would also happen. I
> > tried to construct a similar example (with the same kind of reset
> > and nested conditionals + a loop) but for double-free, so far no
> > success yet. I’ll pick it up afterwards, at latest by next
> > Saturday, because I need to prepare for a coming midterm on Friday.
> > I will also put this test case to the proposal because it seems
> > like a very good starting point for the project.
> 
> As promised, below is a small example that causes state explosion
> without taint state machine involved.
> 
> void test()
> {
>   void *p;
>   int a;
>   scanf(“%d", &a);
>   
>   while (a > 0)
>   {
>      p = malloc (1024);
>      if (a > 1)
>        free(p);
>      a--;
>    }
>    
>    if (a >0)
>      free(p);
> }
> 
> This example not only causes state explosion, but also reports false
> positive of double-free.

(nods)

Yeah, our handling of loops isn't great.  There's plenty of opportunity
within a GSoC project for tackling that.

> 
> By the way, do you have any feedback regarding my proposal
> (https://docs.google.com/document/d/1MRI1R5DaX8kM6DaqRQsEri5Mx2FvHmWv
> 13qe1W0Bj0g <
> https://docs.google.com/document/d/1MRI1R5DaX8kM6DaqRQsEri5Mx2FvHmWv1
> 3qe1W0Bj0g/edit>)? I am happy to allocate more time polishing the
> proposal if you find anything off there. If you prefer me sending it
> via email again (for ease of reference in the future maybe?), I am
> happy to do so as well.

Thanks for the proposal. 

Overall, it looks great.  Some notes:
- maybe specify the *GCC* static analyzer you first mention it
- you talk about "timeout" warnings.  The analyzer already can emit a
"timeout" warning of sorts, via -Wanalyzer-too-complex, though this is
based on the complexity of the exploded graph (e.g. # of nodes), rather
than actual timings.  Is the latter the kind of thing you had in mind,
or where you thinking about ways of making the "too complex" heuristics
smarter?  (I confess that you seem much more familiar with the theory
of this than I am!)
- the numbering of your references seems to have gotten out-of-sync; I
see references to [3] as a paper "Schwartz et al", but that's a link to
one of my blog posts.
- do you a link to a github account, or somewhere else that
demonstrates code you've written?  In particular, how is your C++ ?

Note that the deadline for submitting proposals to the official GSoC
website is April 4 - 18:00 UTC (i.e. this coming Tuesday) and that
Google are very strict about that deadline; see:
https://developers.google.com/open-source/gsoc/timeline

Good luck
Dave


  reply	other threads:[~2023-04-02 22:54 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-21 21:26 [GSoC][Static Analyzer] Some questions and request for a small patch to work on Shengyu Huang
2023-02-21 22:55 ` David Malcolm
2023-02-22 14:03   ` Jonathan Wakely
2023-02-23 11:17     ` James K. Lowden
2023-02-23 16:30       ` Jonathan Wakely
2023-02-22 14:11   ` Shengyu Huang
2023-02-22 14:53     ` Iain Sandoe
2023-02-22 15:13       ` Iain Sandoe
2023-02-27 13:35       ` Shengyu Huang
2023-02-27 13:45         ` Iain Sandoe
2023-02-27 13:51           ` Shengyu Huang
2023-02-27 13:49         ` Iain Sandoe
2023-02-27 14:01           ` Floyd, Paul
2023-02-22 15:43     ` David Malcolm
2023-02-28  9:18       ` Shengyu Huang
2023-02-28 23:59         ` David Malcolm
2023-03-01 11:16           ` Shengyu Huang
2023-03-01 13:48             ` David Malcolm
2023-02-28 14:46     ` [GSoC][Static Analyzer] Ideas for proposal Shengyu Huang
2023-03-01  0:22       ` David Malcolm
2023-03-12 22:20         ` Shengyu Huang
2023-03-13 15:51           ` David Malcolm
2023-03-20 17:28             ` [GSoC][Static Analyzer] First proposal draft and a few more questions/requests Shengyu Huang
2023-03-20 22:50               ` David Malcolm
2023-03-26 16:03                 ` Shengyu Huang
2023-03-26 17:14                   ` David Malcolm
2023-03-26 21:46                     ` Shengyu Huang
2023-04-01 14:19                       ` Shengyu Huang
2023-04-02 22:53                         ` David Malcolm [this message]
2023-04-03  0:02                           ` Shengyu Huang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8c9a526ae6ce6868d21da887ee5cecd5c5fa28ac.camel@redhat.com \
    --to=dmalcolm@redhat.com \
    --cc=gcc@gcc.gnu.org \
    --cc=kumom.huang@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).