Re: daily report on extending static analyzer project [GSoC]

public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed

From: David Malcolm <dmalcolm@redhat.com>
To: Ankur Saini <arsenic.secondary@gmail.com>
Cc: gcc@gcc.gnu.org
Subject: Re: daily report on extending static analyzer project [GSoC]
Date: Thu, 29 Jul 2021 20:05:29 -0400	[thread overview]
Message-ID: <f57d9f72b689a5693dfe918d852150dc3040441d.camel@redhat.com> (raw)
In-Reply-To: <E56E7DDB-8EB2-4AA3-B6AB-2C9A491BA4AD@gmail.com>

On Thu, 2021-07-29 at 18:20 +0530, Ankur Saini wrote:
> I have attached the patches(one is the updated version of previous
> patch to 
> detect calls via function pointers) of the changed done to make the
> analyzer 
> understand the calls to virtual functions for initial review. 
> 
> 1. I decided to make a dedicated function to create enodes and eedges
> for the 
> dynamically discovered calls as I found myself using the exact same
> peice of 
> code again to analyse vfunc calls.

Makes sense.

> 
> 2. Boostaraping and testing of these changes are underway.
> 
> 3. Regarding the regress tests that have to be added to test
> functionality of 
> vfunc extension patch :
> Should I add many test files for different types of inheritences or
> should I 
> add one ( or two ) test files, with a lot of fucntions in them testing
> different 
> types of calls ?

Both approaches have merit, and there's an element of personal taste.

I find that during development and debugging it's handy to have the
tests broken out into individual files, but it's good to eventually
combine the tests to minimize the number of invocations that the test
harness has to do.

That said, interprocedural tests tend to be fiddly, so it's often good
to keep these in separate files.

I tend to combine my tests and add them to git, and then to temporarily
trim them down when debugging them to minimize the amoung of unrelated
stuff I'm having to look at when debugging, knowing that git has the
full version saved.

I hope that answers your question.

> 
> ---
> Patches :

This isn't a full review, but...

fn_ptr.patch:

> diff --git a/gcc/testsuite/gcc.dg/analyzer/function-ptr-4.c b/gcc/testsuite/gcc.dg/analyzer/function-ptr-4.c
> new file mode 100644
> index 00000000000..c62510c026f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/function-ptr-4.c
> @@ -0,0 +1,25 @@
> +/*Test to see if the analyzer detect and analyze calls via 
> +  fucntion pointers or not. */
> +
> +#include <stdio.h>
> +#include <stdlib.h>
> +
> +void fun(int *int_ptr)
> +{
> +	free(int_ptr); /* { dg-warning "double-'free' of 'int_ptr'" } */
> +}
> +
> +void single_call()
> +{
> +	int *int_ptr = (int*)malloc(sizeof(int));
> +	void (*fun_ptr)(int *) = &fun;
> +	(*fun_ptr)(int_ptr);
> +}
> +
> +void double_call()
> +{
> +	int *int_ptr = (int*)malloc(sizeof(int));
> +	void (*fun_ptr)(int *) = &fun;
> +	(*fun_ptr)(int_ptr);
> +	(*fun_ptr)(int_ptr);
> +}

...thinking back to our discussion about events, it would be good to
verify that the analyzer is emitting them.  You can put directives
like:

   /* { dg-message "calling 'fun' from 'double_call'" } */

on the appropriate lines to test this via DejaGnu.

"analyzer: detect and analyzer vfunc calls"

[...snip...]

> @@ -1242,6 +1243,17 @@ exploded_node::on_stmt (exploded_graph &eg,
>  	unknown_side_effects = false;
>      }
>  
> +  /* If the statmement is a polymorphic call then assume 
> +     there are no side effects.  */
> +  gimple *call_stmt = const_cast<gimple *>(stmt);
> +  if (gcall *call = dyn_cast<gcall *> (call_stmt))
> +  {
> +    function *fun = this->get_function();
> +    cgraph_edge *e = cgraph_node::get (fun->decl)->get_edge (call);
> +    if ((e && e->indirect_info) && (e->indirect_info->polymorphic))
> +        unknown_side_effects = false;
> +  }
> +

This seems wrong; surely it depends on what the call is - or am I
missing something?  Is the issue that we're speculating lots of
possibilities as dynamic calls?  If so, would it be better to terminate
the remaining analysis path (if that makes sense), and assume that any
further analysis happens on extra edges added for the speculated calls?

FWIW I've been experimenting with adding "bifurcation" support so that
you can do:
  program_state *other = ctxt->bifurcate ();
and have it split the analysis into states (e.g. for handling realloc,
so that we can split into 3 states: "succeeded", "succeeded but moved",
"failed").  Unfortunately my code for this is a mess (it's a hacked up
prototype).  Should I try to post what I have for this?


[...snip...]


> @@ -3327,9 +3338,44 @@ exploded_graph::process_node (exploded_node *node)
>                  region_model *model = state.m_region_model;
>  
>                  /* Call is possibly happening via a function pointer.  */
> -                if (tree fn_decl = model->get_fndecl_for_call(call,&ctxt))
> -                  create_dynamic_call (call, fn_decl, node, next_state,
> -                                       next_point, &uncertainty);
> +                if (tree fn_decl = model->get_fndecl_for_call (call,&ctxt))
> +                  create_dynamic_call (call,
> +                  		       fn_decl,
> +                  		       node,
> +                  		       next_state,
> +                                       next_point,
> +                                       &uncertainty);
> +                else
> +                  {
> +                    /* Call is possibly a polymorphic call.
> +
> +                       In such case, use devirtisation tools to find 
> +                       possible callees of this function call and let the 
> +                       analyzer specluate them all.  */
> +		    function *fun = node->get_function ();
> +                    gcall *stmt  = const_cast<gcall *>(call);
> +                    cgraph_edge *e
> +                      = cgraph_node::get (fun->decl)->get_edge (stmt);
> +                    if (e->indirect_info->polymorphic)
> +                      {
> +			void *cache_token;
> +			bool final;
> +			vec <cgraph_node *> targets
> +			  = possible_polymorphic_call_targets (e,
> +							       &final,
> +							       &cache_token,
> +							       true);
> +	       		for (cgraph_node *x : targets)
> +	       		  {
> +	       		    create_dynamic_call (stmt,
> +						 x->decl,
> +						 node,
> +						 next_state,
> +						 next_point,
> +						 &uncertainty);
> +	       		  }
> +                      }
> +                  }

Interesting.

Caveat: I'm not familiar with the devirt code.

If we're speculating that a particular call happens, do we gain
information about what kind of object we're dealing with?

If we have a repeated call to the same vfunc, do we know that we're
calling the same function?

[...snip...]

I'm interested in seeing what test cases you have.


Hope this is helpful
Dave

next prev parent reply	other threads:[~2021-07-30  0:05 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-24 14:29 Ankur Saini
2021-06-24 20:53 ` David Malcolm
2021-06-25 15:03   ` Ankur Saini
2021-06-25 15:34     ` David Malcolm
2021-06-26 15:20       ` Ankur Saini
2021-06-27 18:48         ` David Malcolm
2021-06-28 14:53           ` Ankur Saini
2021-06-28 23:39             ` David Malcolm
2021-06-29 16:34               ` Ankur Saini
2021-06-29 19:53                 ` David Malcolm
     [not found]                   ` <AD7A4C2F-1451-4317-BE53-99DE9E9853AE@gmail.com>
2021-06-30 17:17                     ` David Malcolm
2021-07-02 14:18                       ` Ankur Saini
2021-07-03 14:37                         ` Ankur Saini
2021-07-05 16:15                           ` Ankur Saini
2021-07-06 23:11                             ` David Malcolm
2021-07-06 22:46                           ` David Malcolm
2021-07-06 22:50                             ` David Malcolm
2021-07-07 13:52                             ` Ankur Saini
2021-07-07 14:37                               ` David Malcolm
2021-07-10 15:57                                 ` Ankur Saini
2021-07-11 17:01                                   ` Ankur Saini
2021-07-11 18:01                                     ` David Malcolm
2021-07-11 17:49                                   ` David Malcolm
2021-07-12 16:37                                     ` Ankur Saini
2021-07-14 17:11                                       ` Ankur Saini
2021-07-14 23:23                                         ` David Malcolm
2021-07-16 15:34                                           ` Ankur Saini
2021-07-16 21:27                                             ` David Malcolm
2021-07-21 16:14                                               ` Ankur Saini
2021-07-22 17:10                                                 ` Ankur Saini
2021-07-22 23:21                                                   ` David Malcolm
2021-07-24 16:35                                                   ` Ankur Saini
2021-07-27 15:05                                                     ` Ankur Saini
2021-07-28 15:49                                                       ` Ankur Saini
2021-07-29 12:50                                                         ` Ankur Saini
2021-07-30  0:05                                                           ` David Malcolm [this message]
     [not found]                                                             ` <ACE21DBF-8163-4F28-B755-6B05FDA27A0E@gmail.com>
2021-07-30 14:48                                                               ` David Malcolm
2021-08-03 16:12                                                                 ` Ankur Saini
2021-08-04 16:02                                                                   ` Ankur Saini
2021-08-04 23:26                                                                     ` David Malcolm
2021-08-05 14:57                                                                       ` Ankur Saini
2021-08-05 23:09                                                                         ` David Malcolm
2021-08-06 15:41                                                                           ` Ankur Saini
2021-07-22 23:07                                                 ` David Malcolm
2021-07-14 23:07                                       ` David Malcolm

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f57d9f72b689a5693dfe918d852150dc3040441d.camel@redhat.com \
    --to=dmalcolm@redhat.com \
    --cc=arsenic.secondary@gmail.com \
    --cc=gcc@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).