From: Ankur Saini <arsenic.secondary@gmail.com>
To: David Malcolm <dmalcolm@redhat.com>
Cc: gcc@gcc.gnu.org
Subject: Re: daily report on extending static analyzer project [GSoC]
Date: Mon, 5 Jul 2021 21:45:39 +0530
Message-ID: <B72BAAE6-D12C-4D60-9C98-2D80996547AB@gmail.com>
In-Reply-To: <E0BE2F98-6EF1-4197-8B8A-67C50B3219CB@gmail.com>

I forgot to send the daily report yesterday, so this one covers the work done on both days.

AIM : 

- make the analyzer call functions using the updated call-string representation ( even the ones that don't have a superedge )
- make the analyzer figure out the point of return from a function called without a superedge
- make the analyzer figure out the correct point to return to in the caller function
- create the enode and eedge representing the return from the call
- test the changes on the example I created before
- investigate what GCC generates for a vfunc call and discuss how we can use it to our advantage

—

PROGRESS ( changes can be seen on the "refs/users/arsenic/heads/analyzer_extension" branch of the repository ) :

- Thanks to the new call-string representation, I was able to push calls that don't have a superedge onto the call stack, and was able to see the calls happening via the function pointer.

- To detect the point of return from such a function, I used the fact that the supernode would contain an EXIT bb, would not have any return superedge, and would still have a pending call stack.

- The next part was to find the destination node of the return. For this I again made use of the new call string and created a custom accessor to get the caller and callee supernodes of the returning call, then extracted the gcall* from the caller supernode to update the program state.

- Now that I had the next state and the next point, it was time to put the final piece of the puzzle together and create the exploded node and edge representing the return.

- I tested the changes on the following program, where the analyzer was earlier giving a false positive ( a spurious leak report ) because it did not detect the call via a function pointer:

```
#include <stdio.h>
#include <stdlib.h>

void fun(int *int_ptr)
{
    free(int_ptr);
}

int test()
{
    int *int_ptr = (int*)malloc(sizeof(int));
    void (*fun_ptr)(int *) = &fun;
    (*fun_ptr)(int_ptr);

    return 0;
}

void test_2()
{
    test();
}
```
( compiler explorer link : https://godbolt.org/z/9KfenGET9 )

The results were a success: the analyzer was now able to detect, call and return from the function called via the function pointer, and no longer reported the memory leak it was reporting before. : )
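
The return handling described above (detect the return point, then recover the destination from the call string) can be sketched as a small stand-alone model. Everything in it is hypothetical: `Supernode`, `CallString`, `is_return_without_superedge` and `return_to_caller` are illustrative stand-ins and not the analyzer's real classes or accessors, and the sketch leaves out the program-state update done with the gcall*.

```cpp
#include <vector>

// Hypothetical stand-in for a supergraph node; not the real GCC class.
struct Supernode {
    int id;
    bool is_exit_bb;            // node holds the function's EXIT basic block
    bool has_return_superedge;  // supergraph already has a return edge out of it
};

// One entry of the (toy) call string: who called, and what was called.
struct CallStringElement {
    const Supernode *caller;  // node to resume at once the callee returns
    const Supernode *callee;  // entry node of the called function
};

struct CallString {
    std::vector<CallStringElement> elements;

    bool empty() const { return elements.empty(); }

    void push_call(const Supernode *caller, const Supernode *callee) {
        elements.push_back({caller, callee});
    }

    // Assumed accessor: caller side of the innermost pending call.
    const Supernode *get_caller_node() const { return elements.back().caller; }

    void pop() { elements.pop_back(); }
};

// The three conditions from the report: EXIT bb, no return superedge,
// and a call that is still pending on the call string.
bool is_return_without_superedge(const Supernode &node, const CallString &cs) {
    return node.is_exit_bb && !node.has_return_superedge && !cs.empty();
}

// Pop the innermost call and report where exploration should resume.
const Supernode *return_to_caller(CallString &cs) {
    const Supernode *dest = cs.get_caller_node();
    cs.pop();
    return dest;
}
```

In this model the call string is the only thing linking the callee's exit back to the caller, which is the reason the superedge is not needed.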

- I think I should point out that in the process I created a lot of custom functions to access/alter data that was not accessible before.

- Now that calls via function pointers are taken care of, it was time to see what exactly GCC generates when a function is dispatched dynamically. As planned earlier, I went to ipa-devirt.c ( GCC's devirtualizer implementation ) to investigate.

- Although I didn't understand everything that was happening there, here are some of the findings I thought might be interesting for the project :-
	> a polymorphic call is made through an OBJ_TYPE_REF, which carries otr_type ( the type of the class whose method is called ) and otr_token ( the index into the virtual table from which the address is taken )
	> the devirtualizer builds a type inheritance graph to keep track of the entire inheritance hierarchy
	> the most interesting function I found was "possible_polymorphic_call_targets()", which returns a vector of all possible targets of the polymorphic call represented by a call edge or a gcall.
	> as I understand it, the devirtualizer searches these polymorphic calls, picks out the ones that are more likely to be called ( known as likely calls ), and turns them into speculative calls, which are later turned into direct calls.
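
To make the idea of enumerating targets from otr_type and otr_token concrete, here is a toy model of a type inheritance graph. It is only a sketch under stated assumptions: `ClassNode`, `collect_targets` and `toy_possible_targets` are made-up names, methods are represented by plain strings, and none of this is the actual code or API in ipa-devirt.c.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical node of a toy type inheritance graph.
struct ClassNode {
    std::string name;
    std::vector<std::string> vtable;         // slot i names the implementation used by this class
    std::vector<const ClassNode *> derived;  // direct subclasses
};

// Walk the class and everything derived from it, collecting the
// implementation found in vtable slot `token` of each class.
void collect_targets(const ClassNode &type, std::size_t token,
                     std::vector<std::string> &out) {
    if (token < type.vtable.size()) {
        const std::string &impl = type.vtable[token];
        // Skip duplicates: a subclass may inherit the slot unchanged.
        if (std::find(out.begin(), out.end(), impl) == out.end())
            out.push_back(impl);
    }
    for (const ClassNode *d : type.derived)
        collect_targets(*d, token, out);
}

// Toy counterpart of "all possible targets of a polymorphic call":
// otr_type is the static class type, otr_token the vtable index.
std::vector<std::string> toy_possible_targets(const ClassNode &otr_type,
                                              std::size_t otr_token) {
    std::vector<std::string> targets;
    collect_targets(otr_type, otr_token, targets);
    return targets;
}
```

With a base class A whose slot 0 is "A::foo" and a subclass B whose slot 0 is "B::foo", a call through an A* yields both names, while a call through a B* yields only "B::foo".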

- Another thing I was curious about was how the analyzer would behave when it encounters a polymorphic call, now that we are splitting the supernodes at every call.

The results were interesting: I could see the analyzer splitting supernodes for the calls right away, but this time they were not connected via an intraprocedural edge, which made the analyzer crash at the call site ( I will look into it more tomorrow ).

The example I used was:
```
struct A
{
    virtual int foo (void) 
    {
        return 42;
    }
};

struct B: public A
{
    int foo (void)
    {
        return 0;
    }
};

int test()
{
    struct B b, *bptr=&b;
    bptr->foo();
    return bptr->foo();
}
```
( compiler explorer link : https://godbolt.org/z/d986ab7MY )
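
As a rough mental model (an assumption for illustration, not a description of what GCC literally emits), a virtual call such as bptr->foo() behaves like an indirect call through a per-class table of function pointers, which is why it resembles the function-pointer case from earlier in this report. The names `VTable`, `ObjA`, `A_foo`, `B_foo` and `call_foo` below are invented for the sketch.

```cpp
// Hand-written emulation of virtual dispatch for the A/B example.
struct ObjA;
using FooFn = int (*)(ObjA *);

// Toy virtual table with a single slot (index 0) for foo.
struct VTable {
    FooFn foo;
};

// Every object starts with a pointer to its class's table.
struct ObjA {
    const VTable *vptr;
};

int A_foo(ObjA *) { return 42; }  // stands in for A::foo
int B_foo(ObjA *) { return 0; }   // stands in for B::foo

const VTable vtable_A{&A_foo};
const VTable vtable_B{&B_foo};

// The "virtual call": load the table, pick the slot, call indirectly.
int call_foo(ObjA *obj) {
    return obj->vptr->foo(obj);
}
```

Seen this way, resolving the callee of a virtual call is again a question of finding out which function a pointer refers to at the call site.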

—

STATUS AT THE END OF THE DAY :- 

- make the analyzer call functions using the updated call-string representation ( even the ones that don't have a superedge ) (done)
- make the analyzer figure out the point of return from a function called without a superedge (done)
- make the analyzer figure out the correct point to return to in the caller function (done)
- create the enode and eedge representing the return from the call (done)
- test the changes on the example I created before (done)
- investigate what GCC generates for a vfunc call and discuss how we can use it to our advantage (done)


Thank you
- Ankur
