From mboxrd@z Thu Jan  1 00:00:00 1970
From: Ankur Saini <arsenic.secondary@gmail.com>
Message-Id: <06DBCE04-B3AC-4091-979D-430507352213@gmail.com>
Mime-Version: 1.0 (Mac OS X Mail 14.0 \(3654.20.0.2.21\))
Subject: Re: daily report on extending static analyzer project [GSoC]
Date: Sat, 26 Jun 2021 20:50:09 +0530
In-Reply-To: <e3154f0cf3a16644503ea50360e0166d30478a92.camel@redhat.com>
Cc: gcc@gcc.gnu.org
To: David Malcolm <dmalcolm@redhat.com>
References: <E4C64409-9BEE-4FC2-B683-5D293143715E@gmail.com>
 <d5d0d6f895ee822d98f8ac66c39f05dc3b17618b.camel@redhat.com>
 <35A0246A-D4F8-4B41-A009-4A98F78E0395@gmail.com>
 <e3154f0cf3a16644503ea50360e0166d30478a92.camel@redhat.com>
Content-Type: text/plain;
	charset=utf-8
Content-Transfer-Encoding: quoted-printable
List-Id: Gcc mailing list <gcc.gcc.gnu.org>


> On 25-Jun-2021, at 9:04 PM, David Malcolm <dmalcolm@redhat.com> wrote:
>
> On Fri, 2021-06-25 at 20:33 +0530, Ankur Saini wrote:
>> AIM for today :
>>
>> - try to create an intra-procedural link between the calling
>> and returning snodes
>> - figure out the program point where exploded graph would know about
>> the function calls
>> - figure out how the exploded node will know which function to call
>> - create enodes and eedges for the calls
>>
>> ---
>>
>> PROGRESS :
>>
>> - I created an intraprocedural link at the point where the splitting
>> happens, to connect the call and returning snodes, like this:
>>
>> (in supergraph.cc at "supergraph::supergraph (logger *logger)" )
>> ```
>> 185             if (cgraph_edge *edge = supergraph_call_edge (fun, stmt))
>> 186             {
>> 187                m_cgraph_edge_to_caller_prev_node.put(edge, node_for_stmts);
>> 188                node_for_stmts = add_node (fun, bb, as_a <gcall *> (stmt), NULL);
>> 189                m_cgraph_edge_to_caller_next_node.put (edge, node_for_stmts);
>> 190             }
>> 191             else
>> 192             {
>> 193               gcall *call = dyn_cast<gcall *> (stmt);
>> 194               if (call)
>> 195               {
>> 196                 supernode *old_node_for_stmts = node_for_stmts;
>> 197                 node_for_stmts = add_node (fun, bb, as_a <gcall *> (stmt), NULL);
>                                                          ^^^^^^^^^^^^^^^^^^^^^
> Given the dyn_cast of stmt to gcall * at line 193 you can use "call"
> here, without the as_a cast, as you've already got "stmt" as a gcall *
> at line 193.

ok
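
A small sketch of the pattern suggested above, to check my understanding.
The types are hypothetical stand-ins (not GCC's gimple classes), and
dyn_cast_sketch is built on dynamic_cast purely for illustration; the real
dyn_cast<> comes from is-a.h and is implemented differently.

```cpp
// Hypothetical stand-ins for the statement hierarchy (not GCC's types).
struct stmt_sketch { virtual ~stmt_sketch () = default; };
struct call_sketch : stmt_sketch { int id = 0; };
struct assign_sketch : stmt_sketch {};

// Simplified checked downcast, used like dyn_cast<gcall *> (stmt):
// yields a null pointer when the dynamic type does not match.
template <typename T>
T dyn_cast_sketch (stmt_sketch *s)
{
  return dynamic_cast<T> (s);
}

// Returns the call's id, or -1 if S is not a call.
int call_id_or_minus_one (stmt_sketch *s)
{
  // One checked cast; its result is reused instead of casting again.
  if (call_sketch *call = dyn_cast_sketch<call_sketch *> (s))
    return call->id;
  return -1;
}
```

The checked cast happens once and its result ("call") is reused, so no
second as_a<> style cast is needed.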

>
> You might need to add a hash_map recording the mapping from such stmts
> to the edges, like line 189 does.  I'm not sure, but you may need it
> later.

But this node is only created when there is no cgraph_edge corresponding
to the call, so which edge would I map "node_for_stmts" to?

>
>
>> 198
>> 199                 superedge *sedge = new callgraph_superedge (old_node_for_stmts,
>> 200                     node_for_stmts,
>> 201                     SUPEREDGE_INTRAPROCEDURAL_CALL,
>> 202                     NULL);
>> 203                 add_edge (sedge);
>> 204               }
>> 205             }
>> ```
>>
>> - now that we have an intraprocedural link between such calls, the
>> analyzer will consider them an "impossible edge" (whenever
>> "node->on_edge()" returns false) while processing the worklist, and I
>> think this should be the correct place to speculate about the function
>> call by creating exploded nodes and edges representing calls (maybe by
>> adding a custom edge info).
>>
>> - after several failed attempts to do as mentioned above, it looks like
>> I was looking the wrong way all along. I think I just found out what my
>> mentor meant when telling me to look into "calls node->on_edge". During
>> the edge inspection (in program_point::on_edge()), if it's an
>> intraprocedural sedge, maybe I can add an extra intraprocedural sedge
>> to the correct edge right here with the info state of that program
>> point.
>
> I don't think we need a superedge for such a call, just an
> exploded_edge.  (Though perhaps adding a superedge might make things
> easier?  I'm not sure, but I'd first try not bothering to add one)

ok, will scratch this idea for now.

>
>>
>> Q. But even if we find out which function to call, how will the
>> analyzer know which snode that function belongs to?
>
> Use this method of supergraph:
>  supernode *get_node_for_function_entry (function *fun) const;
> to get the supernode for the entrypoint of a given function.
>
> You can get the function * from a fndecl via DECL_STRUCT_FUNCTION.

So once we get the fndecl, it should be comparatively smooth sailing
from there.
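
As I understand it, the lookup chain would be fndecl -> function * ->
entry supernode. Below is a rough sketch with stand-in types; only the
names get_node_for_function_entry and DECL_STRUCT_FUNCTION come from the
mail above, everything else (the *_sketch types, entry_node_for_decl) is
hypothetical. I am also assuming that a fndecl without a body available
has no function attached, so the sketch checks for that.

```cpp
#include <map>
#include <string>

// Hypothetical, heavily simplified stand-ins; none of these are GCC's types.
struct function_sketch { std::string name; };
struct fndecl_sketch { function_sketch *fun; };  // fun is null if no body
struct supernode_sketch { int index; };

// Plays the role of DECL_STRUCT_FUNCTION: fndecl -> function (may be null).
function_sketch *decl_struct_function_sketch (const fndecl_sketch &decl)
{
  return decl.fun;
}

struct supergraph_sketch
{
  std::map<const function_sketch *, supernode_sketch *> entry_nodes;

  // Plays the role of supergraph::get_node_for_function_entry.
  supernode_sketch *
  get_node_for_function_entry (const function_sketch *fun) const
  {
    auto it = entry_nodes.find (fun);
    return it == entry_nodes.end () ? nullptr : it->second;
  }
};

// fndecl -> function -> entry supernode; null if either step fails.
supernode_sketch *
entry_node_for_decl (const supergraph_sketch &sg, const fndecl_sketch &decl)
{
  function_sketch *fun = decl_struct_function_sketch (decl);
  if (!fun)
    return nullptr;  // nothing to jump into, e.g. no body in this TU
  return sg.get_node_for_function_entry (fun);
}
```

The remaining hard part is then only getting the fndecl out of the state
in the first place.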

My attempt to get the value of the function pointer from the state:

- to access the region model of the state, I tried to access
"m_region_model" of that state.
- now I want to access the cluster for a function pointer.
- but looking at the functions accessible on the region_model class, I
couldn't find a fitting one (the closest I could find was
"region_model::get_reachable_svalues()", which gets a set of all the
svalues reachable from that model).

>
>> Q. on line 461 of program-point.cc
>>
>> ```
>> 457             else
>> 458               {
>> 459                 /* Otherwise, we ignore these edges  */
>> 460                 if (logger)
>> 461                   logger->log ("rejecting interprocedural edge");
>> 462                 return false;
>> 463               }
>> ```
>> why are we rejecting an "interprocedural" edge when we are examining
>> an "intraprocedural" edge? Or is it for the "cg_sedge->m_cedge" edge,
>> which is an interprocedural edge?
>
> Currently, those interprocedural edges don't do much.  Above the
> "else" clause, in the lines before the ones you quote, is some support
> for call summaries.
>
> The idea is that we ought to be able to compute summaries of what a
> function call does, and avoid exponential explosions during the
> analysis by reusing summaries at a callsite.  But that code doesn't
> work well at the moment; see:
>  https://gcc.gnu.org/bugzilla/showdependencytree.cgi?id=99390
>
> If you ignore call summaries for now, I think you need to change this
> logic so it detects if we have a function pointer that we "know" the
> value of from the region_model, and have it generate an exploded_node
> and exploded_edge for the call.  Have a look at how SUPEREDGE_CALL is
> handled by program_state and program_point; you should implement
> something similar, I think.  Given that you need the superedge,
> point *and* state all together to detect this case, I think the logic
> you need to add probably needs to be in exploded_node::on_edge as a
> special case before the call there to next_point->on_edge.
>
> Hope this is helpful
> Dave
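
To make sure I follow the suggestion above, here is a rough sketch of the
special case. All names in it (edge_sketch, state_sketch,
try_known_fn_ptr_call) are made up for illustration and are not analyzer
classes; the "state" is just a map from a function-pointer variable to the
callee it is known to point to. It only models the decision: if the edge
is an intraprocedural call edge and the state knows the target, report the
callee so a node/edge for the call can be created; otherwise fall through
to the existing logic.

```cpp
#include <map>
#include <string>

// Hypothetical stand-ins; the real superedge, program_point and
// program_state classes are far richer than this.
enum class edge_kind { cfg, intraprocedural_call };

struct edge_sketch
{
  edge_kind kind;
  std::string callee_ptr;  // variable called through; empty for non-calls
};

struct state_sketch
{
  // Function pointers whose target is known: variable -> callee name.
  std::map<std::string, std::string> known_fn_ptrs;
};

// Special case to try before the generic on_edge step.  Returns true and
// stores the callee in *OUT_CALLEE when the call target is known; returns
// false to tell the caller to fall back to the existing behaviour.
bool
try_known_fn_ptr_call (const edge_sketch &edge, const state_sketch &state,
                       std::string *out_callee)
{
  if (edge.kind != edge_kind::intraprocedural_call)
    return false;
  auto it = state.known_fn_ptrs.find (edge.callee_ptr);
  if (it == state.known_fn_ptrs.end ())
    return false;  // target unknown: nothing to speculate on
  *out_callee = it->second;
  return true;
}
```

In the real code the "callee" would presumably be turned into the entry
supernode of that function, and the new exploded_node would start there.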

Thank you
- Ankur