From: Ankur Saini
Message-Id: <06DBCE04-B3AC-4091-979D-430507352213@gmail.com>
Subject: Re: daily report on extending static analyzer project [GSoC]
Date: Sat, 26 Jun 2021 20:50:09 +0530
To: David Malcolm
Cc: gcc@gcc.gnu.org
References: <35A0246A-D4F8-4B41-A009-4A98F78E0395@gmail.com>

> On 25-Jun-2021, at 9:04 PM, David Malcolm wrote:
>
> On Fri, 2021-06-25 at 20:33 +0530, Ankur Saini wrote:
>> AIM for today :
>>
>> - try to create an intra-procedural link between the calling
>> and returning snodes
>> - figure out the program point where exploded graph would know about
>> the function calls
>> - figure out how the exploded node will know which function to call
>> - create enodes and eedges for the calls
>>
>> --
>>
>> PROGRESS :
>>
>> - I created an intraprocedural link between where the splitting is
>> happening to connect the call and returning snodes.
>> like this :-
>>
>> (in supergraph.cc at "supergraph::supergraph (logger *logger)" )
>> ```
>> 185             if (cgraph_edge *edge = supergraph_call_edge (fun, stmt))
>> 186               {
>> 187                 m_cgraph_edge_to_caller_prev_node.put (edge, node_for_stmts);
>> 188                 node_for_stmts = add_node (fun, bb, as_a <gcall *> (stmt), NULL);
>> 189                 m_cgraph_edge_to_caller_next_node.put (edge, node_for_stmts);
>> 190               }
>> 191             else
>> 192               {
>> 193                 gcall *call = dyn_cast <gcall *> (stmt);
>> 194                 if (call)
>> 195                   {
>> 196                     supernode *old_node_for_stmts = node_for_stmts;
>> 197                     node_for_stmts = add_node (fun, bb, as_a <gcall *> (stmt), NULL);
>                                                          ^^^^^^^^^^^^^^^^^^^^^
> Given the dyn_cast of stmt to gcall * at line 193 you can use "call"
> here, without the as_a cast, as you've already got "stmt" as a gcall *
> at line 193.

ok

>
> You might need to add a hash_map recording the mapping from such stmts
> to the edges, like line 189 does.  I'm not sure, but you may need it
> later.

but the node is being created if there is no cgraph_edge corresponding to the call, so to what edge will I map "node_for_stmts" ?

>
>
>> 198
>> 199                     superedge *sedge = new callgraph_superedge (old_node_for_stmts,
>> 200                                                                 node_for_stmts,
>> 201                                                                 SUPEREDGE_INTRAPROCEDURAL_CALL,
>> 202                                                                 NULL);
>> 203                     add_edge (sedge);
>> 204                   }
>> 205               }
>> ```
>>
>> - now that we have an intraprocedural link between such calls, the
>> analyzer will consider them as "impossible edge" ( whenever a
>> "node->on_edge()" returns false ) while processing worklist, and I think this
>> should be the correct place to speculate about the function call by
>> creating exploded nodes and edges representing calls ( maybe by adding
>> a custom edge info ).
>>
>> - after several failed attempts to do as mentioned above, looks like
>> I was looking the wrong way all along. I think I just found out what my
>> mentor meant when telling me to look into "calls node->on_edge".
>> During the edge inspection ( in program_point::on_edge() ) , if it's an
>> intraprocedural sedge, maybe I can add an extra intraprocedural sedge
>> to the correct edge right here with the info state of that program
>> point.
>
> I don't think we need a superedge for such a call, just an
> exploded_edge.  (Though perhaps adding a superedge might make things
> easier?  I'm not sure, but I'd first try not bothering to add one)

ok, will scratch this idea for now.

>
>>
>> Q. But even if we find out which function to call, how will the
>> analyzer know which snode that function belongs to ?
>
> Use this method of supergraph:
>   supernode *get_node_for_function_entry (function *fun) const;
> to get the supernode for the entrypoint of a given function.
>
> You can get the function * from a fndecl via DECL_STRUCT_FUNCTION.

so once we get fndecl, it should be comparatively smooth sailing from there.

My attempt to get the value of function pointer from the state : -

- to access the region model of the state, I tried to access "m_region_model" of that state.
- now I want to access cluster for a function pointer.
- but when looking at the accessible functions of the region model class, I couldn't seem to find a fitting one. ( the closest I could find was "region_model::get_reachable_svalues()" to get a set of all the svalues reachable from that model )

>
>> Q. on line 461 of program-point.cc
>>
>> ```
>> 457         else
>> 458           {
>> 459             /* Otherwise, we ignore these edges  */
>> 460             if (logger)
>> 461               logger->log ("rejecting interprocedural edge");
>> 462             return false;
>> 463           }
>> ```
>> why are we rejecting "interprocedural" edge when we are examining an
>> "intraprocedural" edge ? or is it for the "cg_sedge->m_cedge" edge,
>> which is an interprocedural edge ?
>
> Currently, those interprocedural edges don't do much.
> Above the "else"
> clause of the lines above the ones you quote is some support for call
> summaries.
>
> The idea is that we ought to be able to compute summaries of what a
> function call does, and avoid exponential explosions during the
> analysis by reusing summaries at a callsite.  But that code doesn't
> work well at the moment; see:
>   https://gcc.gnu.org/bugzilla/showdependencytree.cgi?id=99390
>
> If you ignore call summaries for now, I think you need to change this
> logic so it detects if we have a function pointer that we "know" the
> value of from the region_model, and have it generate an exploded_node
> and exploded_edge for the call.  Have a look at how SUPEREDGE_CALL is
> handled by program_state and program_point; you should implement
> something similar, I think.  Given that you need both the super_edge,
> point *and* state all together to detect this case, I think the logic
> you need to add probably needs to be in exploded_node::on_edge as a
> special case before the call there to next_point->on_edge.
>
> Hope this is helpful
> Dave

Thank you
- Ankur
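
A rough sketch of the control flow Dave describes above (check the state for a known function-pointer value first, and only then create an edge to the callee's entry node). This is a self-contained toy model, not GCC code: ToyFunction, ToyState, ToyEdge and try_dynamic_call are all hypothetical names invented for illustration, and the string-keyed map merely stands in for whatever lookup the region_model ends up providing.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a callee: its name and the id of its entry node
// (the role get_node_for_function_entry plays in the thread above).
struct ToyFunction
{
  std::string name;
  int entry_node;
};

// Hypothetical stand-in for the analyzer state: which function each
// function-pointer variable is known to point at, if any.
struct ToyState
{
  std::map<std::string, const ToyFunction *> known_fn_ptrs;

  // Returns the known callee for PTR, or nullptr if the state does not
  // know the pointer's value.
  const ToyFunction *lookup (const std::string &ptr) const
  {
    auto it = known_fn_ptrs.find (ptr);
    return it == known_fn_ptrs.end () ? nullptr : it->second;
  }
};

// Hypothetical stand-in for an exploded edge.
struct ToyEdge
{
  int src_node;
  int dst_node;
  std::string kind;
};

// Special case meant to run before the generic edge handling: if the state
// knows what PTR points at, record an edge from CALL_NODE to the callee's
// entry node and report success; otherwise add nothing and return false so
// the caller can fall back to the generic path.
bool try_dynamic_call (const ToyState &state, int call_node,
                       const std::string &ptr, std::vector<ToyEdge> &out)
{
  const ToyFunction *callee = state.lookup (ptr);
  if (!callee)
    return false;
  out.push_back (ToyEdge{call_node, callee->entry_node, "dynamic-call"});
  return true;
}
```

The point of returning false without side effects is that an unknown pointer leaves the existing behaviour untouched, so the special case can sit in front of the generic handling without changing any case it does not recognise.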