Can someone help me understand cgraph_nodes & cgraph_edges during WPA

* Can someone help me understand cgraph_nodes & cgraph_edges during WPA
@ 2021-07-23 11:46 Erick Ochoa
From: Erick Ochoa @ 2021-07-23 11:46 UTC
  To: gcc

Hello,

I've been working on an LTO points-to analysis pass for a little
while. Because of LTO's design, gimple bodies are inaccessible during
WPA. This essentially means that every LTO pass compiles down function
bodies into their own IR which gets stored in function summaries and
later read during WPA. This is also what I plan to do.

I recently started looking into how IPA-CP works. I noticed that
again, IPA-CP compiles down every function into its own function
summary. However, while reading functions, it selectively decides
which information to store by looking at symtab_node::prevailing_p. I
was not aware of this function but from what I understand it is a way
of deciding which symtab_node's bodies survive when removing
duplicates before the execute stage of the pass. Is this correct? For
IPA-CP, the summaries of those cgraph_nodes for which the predicate
symtab_node::prevailing_p is false are just read and discarded. This
makes sense if the content is a duplicate.
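
To make the "read and discard" behaviour described above concrete, here is a minimal self-contained sketch. It is not GCC code: Node, Stream, Summary and read_node_summary are hypothetical stand-ins, and the `prevailing` flag merely models the result of symtab_node::prevailing_p. The point it illustrates is that the record for a non-prevailing node still has to be consumed so the stream stays aligned, even though nothing is stored for it.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a cgraph_node; only the fields this sketch needs.
struct Node {
  std::string name;
  bool prevailing;  // models the result of symtab_node::prevailing_p ()
};

// Hypothetical per-function summary: here just a list of integers.
using Summary = std::vector<int>;

// Hypothetical input stream: a flat vector of ints with a read cursor.
struct Stream {
  std::vector<int> data;
  std::size_t pos = 0;
  int read() { return data.at(pos++); }
};

// Read one record laid out as [count, v1, ..., vcount].
// The record is always consumed so the cursor stays aligned for the next
// node, but it is stored only when the node prevails.
void read_node_summary(Stream &in, const Node &node,
                       std::map<std::string, Summary> &summaries) {
  int count = in.read();
  Summary s;
  for (int i = 0; i < count; ++i)
    s.push_back(in.read());
  if (node.prevailing)
    summaries[node.name] = s;
}
```

Calling read_node_summary first for a non-prevailing node and then for a prevailing one leaves exactly one entry in the map, while the cursor has advanced past both records.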

I know that different cgraph_nodes might represent the same function
but maybe one of them has been specialized, another version has been
inlined. I also think that two different cgraph_nodes might represent
the same function implementation (i.e., they shared the same body and
the same information but this information is duplicated during LGEN
across partitions). I believe that it is not until the WPA/execute
(making a distinction between WPA/execute and WPA/read_summary) that
the distinct cgraph_nodes are merged. Would it be more accurate to say
that non-prevailing_p nodes are eliminated while the other ones
remain? However, cgraph_nodes which represent the same function but
have been specialized will be marked as prevailing_p. Is this correct?
(Here, I am not sure about the internals of LTO: in some sense, the
points-to analysis hasn't run, but it is possible that other analyses
have already run their WPA/execute stage and have decided that some
function bodies need to be specialized, although at the moment they
are still virtual clones. Related question: do virtual clones have
their own cgraph_node?)

I did a little experiment yesterday where I had the following control group:
1. encoded a cgraph_node during LGEN/write_summary
2. decoded a cgraph_node during WPA/read_summary and printed cnode->name ()

and compared it against the following experimental group:
1. encoded a cgraph_node during LGEN/write_summary
2. decoded a cgraph_node during WPA/read_summary
3. during WPA/execute I printed cnode->name ()

What I found was that during the run of the "control group" I was able
to print all cnodes' names. However, during the run of the
"experimental group" only some of the names were printed before a
segmentation fault occurred. Again, this might have been because those
cgraph_nodes were deleted. My theory is that these are
non-prevailing_p cgraph_nodes, but I haven't confirmed it
experimentally. Is this the case? I also do not know if all data being
pointed to by these cgraph_node* is corrupted or if only some parts of
the cgraph_node* have been removed from memory (like the name). Would
cgraph_node* during WPA/execute in the experimental run have some
valid fields or should it all be considered invalid and not even
accessed outside of WPA/read?
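
One way to sidestep the dangling-pointer problem in the experiment above is to never hold raw node pointers between stages, and to drop per-node data whenever the node itself is removed. GCC's function_summary template (symbol-summary.h) reacts to node removal along these lines, but the sketch below is only a self-contained model of the idea: Graph, Node, add, remove and live_names are hypothetical names, not GCC API.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a cgraph_node with a stable identifier.
struct Node {
  int uid;
  std::string name;
};

struct Graph {
  std::map<int, std::unique_ptr<Node>> nodes;  // live nodes keyed by uid
  std::map<int, int> summaries;                // hypothetical payload keyed by uid

  Node *add(int uid, const std::string &name) {
    nodes[uid] = std::make_unique<Node>(Node{uid, name});
    return nodes[uid].get();
  }

  // Removing a node also drops its summary, mimicking a removal hook,
  // so later stages never see data for a node that no longer exists.
  void remove(int uid) {
    summaries.erase(uid);
    nodes.erase(uid);
  }

  // "Execute" stage: visit only summaries whose node is still alive.
  std::vector<std::string> live_names() const {
    std::vector<std::string> out;
    for (const auto &entry : summaries) {
      auto it = nodes.find(entry.first);
      if (it != nodes.end())
        out.push_back(it->second->name);
    }
    return out;
  }
};
```

With this layout, a node removed between the read and execute stages simply does not show up during execute, instead of being reached through a stale pointer.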

Looking at the definition of prevailing_p, it seems that all
functions without a gimple body will be marked as non-prevailing_p.
What does this mean though? There are definitely calls to external
functions and so having a call to a non-prevailing_p just means that
you are calling a function with no defined body. But what does that
mean for functions that were "merged" or removed because they are
duplicates? Can you have a cgraph_edge to a non-prevailing_p
cgraph_node whose function body was once available at LGEN/write_summary but
it is no longer available during WPA/execute? If that's the case how
does one know the target of the call?
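
For reference while reading the questions above, here is a small standalone model of how I understand the predicate. It paraphrases symtab_node::prevailing_p as I recall it from cgraph.h (a definition that is either local to its unit or the first symbol with its assembler name); the struct and field names are hypothetical simplifications, so please check the real definition in your tree before relying on this.

```cpp
// Hypothetical, simplified model of the fields the predicate looks at.
struct SymbolModel {
  bool definition;                     // the node has a definition
  bool is_public;                      // models TREE_PUBLIC (decl)
  bool is_external;                    // models DECL_EXTERNAL (decl)
  bool has_previous_sharing_asm_name;  // an earlier symbol shares its asm name
};

// Assumed paraphrase of symtab_node::prevailing_p: a node prevails if it is
// a definition and is either local to its unit or the first symbol carrying
// its assembler name.
bool prevailing_model(const SymbolModel &s) {
  return s.definition &&
         ((!s.is_public && !s.is_external) || !s.has_previous_sharing_asm_name);
}
```

Under this model, a declaration-only node (a call to an external function) and a later duplicate of a public definition both come out as non-prevailing, which are the two cases the paragraph above is trying to tell apart.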

Sorry if these are too many questions, I do greatly appreciate all the
support given to me in the mailing list.

In the meantime, I'll continue looking into how ipa-cp works to see
what I can learn from other sources.

Thanks
-Erick
