public inbox for gcc@gcc.gnu.org
From: Erick Ochoa <eochoa@gcc.gnu.org>
To: Richard Biener <richard.guenther@gmail.com>
Cc: Jan Hubicka <hubicka@ucw.cz>, GCC Development <gcc@gcc.gnu.org>
Subject: Re: tree decl stored during LGEN does not map to a symtab_node during WPA
Date: Wed, 14 Jul 2021 15:56:33 +0200
Message-ID: <CAJ_nqziRehp6oODtvGhLU72kR06eZMvKn0ZwMHhddMX8MzC54w@mail.gmail.com>
In-Reply-To: <CAFiYyc23ekuBco6nN7aOBWpAxSSbmG8LWsMS=Ouhx8v5b69wqg@mail.gmail.com>

> I guess the way to encode SSA trees would be to use sth like a
> <function-encoder>, SSA-version tuple much like PTA internally
> uses the varinfo array index as identifier for the variables in the
> constraints.  For local decls (as opposed to SSA names) it's a bit
> more difficult - you'd have to devise your own encoding here.
>
> What you can rely on I think is that for local variables UID relations
> are preserved, so you could sort cfun->local_decls and use the
> position in this array as encoding (in fact I see local_decls is
> streamed literally, so you don't even need to sort that for the
> start - but we could likely do that without harm to make searching
> for a UID O(log n)).
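
A minimal sketch of the position-as-encoding idea quoted above, written
as standalone C++ rather than GCC code. The Uid type, sorted_uids and
encode are hypothetical names used only for illustration; in GCC the
values would come from the UIDs of the decls in cfun->local_decls.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for the UID of a local declaration.
using Uid = unsigned;

// Return the UIDs in ascending order so that positions are stable
// and can be searched with binary search.
std::vector<Uid> sorted_uids(std::vector<Uid> uids) {
  std::sort(uids.begin(), uids.end());
  return uids;
}

// Encode a UID as its position in the sorted array, in O(log n).
// Returns std::nullopt when the UID is not present.
std::optional<std::size_t> encode(const std::vector<Uid> &sorted, Uid uid) {
  assert(std::is_sorted(sorted.begin(), sorted.end()));
  auto it = std::lower_bound(sorted.begin(), sorted.end(), uid);
  if (it == sorted.end() || *it != uid)
    return std::nullopt;
  return static_cast<std::size_t>(it - sorted.begin());
}
```

Because only the relative order of UIDs is assumed to be preserved, the
position, not the UID value itself, is what would be written out and
read back.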

At the moment I generate a unique id for each constraint variable.
Each variable gets a unique number during LGEN, and during WPA I merge
duplicates. Equivalent gimple variables are duplicated across distinct
LGEN partitions in the case of global variables (as we have discussed
before). Do you know of other cases where duplication might happen?
For example, could a single function be analyzed in different LGEN
partitions?
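
To make the merging step concrete, here is a rough standalone C++
sketch. It is not the actual implementation: LgenVar and
merge_partitions are hypothetical names, and it assumes that a global
variable can be recognized across partitions by a stable name, which
is an assumption of this sketch and not something established in this
thread.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical per-partition record: one entry per constraint variable
// created during LGEN. An empty name marks a variable local to the
// partition; a non-empty name marks a global (assumed stable key).
struct LgenVar {
  std::string global_name;
};

// For partition p and local index i, result[p][i] is the merged WPA id.
// Globals with the same name share one id; everything else gets a fresh id.
std::vector<std::vector<std::uint32_t>>
merge_partitions(const std::vector<std::vector<LgenVar>> &partitions) {
  std::map<std::string, std::uint32_t> global_ids;
  std::vector<std::vector<std::uint32_t>> remap;
  std::uint32_t next = 0;
  for (const auto &part : partitions) {
    std::vector<std::uint32_t> ids;
    ids.reserve(part.size());
    for (const auto &var : part) {
      if (var.global_name.empty()) {
        ids.push_back(next++);
      } else {
        // emplace only inserts when the name has not been seen before.
        auto [it, inserted] = global_ids.emplace(var.global_name, next);
        if (inserted)
          ++next;
        ids.push_back(it->second);
      }
    }
    remap.push_back(std::move(ids));
  }
  assert(remap.size() == partitions.size());
  return remap;
}
```

If a single function could also appear in more than one partition, the
same table would need a second kind of key for function-local
variables; the sketch deliberately leaves that case open.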

I followed your example here and I am "encoding" the constraint
variables that relate to SSA variables by looking at the cgraph_node
and the SSA-version. The tree is not stored, but at WPA we know the
SSA-version and the cgraph_node, and I think this is enough to relate
back to the SSA variable in the gimple source.
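
A minimal sketch of the (cgraph_node, SSA-version) encoding described
above, as standalone C++ and not GCC code. FuncId, SsaKey and VarTable
are hypothetical names; FuncId stands in for whatever stable identifier
the cgraph_node has after streaming.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-ins: the function is identified by an integer
// here, where GCC would go through a cgraph_node.
using FuncId = std::uint32_t;
using SsaVersion = std::uint32_t;
using SsaKey = std::pair<FuncId, SsaVersion>;
using VarId = std::uint32_t;

class VarTable {
public:
  // Return the constraint variable id for key, allocating a fresh id
  // the first time the key is seen.
  VarId intern(const SsaKey &key) {
    auto it = ids_.find(key);
    if (it != ids_.end())
      return it->second;
    VarId id = static_cast<VarId>(keys_.size());
    ids_.emplace(key, id);
    keys_.push_back(key);
    return id;
  }

  // Map a constraint variable id back to (function, SSA version).
  const SsaKey &lookup(VarId id) const {
    assert(id < keys_.size());
    return keys_[id];
  }

  std::size_t size() const { return keys_.size(); }

private:
  std::map<SsaKey, VarId> ids_;  // forward map: key -> id
  std::vector<SsaKey> keys_;     // reverse map: id -> key
};
```

Keeping both directions means that no tree has to be stored: the id
is enough to recover the function and SSA version later.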

You mention that I need to devise my own "encoder", but I am not sure
if we are conflating two notions:

1. encoding tree variables to constraint variables (i.e., a mapping of
some tuple (cgraph_node x symtab_node x ssa-version) to an integer
that represents the constraint variable)
2. encoding as an implementation of a data structure used during LTO
to stream trees/symbols in and out of partitions (e.g.,
lto_symtab_encoder_t).

So to be clear, when you say I need to devise my own "encoder", you are
referring to definition number 1, not definition number 2, right? And
at LTRANS, using the relation (cgraph_node x symtab_node x ssa-version)
x constraint-variable-id, one should be able to map a constraint
variable id back to the pointer/pointee of interest.

I am thinking a little bit ahead, but I will need a way to relate
memory allocation sites (e.g., calls to malloc) to some constraint
variable, and perhaps to generalize this to expressions (I would like
to say that a variable points to a STRING_CST, for example). Do you
have an idea of how to encode tree expressions in the sense of the
first definition? I have seen some papers that use instruction ids
(an integer that serves as a unique identifier for the instruction),
but I am unsure whether GCC has something similar. If what you meant
is the second definition, can someone elaborate on the precise steps
for making my own encoder? While I am somewhat familiar with using the
LTO framework, I am unfamiliar with extending it in these sorts of
ways.

Thanks! Any help is appreciated.
