From: Erick Ochoa
Date: Wed, 14 Jul 2021 15:56:33 +0200
Subject: Re: tree decl stored during LGEN does not map to a symtab_node during WPA
To: Richard Biener
Cc: Jan Hubicka, GCC Development
List-Id: Gcc mailing list

> I guess the way to encode SSA trees would be to use sth like a
> , SSA-version tuple much like PTA internally
> uses the varinfo array index as identifier for the variables in the
> constraints. For local decls (as opposed to SSA names) it's a bit
> more difficult - you'd have to devise your own encoding here.
>
> What you can rely on I think is that for local variables UID relations
> are preserved, so you could sort cfun->local_decls and use the
> position in this array as encoding (in fact I see local_decls is
> streamed literally, so you don't even need to sort that for the
> start - but we could likely do that without harm to make searching
> for a UID O(log n)).
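To check my understanding of the local_decls suggestion, here is a minimal standalone sketch. It does not use GCC's data structures: `LocalDecl`, `sort_local_decls`, `encode_local_decl` and `decode_local_decl` are hypothetical names, and the only assumption is that the relative order of UIDs is preserved between stages, even if the UID values themselves change.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for a local declaration; only the UID matters here.
struct LocalDecl {
  unsigned uid;
};

// Sort by UID so that a position in the vector can serve as the encoding.
inline void sort_local_decls(std::vector<LocalDecl> &decls) {
  std::sort(decls.begin(), decls.end(),
            [](const LocalDecl &a, const LocalDecl &b) { return a.uid < b.uid; });
}

// Encode: position of the decl with this UID in the sorted vector, found by
// binary search in O(log n). Returns nullopt if no decl has that UID.
inline std::optional<std::size_t>
encode_local_decl(const std::vector<LocalDecl> &sorted, unsigned uid) {
  auto it = std::lower_bound(
      sorted.begin(), sorted.end(), uid,
      [](const LocalDecl &d, unsigned u) { return d.uid < u; });
  if (it == sorted.end() || it->uid != uid)
    return std::nullopt;
  return static_cast<std::size_t>(it - sorted.begin());
}

// Decode: map a stored position back to a decl in a later stage, where the
// UID values may differ but (by assumption) their relative order does not.
inline const LocalDecl &
decode_local_decl(const std::vector<LocalDecl> &sorted, std::size_t pos) {
  assert(pos < sorted.size());
  return sorted[pos];
}
```

The position, not the UID, is what would be written out, so the later stage never needs to see the original UID values.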
At the moment I generate a unique id for each constraint variable. I assign a unique number to each variable during LGEN, and during WPA I merge duplicates. Equivalent gimple variables are duplicated across distinct LGEN partitions in the case of global variables (as we have discussed before). Do you know of other cases where duplication might happen? For example, could a single function be analyzed in different LGEN partitions?

I followed your example here and I am "encoding" the constraint variables that relate to SSA variables by looking at the cgraph_node and the SSA-version. The tree is not stored, but at WPA we know the SSA-version and the cgraph_node, and I think this is enough to relate back to the SSA variable in the gimple source.

You mention that I need to devise my own "encoder", but I am not sure if we are conflating two notions:

1. encoding tree variables to constraint variables (i.e., a mapping of some tuple (cgraph_node x symtab_node x ssa-version) to an integer that represents the constraint variable)
2. encoding as an implementation of a data structure used during LTO to stream trees/symbols in and out of partitions (e.g., lto_symtab_encoder_t).

So to be clear, when you say I need to devise my own "encoder", you are referring to definition number 1, not definition number 2, right? And at LTRANS, using the relation (cgraph_node x symtab_node x ssa-version) x constraint-variable-id, one should be able to map from the constraint variable id back to the pointer/pointee of interest.

I am thinking a little bit ahead, but I will need a way to relate memory allocation sites (e.g., malloc's) to some constraint variable, and perhaps to generalize this to expressions (I would like to say that a variable is pointing to a STRING_CST, for example). Do you have an idea of how to encode tree expressions under the first definition of encoding?
I have seen some papers that use instruction ids (an integer that uniquely identifies an instruction), but I am unsure whether there is something similar in GCC.

If what you meant is the second definition, can someone elaborate on the precise steps for making my own encoder? While I am somewhat familiar with using the LTO framework, I am unfamiliar with extending it in these sorts of ways.

Thanks! Any help is appreciated.
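To make definition 1 concrete, here is a minimal standalone sketch of what I have in mind. Everything in it is hypothetical (`VarKind`, `VarKey`, `ConstraintVarTable`) and none of it is GCC API. In particular, identifying an allocation site by a per-function statement index is an assumption on my part, since I do not know whether GCC offers a stable identifier of that kind.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

// Kinds of entities a constraint variable may stand for (hypothetical set).
enum class VarKind { SsaName, LocalDecl, GlobalSymbol, AllocSite };

// Identifies an entity without holding a tree pointer.
//   fn    : index of the owning function in some encoder (0 for globals)
//   index : SSA version, local_decls position, symbol index, or statement
//           index, depending on `kind`
struct VarKey {
  VarKind kind;
  std::uint32_t fn;
  std::uint32_t index;

  bool operator<(const VarKey &o) const {
    return std::tie(kind, fn, index) < std::tie(o.kind, o.fn, o.index);
  }
};

// Two-way table between keys and dense constraint-variable ids.
class ConstraintVarTable {
public:
  // Return the id for `key`, allocating the next free id on first sight.
  // Interning the same key twice yields the same id, so equal keys coming
  // from different partitions collapse into one variable.
  std::uint32_t intern(const VarKey &key) {
    auto [it, inserted] =
        ids_.emplace(key, static_cast<std::uint32_t>(keys_.size()));
    if (inserted)
      keys_.push_back(key);
    return it->second;
  }

  // Map a constraint-variable id back to the key that produced it.
  const VarKey &lookup(std::uint32_t id) const {
    assert(id < keys_.size());
    return keys_[id];
  }

  std::size_t size() const { return keys_.size(); }

private:
  std::map<VarKey, std::uint32_t> ids_;
  std::vector<VarKey> keys_;
};
```

With a table like this, the analysis would only ever see integer ids, and `lookup` would give back the (kind, function, index) triple needed to find the corresponding pointer or pointee again at LTRANS.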