From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ej1-x630.google.com (mail-ej1-x630.google.com [IPv6:2a00:1450:4864:20::630]) by sourceware.org (Postfix) with ESMTPS id 6C9063850425; Thu, 15 Jul 2021 07:24:01 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 6C9063850425 Received: by mail-ej1-x630.google.com with SMTP id gb6so7653161ejc.5; Thu, 15 Jul 2021 00:24:01 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=8shTj7TDdMse27VX2dQtlQ1OGspWGW1VcVyNOnt8QaI=; b=bVYGNz81si+x/T3nwGqmV9cStbeKkrCJ8PGmLKJ2kqqCHTmxJbeGn+aE1DkoHRNjMx JDquHBjRfNcw4eFpsSNlrN5mIPYZsw2HUy5CD+9iBncpBDajcM2klrJj2OJufSH6aRhi BLjD3+JR2tmKZ9Iu+QI2aviU5XhS9dSeIUyu2TCpiMIrjGe8tqCLA90X2kEILUv69pNk AwY/8Ejub4d8S+X3yGU1cvE2wvBOGT0lqLiwLmw/bLca9QXx/9aB/tFBnVEBb8P1V5pM SiHwCeMx1RegPP68dLvbQQcN30nU8Io31PB01RTls2MLwynoFF9vGa/7R6VFnWroSk6s LNSQ== X-Gm-Message-State: AOAM5326cnc8fN9ESuqfOxjJ7pQF6zjXaDzltbTwKvJgdM/520URAwS8 xCympnA2OtbIFqc59sziB3OgY9FTTHip0IYRODrgu/KUhgk= X-Google-Smtp-Source: ABdhPJxmd/eB+2QXCdkSGMTO//xR8g2pd/NuGEpWMH5m2Ib/eddBNtYnWt/B+j8jWV3miYJjBT+sn2w0Cg/bA83b4cc= X-Received: by 2002:a17:906:f9c5:: with SMTP id lj5mr3853269ejb.482.1626333840015; Thu, 15 Jul 2021 00:24:00 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Richard Biener Date: Thu, 15 Jul 2021 09:23:49 +0200 Message-ID: Subject: Re: tree decl stored during LGEN does not map to a symtab_node during WPA To: Erick Ochoa Cc: Jan Hubicka , GCC Development Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-2.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Jul 2021 07:24:03 -0000 On Wed, Jul 14, 2021 at 3:56 PM Erick Ochoa wrote: > > > I guess the way to encode SSA trees would be to use sth like a > > , SSA-version tuple much like PTA internally > > uses the varinfo array index as identifier for the variables in the > > constraints. For local decls (as opposed to SSA names) it's a bit > > more difficult - you'd have to devise your own encoding here. > > > > What you can rely on I think is that for local variables UID relations > > are preserved, so you could sort cfun->local_decls and use the > > position in this array as encoding (in fact I see local_decls is > > streamed literally, so you don't even need to sort that for the > > start - but we could likely do that without harm to make searching > > for a UID O(log n)). > > At the moment I am generating a unique id for each constraint variable > generated. I have assigned a unique LGEN number to each variable and > during WPA I have merged duplicates. The duplication of equivalent > gimple variables in distinct LGEN partitions happens for global > variables (as we have discussed before). Do you know if there are > other cases of duplication that might happen? For example, could a > single function be analyzed in different LGEN partitions? A single source representation of inline functions and template instantiations can be analyzed in different LGEN partitions, yes. Those are merged as well. > I followed your example here and I am "encoding" the constraint > variables that relate to SSA variables by looking at the cgraph_node > and the SSA-version. The tree is not stored but at WPA we know the > SSA-version and the cgraph_node and I think this is enough to relate > back to the SSA variable in the gimple source. Yes, I think so. > You mention that I need to devise my own "encoder", but I am not sure > if we are conflating two notions: > > 1. encoding tree variables to constraint variables (i.e., a mapping of > some tuple (cgraph_node x symtab_node x ssa-version) to an integer > that represents the constraint variable) > 2. encoding as an implementation of a data structure used during LTO > to stream in and stream out trees/symbols to and from partitions. > (e.g., lto_symtab_encoder_t). I meant 1) and streaming using the LTO cgraph encoder for the cgraph part and simply using the SSA version for the second part. > So to be clear, when you say I need to devise my own "encoder" you are > referring to definition number 1, not definition number 2, right? And > at LTRANS using the relation (cgraph_node x symtab_node x ssa-version) > x constraint-variable-id one should be able to map to the interesting > pointer/pointee from the constraint variable id. > > I am thinking a little bit ahead, but I will need a way to relate > memory allocation sites (e.g., malloc's) to some constraint variable > and perhaps generalize this to expressions (I would like to say that a > variable is pointing to a STRING_CST for example). Do you have an idea > on how to go and encode using the first definition of encoding tree > expressions? The easiest is probably to hook it up to things you already encode, like for malloc it would be the SSA def of the resulting pointer. We do preserve the order of stmts in basic-blocks and basic-block indices, we also use stmt UIDs (but re-number them at streaming time, so you can't directly use them), so using a "stmt number" would be possible as well. The LTO streaming uses this to map back the callgraph edge -> gimple stmt reference (lto-streamer-in.c:fixup_call_stmt_edges), the code also calls execute_all_ipa_stmt_fixups which presumably is more "generic" code - I'm not familiar with it but you can dig whether it would suit your needs. It's not that the generic LTO / IPA mechanisms cannot be extended. I have seen some papers that use instruction-id's > (potentially an integer that corresponds as a unique identifier for > the instruction) but I am unsure if there is something similar to this > in GCC. If what you meant is the second definition, can someone > elaborate on the precise steps for making my own encoder? While I am > somewhat familiar with using the LTO framework I am unfamiliar with > potentially extending it in these sorts of ways. > > Thanks! Any help is appreciated.