From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail.theobroma-systems.com (vegas.theobroma-systems.com [144.76.126.164]) by sourceware.org (Postfix) with ESMTPS id 9C772385E830 for ; Wed, 6 May 2020 19:25:37 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 9C772385E830 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=theobroma-systems.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=erick.ochoa@theobroma-systems.com Received: from mail.linser.at ([80.109.168.170]:51824 helo=Martins-MacBook-Air.local) by mail.theobroma-systems.com with esmtpsa (TLS1.2:DHE_RSA_AES_128_CBC_SHA1:128) (Exim 4.80) (envelope-from ) id 1jWPfj-0005ni-DC; Wed, 06 May 2020 21:25:35 +0200 Subject: Re: Question about alias or points-to analysis To: Richard Biener Cc: GCC Development , Philipp Tomsich , =?UTF-8?Q?Christoph_M=c3=bcllner?= References: <27adfbd4-8b65-00bf-d5f9-89fafa423e10@theobroma-systems.com> From: Erick Ochoa Message-ID: <8300b207-bc40-41e6-8902-a6f5dd959574@theobroma-systems.com> Date: Wed, 6 May 2020 21:25:33 +0200 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:68.0) Gecko/20100101 Thunderbird/68.7.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-11.7 required=5.0 tests=BAYES_00, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_DMARC_STATUS, KAM_SHORT, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 May 2020 19:25:39 -0000 On 06/05/2020 18:40, Richard Biener wrote: > On Wed, May 6, 2020 at 3:04 PM Erick Ochoa > wrote: >> >> >> >> On 06/05/2020 14:25, Richard Biener wrote: >>> On Wed, May 6, 2020 at 12:26 PM Erick Ochoa >>> wrote: >>>> >>>> Hi, >>>> >>>> I am trying to find out how to use the alias and/or points-to analysis >>>> in GCC. Ideally, I would like to find a function that given an >>>> allocation site, the return value is a set of pointers which may point >>>> to memory allocated from that allocation site. >>>> >>>> For example: >>>> >>>> int >>>> main(int argc, char** argv) >>>> { >>>> int a; >>>> int *b = argc > 2 ? &a : NULL; >>>> int *c = b; >>>> } >>>> >>>> Here, querying the allocation site corresponding to the declaration of >>>> local variable "a", should return { "b", "c" }. >>> >>> So that's a "reverse query" to that you are asking for below ... >>> >>>> I've found the following documentation on Alias-Analysis [0] and two >>>> source files[1][2] which seem to implement some (distinct?) alias analysis. >>>> >>>> I am trying to keep the discussion a bit high level, otherwise I would >>>> have a lot of questions, but given this C example, **how would someone >>>> be able to use any of the alias analyses in GCC to determine that "b" >>>> and "c" may-alias "a"?** >>> >>> ... here? Otherwise for a pointer "b" you can query whether it may-alias >>> "a" by using ptr_deref_may_alias_decl_p (b, a) or of 'a' is not a decl >>> but a general reference there is ptr_deref_may_alias_ref_p_1 >>> (not exported - there wasn't any need sofar). >> >> Thanks Richard. I'll look into the Tree alias-oracle API. >> >>> >>>> I compiled my example and placed an pass to experiment with alias >>>> analysis at link time. (I include the patch at the end). This is the >>>> gimple produced by the example above. >>>> >>>> main (int argc, char * * argv) >>>> { >>>> int * c; >>>> int * b; >>>> int a; >>>> int D.4170; >>>> int * iftmp.0; >>>> int * iftmp.0_1; >>>> int * iftmp.0_3; >>>> int * iftmp.0_4; >>>> int _9; >>>> >>>> : >>>> if (argc_2(D) > 2) >>>> goto ; [INV] >>>> else >>>> goto ; [INV] >>>> >>>> : >>>> iftmp.0_4 = &a; >>>> goto ; [INV] >>>> >>>> : >>>> iftmp.0_3 = 0B; >>>> >>>> : >>>> # iftmp.0_1 = PHI >>>> b_5 = iftmp.0_1; >>>> c_6 = b_5; >>>> a ={v} {CLOBBER}; >>>> _9 = 0; >>>> >>>> : >>>> : >>>> return _9; >>>> >>>> } >>>> >>>> I include this example because looking at the Alias Analysis [0] >>>> section, it mentions memory SSA form. But I do not see any #.MEM_n >>>> assignments. >>> >>> You need to dump with the -vops modifier (to get virtual operands dumped). >>> And you can use the -alias modifier to dump points-to results. >>> >> Furthermore, I made an equivalent code to the example of Memory SSA form >>>> and I still don't see any Memory SSA forms: >>>> >>>> ```c >>>> int i; >>>> int foo() >>>> { >>>> i = 1; >>>> return i; >>>> } >>>> ``` >>>> >>>> ```gimple >>>> foo () >>>> { >>>> int D.4164; >>>> int _3; >>>> >>>> : >>>> i = 1; >>>> _3 = i; >>>> >>>> : >>>> : >>>> return _3; >>>> >>>> } >>>> ``` >>>> >>>> So, I am not sure how the gimple shown on the Alias-analysis page is >>>> produced. **Does anyone know why the gimple produced is not showing the >>>> virtual SSA names?** >>>> >>>> Afterwards, instead of looking at the virtual SSA names, I then focused >>>> on finding out whether SSA_NAME_PTR_INFO but I found that it was not >>>> set. **Do I need I need to run something to make sure that >>>> SSA_NAME_PTR_INFO is set?** Maybe the example I chose for compilation >>>> did not trigger the path for setting SSA_NAME_PTR_INFO. What would be an >>>> example of some code that does set SSA_NAME_PTR_INFO? >>> >>> SSA_NAME_PTR_INFO is computed by points-to analysis, for a simple >>> IPA pass run at LTRANS time that will not be computed yet (it is not >>> streamed into the IL because it's not in a convenient form and it can be >>> and is re-computed early enough - just not for you ;)). Without LTO >>> the info should be still there from the early optimization pipeline computation. >>> >> >> So, does this mean that there's no alias information available at LTO? >> Or are you saying that I should store alias information at LGEN time and >> use it at WPA time to make my transformation plan and finally transform >> at LTRANS time? > > There is no points-to information available during WPA. There is no > points-to information during LTRANS until pass_build_alias is run > which is before the first user (if you exclude your simple IPA pass). > > If you need points-to information at WPA you either need to compute it > (it looks like you need to stream in function bodies anyway to use it, > so you could simply call compute_may_aliases on each function) or > what seems to be a better strathegy try to compute as much as > possible during IPA analysis and do final verification at LTRANS stage. > If that's possible of course depends on the exact things you need to do. > > Richard. Thanks Richard! I understood one of the two approaches you discuss here. It is indeed simple to call compute_may_aliases on each of the functions. However, I do not understand what you mean by "compute as much as possible during IPA analaysis". Can you please elaborate? For "compute" do you mean computing the may-points to set on my own? Perhaps using some of GCC's infrastructure as a foundation? Or do you also mean calling compute_may_aliases on each function? By "during IPA analysis" which of the three stages of IPA analysis do you mean? Do you mean WPA or LGEN? If I understand correctly, the final verification at LTRANS might be needed because earlier IPA passes might have transformed the functions. And our points-to information might be outdated. Is this correct? Also, what about ipa_pta? > >>> Hope this helps, >>> Richard. >>> >>>> Here is the patch, and in order to compile an example and dump the log: >>>> >>>> /path/to/gcc -fdump-ipa-hello-world -fipa-hello-world a.c >>>> >>>> diff --git a/gcc/Makefile.in b/gcc/Makefile.in >>>> index 543b477ff18..bc1af09cbf8 100644 >>>> --- a/gcc/Makefile.in >>>> +++ b/gcc/Makefile.in >>>> @@ -1399,6 +1399,7 @@ OBJS = \ >>>> incpath.o \ >>>> init-regs.o \ >>>> internal-fn.o \ >>>> + ipa-hello-world.o \ >>>> ipa-cp.o \ >>>> ipa-sra.o \ >>>> ipa-devirt.o \ >>>> diff --git a/gcc/common.opt b/gcc/common.opt >>>> index d33383b523c..09cabeb114d 100644 >>>> --- a/gcc/common.opt >>>> +++ b/gcc/common.opt >>>> @@ -3408,4 +3408,8 @@ fipa-ra >>>> Common Report Var(flag_ipa_ra) Optimization >>>> Use caller save register across calls if possible. >>>> >>>> +fipa-hello-world >>>> +Common Report Var(flag_ipa_hello_world) Optimization >>>> +TBD >>>> + >>>> ; This comment is to ensure we retain the blank line above. >>>> diff --git a/gcc/ipa-hello-world.c b/gcc/ipa-hello-world.c >>>> new file mode 100644 >>>> index 00000000000..00e276a4bd7 >>>> --- /dev/null >>>> +++ b/gcc/ipa-hello-world.c >>>> @@ -0,0 +1,126 @@ >>>> +#include "config.h" >>>> +#include "system.h" >>>> +#include "coretypes.h" >>>> +#include "backend.h" >>>> +#include "tree.h" >>>> +#include "gimple-expr.h" >>>> +#include "predict.h" >>>> +#include "alloc-pool.h" >>>> +#include "tree-pass.h" >>>> +#include "cgraph.h" >>>> +#include "diagnostic.h" >>>> +#include "fold-const.h" >>>> +#include "gimple-fold.h" >>>> +#include "symbol-summary.h" >>>> +#include "tree-vrp.h" >>>> +#include "ipa-prop.h" >>>> +#include "tree-pretty-print.h" >>>> +#include "tree-inline.h" >>>> +#include "ipa-fnsummary.h" >>>> +#include "ipa-utils.h" >>>> +#include "tree-ssa-ccp.h" >>>> +#include "stringpool.h" >>>> +#include "attribs.h" >>>> +#include "tree-ssa-alias.h" >>>> +#include "tree-ssanames.h" >>>> +#include "gimple.h" >>>> +#include "cfg.h" >>>> +#include "gimple-iterator.h" >>>> +#include "gimple-ssa.h" >>>> + >>>> +void inline >>>> +log(const char* const fmt, ...) >>>> +{ >>>> + if (!dump_file) return; >>>> + >>>> + va_list args; >>>> + va_start(args, fmt); >>>> + vfprintf(dump_file, fmt, args); >>>> + va_end(args); >>>> +} >>>> + >>>> +static void >>>> +alias_experiment_gimple_body(const cgraph_node *cnode) >>>> +{ >>>> + gcc_assert(cnode); >>>> + >>>> + function *func = DECL_STRUCT_FUNCTION(cnode->decl); >>>> + >>>> + // We are looking first into SSA becaues of >>>> + // this documentation... >>>> + // Points-to and escape analysis. >>>> + // Points-to analysis builds a set of constraints from the GIMPLE SSA IL >>>> + // representing all pointer operations and facts we do or do not know >>>> + // about pointers. Solving this set of constraints yields a >>>> conservatively >>>> + // correct solution for each pointer variable in the program (though >>>> we are >>>> + // only interested in SSA name pointers) as to what it may possibly >>>> point to. >>>> + // https://gcc.gnu.org/onlinedocs/gccint/Alias-analysis.html >>>> + >>>> + size_t j = 0; >>>> + tree var_decl = NULL; >>>> + FOR_EACH_LOCAL_DECL(func, j, var_decl) >>>> + { >>>> + const_tree identifier_node = DECL_NAME(var_decl); >>>> + if (!identifier_node) continue; >>>> + const char* const identifier_pointer = >>>> IDENTIFIER_POINTER(identifier_node); >>>> + log("var_decl = %s\n", identifier_pointer); >>>> + if (POINTER_TYPE_P(TREE_TYPE(var_decl))) break; >>>> + } >>>> + >>>> + size_t i = 0; >>>> + tree ssa_name = NULL; >>>> + push_cfun(func); >>>> + FOR_EACH_SSA_NAME(i, ssa_name, cfun) >>>> + { >>>> + struct ptr_info_def *pi = SSA_NAME_PTR_INFO(ssa_name); >>>> + if (!pi) continue; >>>> + log("i have a pi"); >>>> + pt_solution_includes(&pi->pt, var_decl); >>>> + } >>>> + pop_cfun(); >>>> + >>>> + >>>> +} >>>> + >>>> +static unsigned int >>>> +iphw_execute() >>>> +{ >>>> + cgraph_node *node = NULL; >>>> + FOR_EACH_FUNCTION_WITH_GIMPLE_BODY(node) >>>> + { >>>> + alias_experiment_gimple_body (node); >>>> + } >>>> + return 0; >>>> +} >>>> + >>>> +namespace { >>>> +const pass_data pass_data_ipa_hello_world = >>>> +{ >>>> + SIMPLE_IPA_PASS, >>>> + "hello-world", >>>> + OPTGROUP_NONE, >>>> + TV_NONE, >>>> + (PROP_cfg | PROP_ssa), >>>> + 0, >>>> + 0, >>>> + TODO_rebuild_alias, >>>> + 0, >>>> +}; >>>> + >>>> +class pass_ipa_hello_world : public simple_ipa_opt_pass >>>> +{ >>>> +public: >>>> + pass_ipa_hello_world (gcc::context *ctx) >>>> + : simple_ipa_opt_pass(pass_data_ipa_hello_world, ctx) >>>> + {} >>>> + >>>> + virtual bool gate(function*) { return flag_ipa_hello_world; } >>>> + virtual unsigned execute (function*) { return iphw_execute(); } >>>> +}; >>>> +} // anon namespace >>>> + >>>> +simple_ipa_opt_pass* >>>> +make_pass_ipa_hello_world (gcc::context *ctx) >>>> +{ >>>> + return new pass_ipa_hello_world (ctx); >>>> +} >>>> diff --git a/gcc/passes.def b/gcc/passes.def >>>> index 2bf2cb78fc5..66f333f81dc 100644 >>>> --- a/gcc/passes.def >>>> +++ b/gcc/passes.def >>>> @@ -149,6 +149,7 @@ along with GCC; see the file COPYING3. If not see >>>> NEXT_PASS (pass_ipa_profile); >>>> NEXT_PASS (pass_ipa_icf); >>>> NEXT_PASS (pass_ipa_devirt); >>>> + NEXT_PASS (pass_ipa_hello_world); >>>> NEXT_PASS (pass_ipa_cp); >>>> NEXT_PASS (pass_ipa_sra); >>>> NEXT_PASS (pass_ipa_cdtor_merge); >>>> diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h >>>> index a1207a20a3c..377dda689cc 100644 >>>> --- a/gcc/tree-pass.h >>>> +++ b/gcc/tree-pass.h >>>> @@ -501,6 +501,7 @@ extern ipa_opt_pass_d *make_pass_ipa_fn_summary >>>> (gcc::context *ctxt); >>>> extern ipa_opt_pass_d *make_pass_ipa_inline (gcc::context *ctxt); >>>> extern simple_ipa_opt_pass *make_pass_ipa_free_lang_data (gcc::context >>>> *ctxt); >>>> extern simple_ipa_opt_pass *make_pass_ipa_free_fn_summary >>>> (gcc::context *ctxt); >>>> +extern simple_ipa_opt_pass *make_pass_ipa_hello_world (gcc::context *ctxt); >>>> extern ipa_opt_pass_d *make_pass_ipa_cp (gcc::context *ctxt); >>>> extern ipa_opt_pass_d *make_pass_ipa_sra (gcc::context *ctxt); >>>> extern ipa_opt_pass_d *make_pass_ipa_icf (gcc::context *ctxt); >>>> >>>> >>>> >>>> [0] https://gcc.gnu.org/onlinedocs/gccint/Alias-analysis.html >>>> [1] tree-ssa-alias.c >>>> [2] tree-ssa-structalias.c