From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ej1-x62d.google.com (mail-ej1-x62d.google.com [IPv6:2a00:1450:4864:20::62d]) by sourceware.org (Postfix) with ESMTPS id 523A0385781A; Wed, 17 Mar 2021 10:51:14 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 523A0385781A Received: by mail-ej1-x62d.google.com with SMTP id c10so1801571ejx.9; Wed, 17 Mar 2021 03:51:14 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=knusXMzlJjtv747L/YfUGOhEQFIjHzCtNn89eClxjrc=; b=gxEZRW34QrVac0bLGdSosyh2Spj6m4jvI33RjfOGbCDB3PEtc3ihqkT0waSdFTqZ9+ S/2jyCbtaM1Q6lbn1OSgnKCEr54/dCuM7s6HzYRP+tgIxZ2Z5xMEpOpdZu0eiEDzAWbM PsoPYHtgPGgGsvDeN5h/fPdomQBQqwaZhjRRfh/prN2m2tmvAFer317/+XsY9Fo2evn3 9lWQzTn3Err3YfOgJW7jvVyWfoFIsaM1ij3QEQev+Rv49QPHuAJaQgw4d9u5BQIXub3K OYGXITFrLcroIcSuovYuebANw4xaiZkCnh1+W0BAJzgqCAdg8eplmUtWLXGTUgCN9Q7e UhrA== X-Gm-Message-State: AOAM531qsDXmUWbKRlWXonXC8Ht10bgcpOnAyKBjKDQITVn0YSPWZ9BM VBylKDa3LwPNTyRYaBjgTfSnbXu/iw4oB+gtmTzwuP7g X-Google-Smtp-Source: ABdhPJzMblJZCqczlz0LOdGY+RYB9Li2V05po6w8NyhSTMAWTFVTdfLTW6WfZa45DjS3Tf5odjSmN7U+QqtG7+W7J9I= X-Received: by 2002:a17:906:8a65:: with SMTP id hy5mr35351280ejc.250.1615978273167; Wed, 17 Mar 2021 03:51:13 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Richard Biener Date: Wed, 17 Mar 2021 11:51:01 +0100 Message-ID: Subject: Re: More questions on points-to analysis To: Erick Ochoa Cc: GCC Development Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-3.4 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Mar 2021 10:51:16 -0000 On Wed, Mar 17, 2021 at 11:34 AM Erick Ochoa via Gcc wrote: > > Hello, > > I'm still trying to compare the solution generated from the > intraprocedural points-to analysis in GCC against an external solver. > > Yesterday it was pointed out that "NULL is not conservatively > correctly represented in the constraints". Can someone expand on this? > To me this sounds like a couple of things: > * even though malloc may return NULL, NULL is not added to the > points-to sets of whatever variable is on the left hand side of the > malloc call. > * the process in GCC that generates the constraints for NULL somehow > does not generate enough constraints to treat NULL conservatively and > therefore there might be points-to sets which should contain NULL but > don't. (However, doesn't this mean that feeding the constraints to an > external solver should still give the same answers?) Yes, there is an unknown number of places that get this "wrong". Getting it "right" wasn't needed until we started using pt->null for anything. > * the process in GCC that generates the constraints for NULL works > fine (i.e., feeding the constraints generated by GCC to an external > solver should yield a conservatively correct answer) but the process > that solves the constraints relaxes the solutions for the NULL > constraint variable (i.e., GCC has deviated from the constraint > solving algorithm somehow) No, that part should work OK. > Also, "at some point we decided to encode optimistic info into > pt->null which means points-to now has to compute a conservatively > correct pt->null." Doesn't this contradict itself? How is a pt->null > first optimistically and now conservatively? Is what this is trying to > say that: > > * NULL constraints were conservative first > * pt->null optimistic first > * Then conversion to SSA happened and NULL constraints became not > conservatively represented in the constraints (effectively becoming > somewhat optimistic) > * To avoid NULL and pt->null be both unsafe, pt->null was changed to > be conservative The SSA points-to solution is what gets used, it is now populated not only by PTA but also by range analysis which eventually sets pt->null to false. PTA now simply sets pt->null to true as a conservative measure because it cannot guarantee that when !pt->null the pointer is actually never NULL (because of the above issues). > I've been looking at find_what_vars_points_to and have changed my code > which verifies the constraint points-to sets. Basically, I now find > which variables have been collapsed and only for "real" constraint > pointer variables I take a look at the points to solution struct. > Before looking into vars, I take a look at the fields and compare the > null, anything, escape, etc, against the id of the pointee-variable. > Checking vars is slightly confusing for me at the moment, since it > appears that there are at least 3 plausible ways of validating the > solution (I haven't actually gotten there because assertions are being > triggered). > > ``` > for (auto &output : *orel) { > int from_i; > int to_i; > > // Since find_what_var_points_to > // doesn't change the solution for collapsed > // variables, only verify the answer for the real ones. > varinfo_t from_var = get_varinfo(from_i); > varinfo_t vi = get_varinfo (find (from_i)); > if (from_var->id != vi->id) continue; > if (!from_var->may_have_pointers) continue; > > // compute the pt_solution > pt_solution solution = find_what_var_points_to (cfun->decl, from_var); > > // pointee variable according to external analysis > varinfo_t vi_to = get_varinfo(to_i); > > // Since some artificial variables are stored in fields instead > of the bitset > // assert based on field values. > // However you can see that I already had to disable some of the > assertions. > if (vi_to->is_artificial_var) > { > if (vi_to->id == nothing_id) > { > gcc_assert(solution.null && vi_to->id == nothing_id); > continue; > } > else if (vi_to->id == escaped_id) > { > if (in_ipa_mode) > { > gcc_assert(solution.ipa_escaped && vi_to->id == escaped_id); > } > else > { > //gcc_assert(solution.escaped && vi_to->id == escaped_id); > } > continue; > /* Expand some special vars of ESCAPED in-place here. ??*/ > } > // More... > } > > if (solution.anything) continue; > > bitmap vars = solution.vars; > if (!vars) continue; > > > if (dump_file) fprintf(dump_file, "SAME = %s\n", > bitmap_bit_p(vars, DECL_PT_UID(vi_to->decl)) ? "true" : "false"); > if (dump_file) fprintf(dump_file, "SAME2 = %s\n", > bitmap_bit_p(vars, to_i) ? "true" : "false"); > if (dump_file) fprintf(dump_file, "SAME3 = %s\n", > bitmap_bit_p(from_var->solution, to_i) ? "true" : "false"); > ``` > > Can someone help me figure out why even though I have a "real" > variable and I compute its solution with the "find_what_var_points_to" > method the solution does not have the fields that I expect to be set? > (I would expect solution.escaped to be escaped if the pointee variable > vi_to has an id = escaped_id). solution.escaped is 1 if the pointer variable may point to everything that escaped. That is, solution.vars is really solution.vars | cfun->gimple_df->escaped (there is also ipa_escaped). Have a look at pt_solution_includes_1 for an outline how to interpret the fields when asking whether DECL is a member of the points-to solution. > > And also, how is the DECL_PT_UID different from the varinfo id field? > Shouldn't they be the same? It seems that during > "find_what_var_points_to" DECL_PT_UID is being used to set the bit in > the bitmap, but in previous instances it was the varinfo id offset? The "SSA" points-to sets contain DECL_UIDs, not varinfo IDs which are only transitional during points-to computation. DECL_PT_UID maintains the original UID also after inlining duplicates decls so the points-to solutions remain valid even after inlining (or cloning, or IPA ICF, etc.). Richard.