From: Erick Ochoa
Date: Wed, 17 Mar 2021 11:31:31 +0100
Subject: More questions on points-to analysis
To: gcc@gcc.gnu.org

Hello,

I'm still trying to compare the solution generated from the intraprocedural points-to analysis in GCC against an external solver. Yesterday it was pointed out that "NULL is not conservatively correctly represented in the constraints". Can someone expand on this? To me this sounds like a couple of things:

* even though malloc may return NULL, NULL is not added to the points-to sets of whatever variable is on the left hand side of the malloc call.
* the process in GCC that generates the constraints for NULL somehow does not generate enough constraints to treat NULL conservatively and therefore there might be points-to sets which should contain NULL but don't.
  (However, doesn't this mean that feeding the constraints to an external solver should still give the same answers?)
* the process in GCC that generates the constraints for NULL works fine (i.e., feeding the constraints generated by GCC to an external solver should yield a conservatively correct answer) but the process that solves the constraints relaxes the solutions for the NULL constraint variable (i.e., GCC has deviated from the constraint solving algorithm somehow).

Also, "at some point we decided to encode optimistic info into pt->null which means points-to now has to compute a conservatively correct pt->null." Doesn't this contradict itself? How can pt->null have been optimistic first and be conservative now? Is this trying to say that:

* NULL constraints were conservative at first
* pt->null was optimistic at first
* Then conversion to SSA happened and NULL constraints were no longer conservatively represented in the constraints (effectively becoming somewhat optimistic)
* To avoid both NULL and pt->null being unsafe, pt->null was changed to be conservative

I've been looking at find_what_var_points_to and have changed my code that verifies the constraint points-to sets. Basically, I now find which variables have been collapsed, and only for "real" constraint pointer variables do I look at the points-to solution struct. Before looking into vars, I look at the flag fields (null, anything, escaped, etc.) and compare them against the id of the pointee variable. Checking vars is slightly confusing for me at the moment, since there appear to be at least 3 plausible ways of validating the solution (I haven't actually gotten that far because assertions are being triggered).

```
for (auto &output : *orel) {
  int from_i;
  int to_i;

  // Since find_what_var_points_to
  // doesn't change the solution for collapsed
  // variables, only verify the answer for the real ones.
  varinfo_t from_var = get_varinfo (from_i);
  varinfo_t vi = get_varinfo (find (from_i));
  if (from_var->id != vi->id) continue;
  if (!from_var->may_have_pointers) continue;

  // compute the pt_solution
  pt_solution solution = find_what_var_points_to (cfun->decl, from_var);

  // pointee variable according to external analysis
  varinfo_t vi_to = get_varinfo (to_i);

  // Since some artificial variables are stored in fields instead of the bitset,
  // assert based on field values.
  // However, you can see that I already had to disable some of the assertions.
  if (vi_to->is_artificial_var) {
    if (vi_to->id == nothing_id) {
      gcc_assert (solution.null && vi_to->id == nothing_id);
      continue;
    } else if (vi_to->id == escaped_id) {
      if (in_ipa_mode) {
        gcc_assert (solution.ipa_escaped && vi_to->id == escaped_id);
      } else {
        //gcc_assert (solution.escaped && vi_to->id == escaped_id);
      }
      continue;
      /* Expand some special vars of ESCAPED in-place here. ??*/
    }
    // More...
  }

  if (solution.anything) continue;

  bitmap vars = solution.vars;
  if (!vars) continue;

  // Three candidate checks: pt_solution bitmap indexed by DECL_PT_UID,
  // pt_solution bitmap indexed by varinfo id, and the raw solver bitmap
  // indexed by varinfo id.
  if (dump_file)
    fprintf (dump_file, "SAME = %s\n",
             bitmap_bit_p (vars, DECL_PT_UID (vi_to->decl)) ? "true" : "false");
  if (dump_file)
    fprintf (dump_file, "SAME2 = %s\n",
             bitmap_bit_p (vars, to_i) ? "true" : "false");
  if (dump_file)
    fprintf (dump_file, "SAME3 = %s\n",
             bitmap_bit_p (from_var->solution, to_i) ? "true" : "false");
}
```

Can someone help me figure out why, even though I have a "real" variable and I compute its solution with find_what_var_points_to, the solution does not have the fields set that I expect? (I would expect solution.escaped to be set if the pointee variable vi_to has id == escaped_id.)

Also, how is DECL_PT_UID different from the varinfo id field? Shouldn't they be the same? It seems that in find_what_var_points_to DECL_PT_UID is being used to set the bit in the bitmap, but in previous instances it was the varinfo id offset?
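
To make the last question concrete, here is a toy model of the situation. It is only a sketch under stated assumptions and is not GCC code: ToyVar, ToySolution, to_solution, points_to, make_table and the ids 0, 1, 4711 and 4712 are all invented for illustration. It assumes, as the observation above suggests, that the solver's intermediate sets are keyed by the variable's id while the final solution's set is keyed by a separate declaration UID, with special variables turned into flags instead of set members.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy model only: these types are hypothetical and do not exist in GCC.
struct ToyVar {
  unsigned id;         // index in the solver's table (plays the role of a varinfo id)
  unsigned decl_uid;   // id of the underlying declaration (plays the role of DECL_PT_UID)
  bool is_artificial;  // true for special variables such as NULL or ESCAPED
};

struct ToySolution {
  bool null = false;        // set when the NULL placeholder was in the solver set
  bool escaped = false;     // set when the ESCAPED placeholder was in the solver set
  std::set<unsigned> vars;  // keyed by decl_uid, NOT by solver id
};

// Assumed ids of the two special variables in this toy table.
constexpr unsigned kNothingId = 0;
constexpr unsigned kEscapedId = 1;

// Translate a solver-side set (keyed by ToyVar::id) into a final solution.
ToySolution to_solution(const std::set<unsigned>& solver_set,
                        const std::vector<ToyVar>& table) {
  ToySolution out;
  for (unsigned id : solver_set) {
    const ToyVar& v = table.at(id);
    if (id == kNothingId) out.null = true;
    else if (id == kEscapedId) out.escaped = true;
    else if (!v.is_artificial) out.vars.insert(v.decl_uid);
  }
  return out;
}

// Membership test that uses the right key for each kind of pointee.
bool points_to(const ToySolution& s, const ToyVar& pointee) {
  if (pointee.id == kNothingId) return s.null;
  if (pointee.id == kEscapedId) return s.escaped;
  return s.vars.count(pointee.decl_uid) != 0;
}

// Two special variables followed by two ordinary ones with unrelated UIDs.
std::vector<ToyVar> make_table() {
  return {{0, 0, true}, {1, 0, true}, {2, 4711, false}, {3, 4712, false}};
}

// Small self-check: looking up by solver id misses, looking up by UID hits.
void toy_self_check() {
  std::vector<ToyVar> table = make_table();
  ToySolution s = to_solution({kNothingId, 2}, table);
  assert(s.vars.count(table[2].id) == 0);
  assert(points_to(s, table[2]));
}
```

In this toy, a check shaped like SAME2 (indexing the final set with the solver id) would report a miss for a pointee that is actually present, while a check shaped like SAME (indexing with the declaration UID) would find it. Whether GCC behaves the same way is exactly what I am asking.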