From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-oi1-f177.google.com (mail-oi1-f177.google.com [209.85.167.177]) by sourceware.org (Postfix) with ESMTPS id 2263A384B0C0; Wed, 17 Mar 2021 15:17:01 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 2263A384B0C0 Received: by mail-oi1-f177.google.com with SMTP id x135so37625898oia.9; Wed, 17 Mar 2021 08:17:01 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=PDEBtJ5rFgBNyLeD8vb32Y+X+NbhEVzcE4YrXg9fA2Y=; b=AM2LQ/48xvqYxYV2ZSFWcHXRox5TNCgZ/OHflThzfWxd2hIn8d5UDxIbTCcPfavyfM ucD+wLdEcisuoWwwDImZu88V2jPsTkgeNDeIwT7tui1/znEVo5svCJkBem32raDbINmB ntTJsnoomFYMa+0K8LuG8EMi3NHsXBxhdWm9oZEsVsT7tb5PAqNdSTh9sbJDJplLSLXn TdNiya0geQOVDU97AtUR2+clYM2IeoC+W3AVOHUrAsywkpcuMuzRbiVgjQPmUDa6shUW 6BTwrzju3W+HlRTp+cqfExrc/OXsEiYyXFyxotfgv1DQA93U18Mw1UQskUnEn4/5EZW6 auHQ== X-Gm-Message-State: AOAM530uimaORt7pxY/b3wAorgoWVQqq6uMdIem8nzopjw8cIXSzpWfh 8mrALUkbYZw/bQRjPmFB7sSd9NaNsn3A+Q== X-Google-Smtp-Source: ABdhPJxz70HBE6G9D2+NaRLvO0YBY0EURaJhl5jk7QLB6oWpIGZUHinKzU9fWhfUNmS91Kl5q385zA== X-Received: by 2002:aca:3cd5:: with SMTP id j204mr3077081oia.29.1615994220305; Wed, 17 Mar 2021 08:17:00 -0700 (PDT) Received: from mail-oi1-f170.google.com (mail-oi1-f170.google.com. [209.85.167.170]) by smtp.gmail.com with ESMTPSA id h59sm8768578otb.29.2021.03.17.08.17.00 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 17 Mar 2021 08:17:00 -0700 (PDT) Received: by mail-oi1-f170.google.com with SMTP id w195so35500734oif.11; Wed, 17 Mar 2021 08:17:00 -0700 (PDT) X-Received: by 2002:aca:ed94:: with SMTP id l142mr3084203oih.177.1615994219860; Wed, 17 Mar 2021 08:16:59 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Erick Ochoa Date: Wed, 17 Mar 2021 16:16:48 +0100 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: More questions on points-to analysis To: Richard Biener Cc: Erick Ochoa , GCC Development Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-1.0 required=5.0 tests=BAYES_00, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=no autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Mar 2021 15:17:02 -0000 Hi Richard, I think I misunderstood yesterday's answer and deviated a little bit. But now I want to focus on this: > > * the process in GCC that generates the constraints for NULL works > > fine (i.e., feeding the constraints generated by GCC to an external > > solver should yield a conservatively correct answer) but the process > > that solves the constraints relaxes the solutions for the NULL > > constraint variable (i.e., GCC has deviated from the constraint > > solving algorithm somehow) > > No, that part should work OK. > So, let's ignore the other solver for now and instead focus on the concrete example I presented on the previous email. If GCC is solving these constraints: ``` ANYTHING = &ANYTHING ESCAPED = *ESCAPED ESCAPED = ESCAPED + UNKNOWN *ESCAPED = NONLOCAL NONLOCAL = &NONLOCAL NONLOCAL = &ESCAPED INTEGER = &ANYTHING ISRA.4 = &NONLOCAL derefaddrtmp(9) = &NULL *ISRA.4 = derefaddrtmp(9) i = NONLOCAL i = &NONLOCAL ESCAPED = &NONLOCAL _2 = *ISRA.4 ``` What would a hand calculated solution gives us vs what was the solution given by GCC? I am following the algorithm stated on Section 3.3 of Structure Aliasing in GCC, and I will be ignoring the ESCAPED = ESCAPED + UNKNOWN constraint since there isn't any other field offset that needs to be calculated. First, I want to make some adjustments. I am going to be using "=" to signify the \supseteq symbol and I will be adding curly braces to specify the element in a set as opposed to the whole set. Therefore the constraints will now become (ordered slightly differently): ``` ====direct contraints======== ANYTHING = { ANYTHING } ESCAPED = { NONLOCAL } NONLOCAL = { NONLOCAL } NONLOCAL = { ESCAPED } INTEGER = { ANYTHING } ISRA.4 = { NONLOCAL } derefaddrtmp(9) = { NULL } i = { NONLOCAL } ====complex constraints====== ESCAPED = *ESCAPED *ESCAPED = NONLOCAL *ISRA.4 = derefaddrtmp(9) _2 = *ISRA.4 ===== copy-constraints====== ESCAPED = ESCAPED // again ignoring the + UNKNOWN since I don't think it will matter... i = NONLOCAL ``` Solution sets are basically the direct constraints at the moment. Let's now create the graph 1. node ESCAPED has an edge going to itself (due to the copy constraint) 2. node ISRA.4 has no outgoing copy edges 3. node derefaddrtmp(9) has no outgoing edges 4. node _2 has no outgoing edges 5. node i has an outgoing edge to NONLOCAL (due to the copy constraint) 6. node NONLOCAL has no outgoing edge Now, we can iterate over this set of nodes 1. Walking over node ESCAPED. Sol(ESCAPED) = {NONLOCAL}. There are no edges, but it has complex-constraints. Let's modify the graph. 1. Looking at ESCAPED = *ESCAPED we note that we need to add a copy edge from ESCAPED to NONLOCAL. 2. Looking at *ESCAPED = NONLOCAL we note that we need to add a copy edge from NONLOCAL to NONLOCAL The graph is now transformed to 1. node ESCAPED has an edge going to ESCAPED and NONLOCAL 2. node ISRA.4 has no outgoing copy edges 3. node derefaddrtmp(9) has no outgoing edges 4. node _2 has no outgoing edges 5. node i has an outgoing edge to NONLOCAL (due to the copy constraint) 6. node NONLOCAL has an edge going to NONLOCAL The solution set of escaped is now Sol(ESCAPED) = Sol(ESCAPED) U Sol(NONLOCAL) = {NONLOCAL, ESCAPED} Now we continue walking 2. Walking over node ISRA.4. It has the solution set { NONLOCAL }. There are no edges, but it has complex-constraints. Let's modify the graph. 1. Looking at *ISRA.4 = derefaddrtmp(9), we note that we need to add a copy edge from NONLOCAL to derefaddrtmp(9). The graph is now transformed to 1. node ESCAPED has an edge going to ESCAPED and NONLOCAL 2. node ISRA.4 has no outgoing copy edges 3. node derefaddrtmp(9) has no outgoing edges 4. node _2 has no outgoing edges 5. node i has an outgoing edge to NONLOCAL (due to the copy constraint) 6. node NONLOCAL has an edge going to NONLOCAL, derefaddrtmp(9) The Sol(NONLOCAL) = Sol(NONLOCAL) U Sol(derefaddrtmp(9)) = {NONLOCAL, ESCAPED, NULL}. Now I could continue, but here is already something that is not shown in the points-to sets in the dump. It shows that NONLOCAL = {NONLOCAL, ESCAPED, NULL} Looking at the data that I showed yesterday: ``` NONLOCAL = { ESCAPED NONLOCAL } same as i ``` we see that NULL is not in the solution set of NONLOCAL. Now, yesterday you said that NULL is not conservatively correctly represented in the constraints. You also said today the points-to analysis should be solving the constraints fine. What I now understand from this is that while NULL may be pointed to by some constraints, it doesn't mean that not being on the set means that a pointer will not point to NULL. However, it should still be shown in the dumps where the points-to sets are shown for the constraint variables since it is solved using the same analysis? Is this correct? Am I doing the points to analysis by hand wrong somehow? Why would NULL not be in Sol(NONLOCAL) if it is solving the same constraints that I am solving by hand?