From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTP id 6BE103858D35 for ; Sat, 9 Oct 2021 15:04:30 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 6BE103858D35 Received: from mail-lf1-f70.google.com (mail-lf1-f70.google.com [209.85.167.70]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-276-huVV1NxZPOi2fPhYoq5JTw-1; Sat, 09 Oct 2021 11:04:26 -0400 X-MC-Unique: huVV1NxZPOi2fPhYoq5JTw-1 Received: by mail-lf1-f70.google.com with SMTP id bt36-20020a056512262400b003fd7e6a96e8so619200lfb.19 for ; Sat, 09 Oct 2021 08:04:26 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=OEZwYvWmDubQIcJQDlwPyTSP2Je6LRMjBFP+E7ziu04=; b=Du6aWt62VSobpoM+N2do9MQyIVXJJpJtBhzco1Q7+X43qX4MNXtkXfzicoZjpUq7TL O2j8dHtq1Bus1H/g0Wlh53FwzZoaK5jv4+MSBu/ixut6x8/kV224TszXUe5gsHGbFvRu DTj6jr/NRSYz4fp3r0eRk8XCOajAO79Pje+GwYSzBac1hQckrSNPHI4USA/cY54PK1SH yHLjdGMTdCrUnrp1I52rIfoPAXMd/zcuctb5HOlS45S4raRAu7Yygo/XAno2cP6ChTOT ow+7+i4Zx+78rCM0b7LOgkQY5rHq/az6i8aBksRsvhP1tgPpBSic/GzzDF8Su8uGU81I KyWw== X-Gm-Message-State: AOAM530w4yzLRE2HnEywlEHST+QrB8JWT1ol2lMpXyolhToJYy4I55Qp pp0cLiiEn3ShUf6XI/y/rml6dCmtNOV1prUYwC7gux+hioUCn5MrO8A+MhdXWdjwiKzlA6Un/gJ 2LpgxqFVOZxg8UGYKOgpuEx2rWZjZQapvHQ== X-Received: by 2002:a2e:8799:: with SMTP id n25mr10937097lji.174.1633791865361; Sat, 09 Oct 2021 08:04:25 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyF9llgfKr2l0D+gQeVwBpsJ9PAaPg/xVjoz4MFtdXuQ4xW5YvxaP1bSiHbchSYu6lx6xUVCVjYWofIIahGS+Y= X-Received: by 2002:a2e:8799:: with SMTP id n25mr10937062lji.174.1633791865071; Sat, 09 Oct 2021 08:04:25 -0700 (PDT) MIME-Version: 1.0 References: <20211008151222.37790-1-aldyh@redhat.com> In-Reply-To: From: Aldy Hernandez Date: Sat, 9 Oct 2021 17:04:14 +0200 Message-ID: Subject: Re: [PATCH] Convert strlen pass from evrp to ranger. To: Martin Sebor Cc: Jakub Jelinek , Martin Sebor , GCC patches X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-5.6 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 09 Oct 2021 15:04:35 -0000 On Fri, Oct 8, 2021 at 6:52 PM Martin Sebor wrote: > > On 10/8/21 9:12 AM, Aldy Hernandez via Gcc-patches wrote: > > The following patch converts the strlen pass from evrp to ranger, > > leaving DOM as the last remaining user. > > Thanks for doing this. I know I said I'd work on it but I'm still > bogged down in my stage 1 work that's not going so great :( I just > have a few minor comments/questions on the strlen change (inline) > but I am a bit concerned about the test failure. > > > > > No additional cleanups have been done. For example, the strlen pass > > still has uses of VR_ANTI_RANGE, and the sprintf still passes around > > pairs of integers instead of using a proper range. Fixing this > > could further improve these passes. > > > > As a further enhancement, if the relevant maintainers deem useful, > > the domwalk could be removed from strlen. That is, unless the pass > > needs it for something else. > > > > With ranger we are now able to remove the range calculation from > > before_dom_children entirely. Just working with the ranger on-demand > > catches all the strlen and sprintf testcases with the exception of > > builtin-sprintf-warn-22.c which is due to a limitation of the sprintf > > code. I have XFAILed the test and documented what the problem is. > > builtin-sprintf-warn-22.c is a regression test for a false positive > in Glibc. If it fails we'll have to deal with the Glibc failure > again, which I would rather avoid. Have you checked to see if > Glibc is affected by the change? > > > > > It looks like the same problem in the sprintf test triggers a false > > positive in gimple-ssa-warn-access.cc so I have added > > -Wno-format-overflow until it can be fixed. > > > > I can expand on the false positive if necessary, but the gist is that > > this: > > > > _17 = strlen (_132); > > _18 = strlen (_136); > > _19 = _18 + _17; > > if (_19 > 75) > > goto ; [0.00%] > > else > > goto ; [100.00%] > > > > ...dominates the sprintf in BB61. This means that ranger can figure > > out that the _17 and _18 are [0, 75]. On the other hand, evrp > > returned a range of [0, 9223372036854775805] which presumably the > > sprintf code was ignoring as a false positive here: > > This is a feature designed to avoid false positives when the sprintf > pass doesn't know anything about the strings (i.e., their lengths > are unconstrained by either the sizes of the arrays they're stored > in or any expressions like asserts involving their lengths). > > It sounds like the strlen/ranger improvement partially propagates > constraints from subsequent expressions into the strlen results > but it doesn't go far enough for them to actually fully satisfy > the constraint, which is what in turn triggers the warning. > > I.e., in the test: > > void g (char *s1, char *s2) > { > char b[1025]; > size_t n = __builtin_strlen (s1), d = __builtin_strlen (s2); > if (n + d + 1 >= 1025) > return; > > sprintf (b, "%s.%s", s1, s2); // { dg-bogus "\\\[-Wformat-overflow" } > > the range of n and d is [0, INF] and so the sprintf call doesn't > trigger a warning. With your change, because their range is > [0, 1023] each (and there's no way to express that their sum > is less than 1025), the warning triggers because it considers > the worst case scenario (the upper bounds of both). I agree with Andrew. If this doesn't work with more than 1 string we shouldn't even attempt it, as it's bound to be riddled with false positives, which you'll get more of with enhanced ranges at this point. > > @@ -269,7 +270,7 @@ compare_nonzero_chars (strinfo *si, unsigned HOST_WIDE_INT off, > > return -1; > > > > value_range vr; > > - if (!rvals->range_of_expr (vr, si->nonzero_chars, si->stmt)) > > + if (!rvals->range_of_expr (vr, si->nonzero_chars, stmt)) > > We need stmt rather than si->stmt because the latter may be different > or null when the former will always refer to the current statement, > correct? Yes. > I think there still are quite a few calls to get_addr_stridx() and > get_addr() that don't pass in rvals. I started doing this back in > the GCC 11 cycle but didn't finish the job. Eventually, rvals > should be passed to all their callers (or they made class members > with a ranger member). I mention this in case it has an effect on > the range propagation (since the pass also sets ranges). Definitely. You'll get even better ranges, though perhaps more false positives as discussed above :-/. Aldy