From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 14308 invoked by alias); 19 Sep 2018 14:19:21 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 14278 invoked by uid 89); 19 Sep 2018 14:19:20 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.2 spammy=built-ins X-HELO: mail-qt0-f173.google.com Received: from mail-qt0-f173.google.com (HELO mail-qt0-f173.google.com) (209.85.216.173) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 19 Sep 2018 14:19:18 +0000 Received: by mail-qt0-f173.google.com with SMTP id k38-v6so5191857qtk.11 for ; Wed, 19 Sep 2018 07:19:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=subject:to:references:cc:from:message-id:date:user-agent :mime-version:in-reply-to:content-transfer-encoding; bh=3F6nnz7NqSYBgBrbBc1WykmueaAfOKo2GVuXrmaelts=; b=YGQ6KuM8eNhJYGEX77pAq8b57eJbLUlAcTODhFLGt6drkumdz6vJK9QNY7E2XD7g2S mWzG5Njdq3Usw22iAtyuxiv94IIzXpC6dcdIfQuIcdQw89YTepEkff724gu/nkvhBaJm Xko9fJ/C8BvAVMfmfvoNYOxNcuhTVpgcw78G/36dmgwR8fU8uzOnl3Yib8q+frevm3F8 h4KlvzlVJyFfJTruJnzlNCkqvQNYvt6bQO0WrGirF6LixfTBi3lWNJE8qXdDmHr1loTw iMKhsNssMrtzYGQs5mYR7JOiXI2Dw90J/jkJv540Jwnq5eanYKWV3ppMoSOY9AkVfdTF Pv0A== Return-Path: Received: from localhost.localdomain (97-118-105-75.hlrn.qwest.net. [97.118.105.75]) by smtp.gmail.com with ESMTPSA id b3-v6sm15795861qtb.80.2018.09.19.07.19.14 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 19 Sep 2018 07:19:14 -0700 (PDT) Subject: Re: [PATCH] look harder for MEM_REF operand equality to avoid -Wstringop-truncation (PR 84561) To: Jeff Law , Richard Biener References: <6c2126fa-32b3-693b-e3da-cf70391710bf@gmail.com> <346c0950-ac9e-b7b4-799f-a8b661f7d55b@gmail.com> <9ca1c7dc-041a-3e15-a191-34d16a68aa2f@gmail.com> <1555c8f2-019a-1164-d1cc-650b78230c63@redhat.com> <26c29db3-6e0c-7df4-c779-cf01e958c654@gmail.com> <51ebb626-7207-edf8-9a70-c1ec525d24f3@gmail.com> <16290c2f-6203-7709-e7ec-1d56de6c20a2@redhat.com> <3911c594-d8f8-d2d5-0d6f-70c62275ce12@redhat.com> Cc: GCC Patches From: Martin Sebor Message-ID: <4aa12e16-f1ab-4311-ac4e-0050ce3c9218@gmail.com> Date: Wed, 19 Sep 2018 14:31:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0 MIME-Version: 1.0 In-Reply-To: <3911c594-d8f8-d2d5-0d6f-70c62275ce12@redhat.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit X-IsSubscribed: yes X-SW-Source: 2018-09/txt/msg01061.txt.bz2 On 09/18/2018 10:23 PM, Jeff Law wrote: > On 9/18/18 1:46 PM, Martin Sebor wrote: >> On 09/18/2018 12:58 PM, Jeff Law wrote: >>> On 9/18/18 11:12 AM, Martin Sebor wrote: >>> >>>>> My bad. Sigh. CCP doesn't track copies, just constants, so there's not >>>>> going to be any data structure you can exploit. And I don't think >>>>> there's a value number you can use to determine the two objects are the >>>>> same. >>>>> >>>>> Hmm, let's back up a bit, what is does the relevant part of the IL look >>>>> like before CCP? Is the real problem here that we have unpropagated >>>>> copies lying around in the IL? Hmm, more likely the IL looksl ike: >>>>> >>>>> _8 = &pb_3(D)->a; >>>>> _9 = _8; >>>>> _1 = _9; >>>>> strncpy (MEM_REF (&pb_3(D)->a), ...); >>>>> MEM[(struct S *)_1].a[n_7] = 0; >>>> >>>> Yes, that is what the folder sees while the strncpy call is >>>> being transformed/folded by ccp. The MEM_REF is folded just >>>> after the strncpy call and that's when it's transformed into >>>> >>>> MEM[(struct S *)_8].a[n_7] = 0; >>>> >>>> (The assignments to _1 and _9 don't get removed until after >>>> the dom walk finishes). >>>> >>>>> >>>>> If we were to propagate the copies out we'd at best have: >>>>> >>>>> _8 = &pb_3(D)->a; >>>>> strncpy (MEM_REF (&pb_3(D)->a), ...); >>>>> MEM[(struct S *)_8].a[n_7] = 0; >>>>> >>>>> >>>>> Is that in a form you can handle? Or would we also need to forward >>>>> propagate the address computation into the use of _8? >>>> >>>> The above works as long as we look at the def_stmt of _8 in >>>> the MEM_REF (we currently don't). That's also what the last >>>> iteration of the loop does. In this case (with _8) it would >>>> be discovered in the first iteration, so the loop could be >>>> replaced by a simple if statement. >>>> >>>> But I'm not sure I understand the concern with the loop. Is >>>> it that we are looping at all, i.e., the cost? Or that ccp >>>> is doing something wrong or suboptimal? (Should have >>>> propagated the value of _8 earlier?) >>> I suspect it's more a concern that things like copies are typically >>> propagated away. So their existence in the IL (and consequently your >>> need to handle them) raises the question "has something else failed to >>> do its job earlier". >>> >>> During which of the CCP passes is this happening? Can we pull the >>> warning out of the folder (even if that means having a distinct warning >>> pass over the IL?) >> >> It happens during the third run of the pass. >> >> The only way to do what you suggest that I could think of is >> to defer the strncpy to memcpy transformation until after >> the warning pass. That was also my earlier suggestion: defer >> both it and the warning until the tree-ssa-strlen pass (where >> the warning is implemented to begin with -- the folder calls >> into it). > If it's happening that late (CCP3) in general, then ISTM we ought to be > able to get the warning out of the folder. We just have to pick the > right spot. > > warn_restrict runs before fold_all_builtins, but after dom/vrp so we > should have the IL in pretty good shape. That seems like about the > right time. > > I wonder if we could generalize warn_restrict to be a more generic > warning pass over the IL and place it right before fold_builtins. The restrict pass doesn't know about string lengths so it can't handle all the warnings about string built-ins (the strlen pass now calls into it to issue some). The strlen pass does so it could handle most if not all of them (the folder also calls into it to issue some warnings). It would work even better if it were also integrated with the object size pass. We're already working on merging strlen with sprintf. It seems to me that the strlen pass would benefit not only from that but also from integrating with object size and warn-restrict. With that, -Wstringop-overflow could be moved from builtins.c into it as well (and also benefit not only from accurate string lengths but also from the more accurate object size info). What do you think about that? Martin PS I don't think I could do more than merger strlen and sprintf before stage 1 ends (if even that much) so this would be a longer term goal.