From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 6751 invoked by alias); 21 Sep 2018 17:13:48 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 6739 invoked by uid 89); 21 Sep 2018 17:13:48 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-10.4 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,GIT_PATCH_2,GIT_PATCH_3,KAM_ASCII_DIVIDERS,KAM_SHORT,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.2 spammy=belong, perfectly, propagation, copied X-HELO: mail-qk1-f194.google.com Received: from mail-qk1-f194.google.com (HELO mail-qk1-f194.google.com) (209.85.222.194) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 21 Sep 2018 17:13:45 +0000 Received: by mail-qk1-f194.google.com with SMTP id z78-v6so8642170qka.0 for ; Fri, 21 Sep 2018 10:13:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=subject:to:references:cc:from:message-id:date:user-agent :mime-version:in-reply-to:content-transfer-encoding; bh=xmXTUsqKsK0ORVAck5d4swu4Z25gHG6XPl9F8AaQNQE=; b=PEOqQkfbkSHK1Zhr+q4UeK+7cE9gs9Xb8YJCxOuglO2YCOyKpHyvvTqXyLYlaJmBKE qejmk5qVLGg12Ji0W3MIDoj4KbQHD+pZjShvNVMtfK733gsVg53koDFqhu2nGdZuqdK3 PHDj90chZ8NDwUc1wE7+pKapAXFBRcrs80DCR/eXv4CI5s/oAPWlXY1hZqde4+W3RVjm BNlYJLC4CABvHyGQ+2PdvfDFn0J72aIjue2z1bDh0w+9rVQkwc30GpKYTBaEL7OiLZIQ mOh0jdMRLJwcfeA1tOgLB4M7orHVcAKR6vwt5ZevPlpHVsFD8ePd3q8dJ2WUVc0qEn5o 0nCQ== Return-Path: Received: from localhost.localdomain (97-118-105-75.hlrn.qwest.net. [97.118.105.75]) by smtp.gmail.com with ESMTPSA id t28-v6sm4101789qtc.67.2018.09.21.10.13.41 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 21 Sep 2018 10:13:42 -0700 (PDT) Subject: Re: [PATCH] avoid warning on constant strncpy until next statement is reachable (PR 87028) To: Jeff Law , Richard Biener References: <88de1ee3-6ee4-d8d9-3e57-3a42474a4169@redhat.com> <4e613d61-40ad-f05f-9107-5fd7a5c2fdb6@gmail.com> <6aaa70ca-af7f-2b34-43bc-953a02defe03@gmail.com> Cc: GCC Patches From: Martin Sebor Message-ID: <7579c7b4-6594-9dbc-9dd1-505361aa0608@gmail.com> Date: Fri, 21 Sep 2018 17:40:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit X-IsSubscribed: yes X-SW-Source: 2018-09/txt/msg01249.txt.bz2 On 09/17/2018 07:30 PM, Jeff Law wrote: > On 8/28/18 6:12 PM, Martin Sebor wrote: >>>> Sadly, dstbase is the PARM_DECL for d. That's where things are going >>>> "wrong". Not sure why you're getting the PARM_DECL in that case. I'd >>>> debug get_addr_base_and_unit_offset to understand what's going on. >>>> Essentially you're getting different results of >>>> get_addr_base_and_unit_offset in a case where they arguably should be >>>> the same. >>> >>> Probably get_attr_nonstring_decl has the same "mistake" and returns >>> the PARM_DECL instead of the SSA name pointer. So we're comparing >>> apples and oranges here. >> >> Returning the SSA_NAME_VAR from get_attr_nonstring_decl() is >> intentional but the function need not (perhaps should not) >> also set *REF to it. >> >>> >>> Yeah: >>> >>> /* If EXPR refers to a character array or pointer declared attribute >>> nonstring return a decl for that array or pointer and set *REF to >>> the referenced enclosing object or pointer. Otherwise returns >>> null. */ >>> >>> tree >>> get_attr_nonstring_decl (tree expr, tree *ref) >>> { >>> tree decl = expr; >>> if (TREE_CODE (decl) == SSA_NAME) >>> { >>> gimple *def = SSA_NAME_DEF_STMT (decl); >>> >>> if (is_gimple_assign (def)) >>> { >>> tree_code code = gimple_assign_rhs_code (def); >>> if (code == ADDR_EXPR >>> || code == COMPONENT_REF >>> || code == VAR_DECL) >>> decl = gimple_assign_rhs1 (def); >>> } >>> else if (tree var = SSA_NAME_VAR (decl)) >>> decl = var; >>> } >>> >>> if (TREE_CODE (decl) == ADDR_EXPR) >>> decl = TREE_OPERAND (decl, 0); >>> >>> if (ref) >>> *ref = decl; >>> >>> I see a lot of "magic" here again in the attempt to "propagate" >>> a nonstring attribute. >> >> That's the function's purpose: to look for the attribute. Is >> there a better way to do this? >> >>> Note >>> >>> foo (char *p __attribute__(("nonstring"))) >>> { >>> p = "bar"; >>> strlen (p); // or whatever is necessary to call get_attr_nonstring_decl >>> } >>> >>> is perfectly valid and p as passed to strlen is _not_ nonstring(?). >> >> I don't know if you're saying that it should get a warning or >> shouldn't. Right now it doesn't because the strlen() call is >> folded before we check for nonstring. >> >> I could see an argument for diagnosing it but I suspect you >> wouldn't like it because it would mean more warning from >> the folder. I could also see an argument against it because, >> as you said, it's safe. >> >> If you take the assignment to p away then a warning is issued, >> and that's because p is declared with attribute nonstring. >> That's also why get_attr_nonstring_decl looks at SSA_NAME_VAR. >> >>> I think in your code comparing bases you want to look at the _original_ >>> argument to the string function rather than what get_attr_nonstring_decl >>> returned as ref. >> >> I've adjusted get_attr_nonstring_decl() to avoid setting *REF >> to SSA_NAME_VAR. That let me remove the GIMPLE_NOP code from >> the patch. I've also updated the comment above SSA_NAME_VAR >> to clarify its purpose per Jeff's comments. >> >> Attached is an updated revision with these changes. >> >> Martin >> >> gcc-87028.diff >> >> PR tree-optimization/87028 - false positive -Wstringop-truncation strncpy with global variable source string >> gcc/ChangeLog: >> >> PR tree-optimization/87028 >> * calls.c (get_attr_nonstring_decl): Avoid setting *REF to >> SSA_NAME_VAR. >> * gimple-fold.c (gimple_fold_builtin_strncpy): Avoid folding >> when statement doesn't belong to a basic block. >> * tree.h (SSA_NAME_VAR): Update comment. >> * tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Simplify. >> >> gcc/testsuite/ChangeLog: >> >> PR tree-optimization/87028 >> * c-c++-common/Wstringop-truncation.c: Remove xfails. >> * gcc.dg/Wstringop-truncation-5.c: New test. >> > >> Index: gcc/calls.c >> =================================================================== >> --- gcc/calls.c (revision 263928) >> +++ gcc/calls.c (working copy) >> @@ -1503,6 +1503,7 @@ tree >> get_attr_nonstring_decl (tree expr, tree *ref) >> { >> tree decl = expr; >> + tree var = NULL_TREE; >> if (TREE_CODE (decl) == SSA_NAME) >> { >> gimple *def = SSA_NAME_DEF_STMT (decl); >> @@ -1515,17 +1516,25 @@ get_attr_nonstring_decl (tree expr, tree *ref) >> || code == VAR_DECL) >> decl = gimple_assign_rhs1 (def); >> } >> - else if (tree var = SSA_NAME_VAR (decl)) >> - decl = var; >> + else >> + var = SSA_NAME_VAR (decl); >> } >> >> if (TREE_CODE (decl) == ADDR_EXPR) >> decl = TREE_OPERAND (decl, 0); >> >> + /* To simplify calling code, store the referenced DECL regardless of >> + the attribute determined below, but avoid storing the SSA_NAME_VAR >> + obtained above (it's not useful for dataflow purposes). */ >> if (ref) >> *ref = decl; >> >> - if (TREE_CODE (decl) == ARRAY_REF) >> + /* Use the SSA_NAME_VAR that was determined above to see if it's >> + declared nonstring. Otherwise drill down into the referenced >> + DECL. */ >> + if (var) >> + decl = var; >> + else if (TREE_CODE (decl) == ARRAY_REF) >> decl = TREE_OPERAND (decl, 0); >> else if (TREE_CODE (decl) == COMPONENT_REF) >> decl = TREE_OPERAND (decl, 1); > The more I look at this the more I think what we really want to be doing > is real propagation of the property either via the alias oracle or a > propagation engine. You can't even guarantee that if you've got an > SSA_NAME that the value it holds has any relation to its underlying > SSA_NAME_VAR -- the value in the SSA_NAME could well have been copied > from a some other SSA_NAME with a different underlying SSA_NAME_VAR. > > I'm not going to insist on it, but I think if we find ourselves > extending this again in a way that is really working around lack of > propagation of the property then we should go back and fix the > propagation problem. We talked about improving this back in the GCC 8 cycle. I've been collecting input (and test cases) from Miguel Ojeda from the adoption of the attribute in the Linux kernel. There are a number of issues I was hoping to get to in stage 1 but that has been derailed by all the strlen back and forth. I'm still hoping to be able to fix some of the false positives here in stage 3 but, IIUC the constraints, a redesign along the lines you suggest would be considered overly intrusive. (If not, I'm willing to look into it.) That said, I had the impression from Richard's comments that implementing the propagation in points-to analysis would come at a cost and have its own downsides: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01954.html So I wasn't sure it was necessarily an endorsement of the approach as the ideal solution or just a passing thought. >> Index: gcc/gimple-fold.c >> =================================================================== >> --- gcc/gimple-fold.c (revision 263925) >> +++ gcc/gimple-fold.c (working copy) >> @@ -1702,6 +1702,11 @@ gimple_fold_builtin_strncpy (gimple_stmt_iterator >> if (tree_int_cst_lt (ssize, len)) >> return false; >> >> + /* Defer warning (and folding) until the next statement in the basic >> + block is reachable. */ >> + if (!gimple_bb (stmt)) >> + return false; >> + >> /* Diagnose truncation that leaves the copy unterminated. */ >> maybe_diag_stxncpy_trunc (*gsi, src, len); > I thought Richi wanted the guard earlier (maybe_fold_stmt) -- it wasn't > entirely clear to me if the subsequent comments about needing to fold > early where meant to raise issues with guarding earlier or not. I'm fine with moving it if that's preferable. Moving the test to maybe_fold_stmt() would, IMO, be the right change to make in general, at least for library built-ins. I have been meaning to suggest it independently of this fix but because of its pervasive impact I've been holding off, expecting it to be controversial. If there is consensus I'm happy to make this change but I would prefer to do it separately since it causes a number of regressions in tests that expect built-ins to be folded very early on (i.e., look for evidence of the folding in the output of -fdump-tree-gimple or -fdump-tree-ccp1). Some of the regression would go away if maybe_fold_stmt() only avoided folding of library built-in functions. Resolving the others would require adjusting the tests to either use optimization or look for the evidence of folding in later passes than gimple or ccp1). I think all that is reasonable and won't impact the efficiency of the emitted object code, but it's obviously a much bigger change than a simple fix for a false positive warning. If that sounds reasonable, is the patch acceptable as is? The latest version is here: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01818.html Martin