From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 61005 invoked by alias); 8 Nov 2019 16:57:58 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 60996 invoked by uid 89); 8 Nov 2019 16:57:58 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: =?ISO-8859-1?Q?No, score=-10.0 required=5.0 tests=AWL,BAYES_80,FREEMAIL_FROM,GIT_PATCH_2,GIT_PATCH_3,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=no version=3.3.1 spammy=tested=c2, attached=c2, In, in=c2?= X-HELO: mail-yb1-f193.google.com Received: from mail-yb1-f193.google.com (HELO mail-yb1-f193.google.com) (209.85.219.193) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 08 Nov 2019 16:57:56 +0000 Received: by mail-yb1-f193.google.com with SMTP id y18so3060229ybs.7 for ; Fri, 08 Nov 2019 08:57:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=subject:from:to:references:message-id:date:user-agent:mime-version :in-reply-to:content-language:content-transfer-encoding; bh=WzTaQGL/slN3p7Nmlg1gp4wplpKlsyk6OosYkslBUUc=; b=FUPtu9GG2mBaGCehjX5sG5JnzsMWIe0/a5rge+lQjTRPXHSUrnpeHRfsXytUIvOAvP Veb3m0xiHMDiOUmAwQWKT8q9UBsG24ngRIswvfe3lO3WjslqqOPlKz9fXF2YPE8bI5Y+ f89OnwUHYlKBL2aqfoKOUnYE2K8m/I2sjzCnoi8lLkomgegWAJHkZ10IDNcX+Xkx6fGm DkX42l/uEiWzQki1+8j8j6QTQUDwUTXz/etPF1+y6iaJ6K5TFd03IVIwxSBXB0+D3I2i 4lj+8VZR0sqtwOcp0T4a7o6zK6MrO/+ezp8zdxbKqhVZGDxI0NU8E28puclcSKKJQ6M5 KErA== Return-Path: Received: from [192.168.0.41] (97-118-98-145.hlrn.qwest.net. [97.118.98.145]) by smtp.gmail.com with ESMTPSA id b196sm6580125ywh.8.2019.11.08.08.57.52 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 08 Nov 2019 08:57:53 -0800 (PST) Subject: Re: [PATCH] include size and offset in -Wstringop-overflow From: Martin Sebor To: Jeff Law , gcc-patches References: <9d3210bf-e1bd-81ac-b1a0-5c66c8ec6db7@gmail.com> <6f6bdef6-0d0d-0303-b870-10ff2b036444@redhat.com> <41e6f241-2361-487d-1bb0-c376df10a116@gmail.com> <517e840d-7b24-7bbd-1981-0db5e3c9b036@redhat.com> Message-ID: Date: Fri, 08 Nov 2019 16:57:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.6.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit X-IsSubscribed: yes X-SW-Source: 2019-11/txt/msg00638.txt.bz2 On 11/6/19 2:06 PM, Martin Sebor wrote: > On 11/6/19 1:39 PM, Jeff Law wrote: >> On 11/6/19 1:27 PM, Martin Sebor wrote: >>> On 11/6/19 11:55 AM, Jeff Law wrote: >>>> On 11/6/19 11:00 AM, Martin Sebor wrote: >>>>> The -Wstringop-overflow warnings for single-byte and multi-byte >>>>> stores mention the amount of data being stored and the amount of >>>>> space remaining in the destination, such as: >>>>> >>>>> warning: writing 4 bytes into a region of size 0 [-Wstringop-overflow=] >>>>> >>>>>     123 |   *p = 0; >>>>>         |   ~~~^~~ >>>>> note: destination object declared here >>>>>      45 |   char b[N]; >>>>>         |        ^ >>>>> >>>>> A warning like this can take some time to analyze.  First, the size >>>>> of the destination isn't mentioned and may not be easy to tell from >>>>> the sources.  In the note above, when N's value is the result of >>>>> some non-trivial computation, chasing it down may be a small project >>>>> in and of itself.  Second, it's also not clear why the region size >>>>> is zero.  It could be because the offset is exactly N, or because >>>>> it's negative, or because it's in some range greater than N. >>>>> >>>>> Mentioning both the size of the destination object and the offset >>>>> makes the existing messages clearer, are will become essential when >>>>> GCC starts diagnosing overflow into allocated buffers (as my >>>>> follow-on patch does). >>>>> >>>>> The attached patch enhances -Wstringop-overflow to do this by >>>>> letting compute_objsize return the offset to its caller, doing >>>>> something similar in get_stridx, and adding a new function to >>>>> the strlen pass to issue this enhanced warning (eventually, I'd >>>>> like the function to replace the -Wstringop-overflow handler in >>>>> builtins.c).  With the change, the note above might read something >>>>> like: >>>>> >>>>> note: at offset 11 to object ‘b’ with size 8 declared here >>>>>      45 |   char b[N]; >>>>>         |        ^ >>>>> >>>>> Tested on x86_64-linux. >>>>> >>>>> Martin >>>>> >>>>> gcc-store-offset.diff >>>>> >>>>> gcc/ChangeLog: >>>>> >>>>>      * builtins.c (compute_objsize): Add an argument and set it to >>>>> offset >>>>>      into destination. >>>>>      * builtins.h (compute_objsize): Add an argument. >>>>>      * tree-object-size.c (addr_object_size): Add an argument and >>>>> set it >>>>>      to offset into destination. >>>>>      (compute_builtin_object_size): Same. >>>>>      * tree-object-size.h (compute_builtin_object_size): Add an >>>>> argument. >>>>>      * tree-ssa-strlen.c (get_addr_stridx): Add an argument and set it >>>>>      to offset into destination. >>>>>      (maybe_warn_overflow): New function. >>>>>      (handle_store): Call maybe_warn_overflow to issue warnings. >>>>> >>>>> gcc/testsuite/ChangeLog: >>>>> >>>>>      * c-c++-common/Wstringop-overflow-2.c: Adjust text of expected >>>>> messages. >>>>>      * g++.dg/warn/Wstringop-overflow-3.C: Same. >>>>>      * gcc.dg/Wstringop-overflow-17.c: Same. >>>>> >>>> >>>>> Index: gcc/tree-ssa-strlen.c >>>>> =================================================================== >>>>> --- gcc/tree-ssa-strlen.c    (revision 277886) >>>>> +++ gcc/tree-ssa-strlen.c    (working copy) >>>>> @@ -189,6 +189,52 @@ struct laststmt_struct >>>>>    static int get_stridx_plus_constant (strinfo *, unsigned >>>>> HOST_WIDE_INT, tree); >>>>>    static void handle_builtin_stxncpy (built_in_function, >>>>> gimple_stmt_iterator *); >>>>>    +/* Sets MINMAX to either the constant value or the range VAL is in >>>>> +   and returns true on success.  */ >>>>> + >>>>> +static bool >>>>> +get_range (tree val, wide_int minmax[2], const vr_values *rvals = >>>>> NULL) >>>>> +{ >>>>> +  if (tree_fits_uhwi_p (val)) >>>>> +    { >>>>> +      minmax[0] = minmax[1] = wi::to_wide (val); >>>>> +      return true; >>>>> +    } >>>>> + >>>>> +  if (TREE_CODE (val) != SSA_NAME) >>>>> +    return false; >>>>> + >>>>> +  if (rvals) >>>>> +    { >>>>> +      gimple *def = SSA_NAME_DEF_STMT (val); >>>>> +      if (gimple_assign_single_p (def) >>>>> +      && gimple_assign_rhs_code (def) == INTEGER_CST) >>>>> +    { >>>>> +      /* get_value_range returns [0, N] for constant assignments.  */ >>>>> +      val = gimple_assign_rhs1 (def); >>>>> +      minmax[0] = minmax[1] = wi::to_wide (val); >>>>> +      return true; >>>>> +    } >>>> Umm, something seems really off with this hunk.  If the SSA_NAME is set >>>> via a simple constant assignment, then the range ought to be a >>>> singleton >>>> ie [CONST,CONST].   Is there are particular test were this is not true? >>>> >>>> The only way offhand I could see this happening is if originally the >>>> RHS >>>> wasn't a constant, but due to optimizations it either simplified into a >>>> constant or a constant was propagated into an SSA_NAME appearing on the >>>> RHS.  This would have to happen between the last range analysis and the >>>> point where you're making this query. >>> >>> Yes, I think that's right.  Here's an example where it happens: >>> >>>    void f (void) >>>    { >>>      char s[] = "1234"; >>>      unsigned n = strlen (s); >>>      char vla[n];   // or malloc (n) >>>      vla[n] = 0;    // n = [4, 4] >>>      ... >>>    } >>> >>> The strlen call is folded to 4 but that's not propagated to >>> the access until sometime after the strlen pass is done. >> Hmm.  Are we calling set_range_info in that case?  That goes behind the >> back of pass instance of vr_values.  If so, that might argue we want to >> be setting it in vr_values too. > > No, set_range_info is only called for ranges.  In this case, > handle_builtin_strlen replaces the strlen() call with 4: > >   s = "1234"; >   _1 = __builtin_strlen (&s); >   n_2 = (unsigned int) _1; >   a.1_8 = __builtin_alloca_with_align (_1, 8); >   (*a.1_8)[n_2] = 0; > > When the access is made, the __builtin_alloca_with_align call > is found as the destination and the _1 SSA_NAME is used to > get its size.  We get back the range [4, 4]. > > This is only done when we record the allocation statements; > this patch doesn't do that yet.  It's meant be just the simple > infrastructure bits for the follow-up work. Did I answer your question or is there something else? Martin