From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 31050 invoked by alias); 2 Oct 2017 22:15:29 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 31040 invoked by uid 89); 2 Oct 2017 22:15:29 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-26.9 required=5.0 tests=BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,RP_MATCHES_RCVD,SPF_HELO_PASS autolearn=ham version=3.3.2 spammy=tighten, surprises, detection X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 02 Oct 2017 22:15:28 +0000 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 8D9337E430; Mon, 2 Oct 2017 22:15:26 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 8D9337E430 Authentication-Results: ext-mx03.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx03.extmail.prod.ext.phx2.redhat.com; spf=fail smtp.mailfrom=law@redhat.com Received: from localhost.localdomain (ovpn-112-25.rdu2.redhat.com [10.10.112.25]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5542660246; Mon, 2 Oct 2017 22:15:24 +0000 (UTC) Subject: Re: [PATCH 3/4] enhance overflow and truncation detection in strncpy and strncat (PR 81117) To: Martin Sebor , Gcc Patch List References: <13944863-99a8-4144-1703-c6e1a2f36425@gmail.com> <0bbc91cd-fcdb-be61-e1d0-4b230f23b1a9@redhat.com> <4f4fbd4c-cb46-b80d-5749-ebb6bb050bc4@gmail.com> From: Jeff Law Message-ID: <62c4aa45-9bda-b2de-75b8-7d5f6c75df9c@redhat.com> Date: Mon, 02 Oct 2017 22:15:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.3.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-IsSubscribed: yes X-SW-Source: 2017-10/txt/msg00085.txt.bz2 On 08/10/2017 01:29 PM, Martin Sebor wrote: >>> diff --git a/gcc/builtins.c b/gcc/builtins.c >>> index 016f68d..1aa9e22 100644 >>> --- a/gcc/builtins.c >>> +++ b/gcc/builtins.c >> [ ... ] >>> + >>> +  if (TREE_CODE (type) == ARRAY_TYPE) >>> +    { >>> +      /* Return the constant size unless it's zero (that's a >>> zero-length >>> +     array likely at the end of a struct).  */ >>> +      tree size = TYPE_SIZE_UNIT (type); >>> +      if (size && TREE_CODE (size) == INTEGER_CST >>> +      && !integer_zerop (size)) >>> +    return size; >>> +    } >> Q. Do we have a canonical test for the trailing array idiom?   In some >> contexts isn't it size 1?  ISTM This test needs slight improvement. >> Ideally we'd use some canonical test for detect the trailing array idiom >> rather than open-coding it here.  You might look at the array index >> warnings in tree-vrp.c to see if it's got a canonical test you can call >> or factor and use. > > You're right, there is an API for this (array_at_struct_end_p, > as Richard pointed out).  I didn't want to use it because it > treats any array at the end of a struct as a flexible array > member, but simple tests show that that's what -Wstringop- > overflow does now, and it wasn't my intention to tighten up > the checking under this change.  It surprises me that no tests > exposed this. Let me relax the check and think about proposing > to tighten it up separately. > > > >> What might be even better would be to use the immediate uses of the >> memory tag.  For your case there should be only one immediate use and it >> should point to the statement which NUL terminates the destination.  Or >> maybe that would be worse in that you only want to allow this exception >> when the statements are consecutive. > > I'll have to try this to better understand how it might work. It's actually quite simple. Rather than looking at the next statement in the chain via gsi_next_nondebug you follow the def->use chain for the memory tag associated with the string copy statement. /* Get the memory tag that is defined by this statement. */ defvar = gimple_vdef (gsi_stmt (gsi)); imm_use_iterator iter; gimple *use_stmt; if (num_imm_uses (defvar) == 1) { imm_use_terator iter; gimple *use_stmt; /* Iterate over the immediate uses of the memory tag. */ FOR_EACH_IMM_USE_STMT (use_stmt, ui, defvar) { Check if STMT is dst[i] = '\0' } } The check that there is a single immediate use is designed to make sure you get a warning for this scenario: strxncpy read the destination terminate the destination Which I think you'd want to consider non-terminated because of the read of the destination prior to termination. But avoids warnings for strxncpy stuff that doesn't read the destination terminate the destintion >> You still need to rename strlen_optimize_stmt since after your changes >> it does both optimizations and warnings. > > I'm not sure I understand why.  It's a pre-existing function that > just dispatches to the built-in handlers.  We don't rename function > callers each time we improve error/warning detection in some > function they call (case in point: all the expanders in builtins.c) > Why do it here?  And what would be a suitable name?  All that comes > to my mind is awkward variations on strlen_optimize_stmt_and_warn. Actually we often end up renaming functions as their capabilities change. If I was to read that name, I'd think its only purpose was to optimize. Given its static with a single use we should just fix it. Something in compute_objsize I just noticed. When DEST is an SSA_NAME, you follow the use->def chain back to its defining statement, then get a new dest from the RHS of that statement: > + if (TREE_CODE (dest) == SSA_NAME) > + { > + gimple *stmt = SSA_NAME_DEF_STMT (dest); > + if (!is_gimple_assign (stmt)) > + return NULL_TREE; > + > + dest = gimple_assign_rhs1 (stmt); > + } This seems wrong as-written -- you have no idea what the RHS code is. You're just blindly taking the first operand, then digging into its type. It probably works in practice, but it would seem better to verify that gimple_assign_rhs_code is a conversion first (CONVERT_EXPR_CODE_P). If it's not a CONVERT_EXPR_CODE_P, return NULL_TREE. [ Assuming the point here here is to look back through a conversion from an array type to a pointer. ] Jeff