From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 5944 invoked by alias); 24 Oct 2019 14:46:21 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 5895 invoked by uid 89); 24 Oct 2019 14:46:17 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-2.7 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,KAM_SHORT,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.1 spammy=formula, up, so, sas X-HELO: mail-qt1-f196.google.com Received: from mail-qt1-f196.google.com (HELO mail-qt1-f196.google.com) (209.85.160.196) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 24 Oct 2019 14:46:14 +0000 Received: by mail-qt1-f196.google.com with SMTP id t8so20680270qtc.6 for ; Thu, 24 Oct 2019 07:46:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=subject:from:to:references:message-id:date:user-agent:mime-version :in-reply-to:content-language:content-transfer-encoding; bh=5GfVD40MIlNJBKehFdq0WZYiWO3ZY/jjHlIkhAZN72Y=; b=hCcZ6KtseWE3qa1Y02hzHb+caL8XjlpgRV92T61s4trilWDCtFcUKGbtgTyORDbHWl /58MPkWZ0BXk/i7S7LJ4v0rKK8HnGSyxuFx/nJE+XUe+BMR9tzdwMj0JQXov7/2osHZB 2QRfOtI7z3ZAWc5Mim1c26LhOoQk39zIBAT924HAE1/vMfEEGrZ3e7+zHKR3/JVs89qF 9Gy0F22zaqbEikYpMidwtGc5q90X1Plw7ssLdC4MHZxDId+I16Se05m/Gbd7AIPREai5 /cJNEOAI0l2gZD+yfRxf3w9GNdfa5sTnfGD44yb2FJRvig7qvOf/f1GiQcVpxNhJkYgn Cu2g== Return-Path: Received: from [172.21.143.231] (rrcs-50-75-166-42.nys.biz.rr.com. [50.75.166.42]) by smtp.gmail.com with ESMTPSA id x38sm10374554qtc.64.2019.10.24.07.46.11 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 24 Oct 2019 07:46:11 -0700 (PDT) Subject: [PING][PATCH] bring -Warray-bounds closer to -Wstringop-overflow (PR91647, 91463, 91679) From: Martin Sebor To: Jeff Law , gcc-patches References: <5573d108-2441-8d0f-6c60-514b4fd0db3e@gmail.com> Message-ID: Date: Thu, 24 Oct 2019 14:47:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.3.0 MIME-Version: 1.0 In-Reply-To: <5573d108-2441-8d0f-6c60-514b4fd0db3e@gmail.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit X-IsSubscribed: yes X-SW-Source: 2019-10/txt/msg01749.txt.bz2 Ping: https://gcc.gnu.org/ml/gcc-patches/2019-10/msg00860.html Should I add something like the -Wzero-length-array-bounds option to allow some of the questionable idioms seen in the kernel? On 10/11/2019 10:34 AM, Martin Sebor wrote: > On 9/10/19 4:35 PM, Jeff Law wrote: >> On 9/6/19 1:27 PM, Martin Sebor wrote: >>> Recent enhancements to -Wstringop-overflow improved the warning >>> to the point that it detects a superset of the problems -Warray- >>> bounds is intended detect in character accesses.  Because both >>> warnings detect overlapping sets of problems, and because the IL >>> they work with tends to change in subtle ways from target to >>> targer, tests designed to verify one or the other sometimes fail >>> with a target where the warning isn't robust enough to detect >>> the problem given the IL representation. >>> >>> To reduce these test suite failures the attached patch extends >>> -Warray-bounds to handle some of the same problems -Wstringop- >>> overflow does, pecifically, out-of-bounds accesses to array >>> members of structs, including zero-length arrays and flexible >>> array members of defined objects. >>> >>> In the process of testing the enhancement I realized that >>> the recently added component_size() function doesn't work as >>> intended for non-character array members (see below).  The patch >>> corrects this by reverting back to the original implementation >>> of the function until the better/simpler solution can be put in >>> place as mentioned below. >>> >>> Tested on x86_64-linux. >>> >>> Martin >>> >>> >>> [*] component_size() happens to work for char arrays because those >>> are transformed to STRING_CSTs, but not for arrays that are not. >>> E.g., in >>> >>>    struct S { int64_t i; int16_t j; int16_t a[]; } >>>      s = { 0, 0, { 1, 0 } }; >>> >>> unless called with type set to int16_t[2], fold_ctor_reference >>> will return s.a[0] rather than all of s.a.  But set type to >>> int16_t[2] we would need to know that s.a's initializer has two >>> elements, and that's just what we're using fold_ctor_reference >>> to find out. >>> >>> I think this could probably be made to work somehow by extending >>> useless_type_conversion_p to handle this case as special somehow, >>> but it doesn't seem worth the effort given that there should be >>> an easier way to do it as you noted below. >>> >>> Given the above, the long term solution should be to rely on >>> DECL_SIZE_UNIT(decl) - TYPE_SIZE_UNIT(decl_type) as Richard >>> suggested in the review of its initial implementation. >>> Unfortunately, because of bugs in both the C and C++ front ends >>> (I just opened PR 65403 with the details) the simple formula >>> doesn't give the right answers either.  So until the bugs are >>> fixed, the patch reverts back to the original loopy solution. >>> It's no more costly than the current fold_ctor_reference >>> approach. > ... >> >> So no concerns with the patch itself, just the fallout you mentioned in >> a follow-up message.  Ideally we'd have glibc and the kernel fixed >> before this goes in, but I'd settle for just getting glibc fixed since >> we have more influence there. > > Half of the issues there were due to a bug in the warning.  The rest > are caused by Glibc's use of interior zero-length arrays to access > subsequent members.  It works in simple cases but it's very brittle > because GCC assumes that even such members don't alias. If it's meant > to be a supported feature then aliasing would have to be changed to > take it into account.  But I'd rather encourage projects to move away > from these dangerous hacks and towards cleaner, safer code. > > I've fixed the bug in the attached patch.  The rest can be suppressed > by replacing the zero-length arrays with flexible array members but > that's just trading one misuse for another.  If the code can't be > changed to avoid this (likely not an option since the arrays are in > visible in the public API) I think the best way to deal with them is > to suppress them by #pragma GCC diagnostic ignored.  I opened BZ 25097 > in Glibc Bugzilla to track this. > >> Out of curiosity are the kernel issues you raised due to flexible arrays >> or just cases where we're doing a better job on normal objects?  I'd be >> a bit surprised to find flexible arrays in the kernel. > > I don't think I've come across any flexible arrays in the kernel. > > The patch triggers 94 instances of -Warray-bounds (60 of which > are for distinct code) in 21 .c files.  I haven't looked at all > of them but some of the patterns I noticed are: > > 1) Intentionally using an interior zero-length array to access >    (e.g., memset) one or more subsequent members. E.g., >    _dbgp_external_startup in drivers/usb/early/ehci-dbgp.c and >    quite a few others.  This is pretty pervasive but seems easily >    avoidable. > > 2) Overwriting a member array with more data (e.g., function >    cxio_rdev_open in >    drivers/infiniband/hw/cxgb3/cxio_hal.c or in function >    pk_probe in drivers/hid/hid-prodikeys.c).  At first glance >    some of these look like bugs but with stuff obscured by macros >    and no comments it's hard to tell. > > 3) Uses of the container_of() macro to access one member given >    the address of another.  This is undefined (and again breaks >    the aliasing rules) but the macro is used all over the place >    in the kernel.  I count over 15,000 references to it. > > 4) Uses of one-element arrays as members of other one-element >    arrays (in include/scsi/fc/fc_ms.h).  Was this ever meant >    to be supported by GCC?  (It isn't by _FORTIFY_SOURCE=2.) > > 5) Possible false positives due to the recent loop unrolling >    change. > > It will be a quite a bit of work to clean this up.  To make it > easier we would introduce a new option to control the warning > for some of the most common idioms, such as > -Wzero-length-array-bounds.  I'm not too wild about this because > it would just paper over the problem.  A better solution would > also involve avoiding the aliasing assumptions for overlapping > zero-length member arrays. > > Anyway, attached is the updated patch with just the one fix > I mentioned above, retested on x86_64-linux. > > Martin