From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-il1-x135.google.com (mail-il1-x135.google.com [IPv6:2607:f8b0:4864:20::135]) by sourceware.org (Postfix) with ESMTPS id 9CF933858D35 for ; Mon, 31 Jan 2022 21:49:45 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 9CF933858D35 Received: by mail-il1-x135.google.com with SMTP id x6so133811ilg.9 for ; Mon, 31 Jan 2022 13:49:45 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:date:mime-version:user-agent:subject :content-language:to:cc:references:from:in-reply-to :content-transfer-encoding; bh=scas0/tsGyxNleHd360cR2DBejD6lERkWwAqK4O+NII=; b=121mR2Vf7Brpcg/++jjZNnpMKcvxBNO3fS7JHXG2B12uE9iRfuHwR7il4br2vHDh0c 1Zd6374KhYt4X1T+GhNbszoKeGbDp4Cv9ajq2kCeXjGF0oD3Lbe2cfn167RvrMyy6l51 lv0T3wR2ZNa5ZX2T6MdAnCYeQ4kxa3r483lHwT8cxHjuZsjyreK8XFfj52Oyopp/+qmW WYRkPoIzOmqOYReiN4T/i/SxuAf6kdTu6XFv0ll+WCwPyLSCrUgtZIpVCAsxxiqwnv4L +0awgtNC29ndENGU4cAN1ODYKnlgMUpiAqGbPw6bs4gEBCfsLU2XhvmHTO7TNTd7IfL9 GSzw== X-Gm-Message-State: AOAM530I6+22trj5+lvw63cS7RxTclfpYoCnW6lidbU7omt/iarGkhDB PepYTNj9vs2A2cy/IL33hLc454dhr3M= X-Google-Smtp-Source: ABdhPJxEpZx5owL7HzaX6ZjRQeXxx8fgKbPefeFk6e4vi5RgnOFOxv0dMEePEQUHGZpo7yPxePhl9w== X-Received: by 2002:a05:6e02:1ba7:: with SMTP id n7mr13908030ili.290.1643665784915; Mon, 31 Jan 2022 13:49:44 -0800 (PST) Received: from [192.168.0.41] (97-118-100-142.hlrn.qwest.net. [97.118.100.142]) by smtp.gmail.com with ESMTPSA id i22sm18526476ila.85.2022.01.31.13.49.44 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 31 Jan 2022 13:49:44 -0800 (PST) Message-ID: Date: Mon, 31 Jan 2022 14:49:43 -0700 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.4.0 Subject: Re: Potentially false-positive -Wstringop-overflow= warning with gcc >= 11.1 Content-Language: en-US To: Dumitru Ceara , Segher Boessenkool Cc: gcc-help@gcc.gnu.org References: <8d42151f-7be2-4d67-5b46-a83ba55d5c1c@redhat.com> <20220128152748.GD614@gate.crashing.org> From: Martin Sebor In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-3.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, KAM_NUMSUBJECT, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-help@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-help mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 31 Jan 2022 21:49:47 -0000 On 1/28/22 09:16, Dumitru Ceara via Gcc-help wrote: > On 1/28/22 16:27, Segher Boessenkool wrote: >> On Fri, Jan 28, 2022 at 04:01:36PM +0100, Dumitru Ceara via Gcc-help wrote: >>> void *l4data = p.l4_ofs != UINT16_MAX ? (char *) p.base_ + p.l4_ofs : NULL; >>> struct hdr2 *h2 = l4data; >>> >>> memcpy(h2 + 1, &somedata[0], 6); >> >> l4data can be 0, and everything false apart from there on. >> > > In general, yes, l4data can be 0, if p.l4_ofs == UINT16_MAX, which can > only happen if pkt_bar() changed p.base_. > > But the compiler can't know that for sure and the warning makes it sound > like it's a sure thing: > > "warning: ‘memcpy’ writing 6 bytes into a region of size 0 overflows the > destination" The reason why it makes it sound like it's a sure thing is because a) the warning sees an invalid statement in the IL, and b) because GCC warnings work on one IL statement at a time and without regard to the conditions under which they might be executed. An IL statement is not always the same as a source code statement. Some statements are also isolated from the source code even when they don't appear there, or appear with different arguments. That's also what happens in this case: GCC replaces the memcpy call and the subsequent store with two: one pair is valid and another pair of invalid statements. Here's the problem in the IL the warning sees and points out: [count: 0]: memcpy (4B, &somedata[0], 6); <<< -Warray-bounds MEM[(struct hdr2 *)0B].h21 ={v} 0; <<< causes path isolation __builtin_trap (); and trap to be added Before the warning is issued GCC notices the branch of the code where l4data is null is invalid. It splits it into two branches: valid and invalid and adds a trap after the invalid sequence for good measure. The warning (which runs much later) then discovers the isolated code is invalid and points it out. The path isolation can be disabled by compiling with -fno-isolate-erroneous-paths-dereference. That also avoids the warning. This is an example where the invalid code isolation seems to work at cross purposes with the warning (or at least without coordination with it). It's also one that illustrates an inconsistency in the handling of invalid code (undefined behavior) in GCC. Some instances are eliminated (optimized away), others are replaced with a trap, others left in place but followed by a trap (the store at address zero), and others still are just left in place (the invalid memcpy at address 4). More to your point, it's also an example where warnings issued for code that's reachable only under some conditions make it seem like they point out unconditional bugs. All these are known problems and all of them should be tackled somehow. The last one (warning for conditional code) especially has been a sore point because it's in everyone's face. A number of ideas have been thrown out over the years, starting with simply rephrasing the warning to be less affirmative. E.g., instead of "memcpy writing..." print: ‘memcpy’ may write 6 bytes into a region of size 0 The problem with this suggestion is that because most instances of these warnings occur in some conditionally executed code, and because all GCC warnings are meant to be taken as possible (as opposed to definitive) indication of bugs, changing their phrasing wouldn't actually solve anything. All it would do is replace one with the other, without helping you understand why or when the bug might happen. What I think is needed here is to mention the conditions under which the invalid statement is executed. With that, we should be able to print the same context as what the static analyzer prints: warning: dereference of NULL ‘0’ [CWE-476] [-Wanalyzer-null-dereference] 43 | h2->h21 = 0; | ~~~~~~~~^~~ ‘foo’: events 1-3 | | 39 | *l4data = p.l4_ofs != UINT16_MAX ? (char *) p.base_ + p.l4_ofs : NULL; | | ^ | | | | | (1) following ‘false’ branch... |...... | 42 | py(h2 + 1, &somedata[0], 6); | | ~~~~~~ | | | | | (2) ...to here | 43 | h21 = 0; | | ~~~~~~~ | | | | | (3) dereference of NULL ‘’ Martin > > In particular [0], we know for sure that l4data will not be NULL and we > can avoid the warning with an extra assertion. It's just a bit > inconvenient I guess. > > In any case, thanks for the quick response! > > Regards, > Dumitru > > [0] > https://github.com/dceara/ovn/commit/4796a6c480d5d2e35ec2e20ed0ae23ab787fa175 >