From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by sourceware.org (Postfix) with ESMTPS id 825233858C5E for ; Mon, 20 Mar 2023 12:12:16 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 825233858C5E Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.de Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out1.suse.de (Postfix) with ESMTP id 3CFD221B72; Mon, 20 Mar 2023 12:12:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1679314335; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=pzYd3sA5afYGI7XtBvTA1EuYXckyiH+lKccE069l52E=; b=zZUi2VO6YrmahffpdhuVoAPjxxMCV262ap3Lq66/o370MIyfKVOqD/aya0QIx5wo0Q3M1S 2nbd6PoAyBPqcvFmPtMVP5uiQNbsijtzTotnzRbNJlN08HTcW15EgF/D9MX3kHJ/XM62kX QGp9+34+BubI213BAMweOgC3sV8RYxA= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1679314335; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=pzYd3sA5afYGI7XtBvTA1EuYXckyiH+lKccE069l52E=; b=tkm+4s8Y/J4R0+nAcyyALe5BIQqUIBtlCQmtVIPVsKEndd9GMZ1bkBcOSSxQMzUAniU0pL G+1fjSvxoLhftNAg== Received: from wotan.suse.de (wotan.suse.de [10.160.0.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by relay2.suse.de (Postfix) with ESMTPS id 14BA02C141; Mon, 20 Mar 2023 12:12:14 +0000 (UTC) Date: Mon, 20 Mar 2023 12:12:14 +0000 (UTC) From: Richard Biener To: Jakub Jelinek cc: gcc-patches@gcc.gnu.org, Jan Hubicka Subject: Re: [PATCH] tree-optimization/109170 - bogus use-after-free with __builtin_expect In-Reply-To: Message-ID: References: <20230317121833.16A961346F@imap2.suse-dmz.suse.de> User-Agent: Alpine 2.22 (LSU 394 2020-01-19) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-Spam-Status: No, score=-11.0 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Mon, 20 Mar 2023, Richard Biener wrote: > On Fri, 17 Mar 2023, Jakub Jelinek wrote: > > > On Fri, Mar 17, 2023 at 02:18:52PM +0000, Richard Biener wrote: > > > > And as you show on the testcases, it probably isn't a good idea for > > > > BUILT_IN_EXPECT* either. > > > > > > > > So, perhaps use op_cfn_pass_through_arg1 for the ERF_RETURNS_ARG functions > > > > and BUILT_IN_EXPECT* ? > > > > > > But that already causes the problems (I didn't yet finish testing > > > adding RET1 to BUILT_IN_EXPECT*). Note FRE currently does not use > > > returns_arg but I have old patches that do - but those replace > > > uses after a RET1 function with the return value to also reduce > > > spilling around a call (they of course CSE equal calls). > > > > I meant in your patch drop the builtins.cc hunk and add from your > > other patch > > > > + case CFN_BUILT_IN_EXPECT: > > > > + case CFN_BUILT_IN_EXPECT_WITH_PROBABILITY: > > > > + m_valid = true; > > > > + m_op1 = gimple_call_arg (call, 0); > > > > + m_int = &op_cfn_pass_through_arg1; > > > > + break; > > > > hunk to gimple_range_op_handler::maybe_builtin_call. > > Does that already cause the problems? > > Yes, that's basically what the first variant of the patch did and > the FAILs quoted are from testing that variant. > > There are no extra fails from the second more generic patch also > touching builtins.cc > > > I mean, if we e.g. see that a range of the argument is singleton, > > then it is fine to optimize the __builtin_expect away. > > Yes, it's somewhat odd that we handle the - case but not the +: > > @@ -204,6 +206,7 @@ > _2 = b.0_1 < 0; > # RANGE [irange] long int [0, 1] NONZERO 0x1 > _3 = (long int) _2; > + # RANGE [irange] long int [0, 1] NONZERO 0x1 > _4 = __builtin_expect (_3, 0); > if (_4 != 0) > goto ; [10.00%] > ... > @@ -224,13 +228,14 @@ > goto ; [100.00%] > > [local count: 977105059]: > - # _9 = PHI <_4(3), 0(4)> > + # RANGE [irange] long int [0, 1] NONZERO 0x1 > + # _9 = PHI <1(3), 0(4)> > if (_9 != 0) > - goto ; [10.00%] > + goto ; [50.00%] > else > - goto ; [90.00%] > + goto ; [50.00%] > > > Predictions for bb 5 > - first match heuristics: 10.00% > - combined heuristics: 10.00% > - __builtin_expect heuristics of edge 5->6: 10.00% > + no prediction heuristics: 50.00% > + combined heuristics: 50.00% > > the dumps don't get any hints on where the first matchor > __builtin_expect heuristic came from, but it looks like we > run expr_expected_value_1 on _9 != 0 which recurses for _9 > and runs into the PHI handling which then looks for > a common value into the PHI. In this case _4 is said > to be zero by PRED_BUILTIN_EXPECT and probability 9000. > But that seems wrong - it looks simply at the __builtin_expect > def here, not taking into account the edge this flows through > (the unlikely edge). > > If we look at the recorded edge predictions we instead see > > $28 = {ep_next = 0x43e4c80, ep_edge = 5)>, > ep_predictor = PRED_BUILTIN_EXPECT, ep_probability = 1000} > > so that's the same edge but unlikely now. Of course there's > no value recorded for this. For the other edge into the PHI > we get > > $31 = {ep_next = 0x43c1e90, ep_edge = 5)>, > ep_predictor = PRED_BUILTIN_EXPECT, ep_probability = 9000} > > so to me a more reasonable approach would be to record '0' from > the 2nd edge with a probability of 9000 (or for the '+' case IL > '1' with a probability of 1000). There's possibly a way to > combine predictor + value (here we'd simply take the more > probable value, or the constant for a PHI). I also see that > we handle any PHI this way, not just PHIs defined in the same > BB as the condition which op we're ultimatively interested in. > > So my conclusion is that currently it's pure luck the testcase > "works", and thus adjusting it and opening a bug with the > above findings would be appropriate? > > Honza? I filed PR109210 and updated the patch with adjustment of the two failing testcases. Bootstrapped and tested on x86_64-unknown-linux-gnu. OK? Thanks, Richard. >From aac0ed731c49f74d7f363b80745150943b75bd4c Mon Sep 17 00:00:00 2001 From: Richard Biener Date: Fri, 17 Mar 2023 13:14:49 +0100 Subject: [PATCH] tree-optimization/109170 - bogus use-after-free with __builtin_expect To: gcc-patches@gcc.gnu.org The following adds a missing range-op for __builtin_expect which helps -Wuse-after-free to detect the case a realloc original pointer is used when the result was NULL. The implementation should handle all argument one pass-through builtins we handle in the fnspec machinery. The gcc.dg/tree-ssa/ssa-lim-21.c testcase needs adjustment because for (int j = 0; j < m; j++) if (__builtin_expect (m, 0)) for (int i = 0; i < m; i++) is now correctly optimized to a unconditional jump by EVRP - m cannot be zero when the outer loop is entered. I've adjusted the outer loop to iterate 'n' times which makes us apply store-motion to 'count' and 'q->data1' but only out of the inner loop and as expected not apply store motion to 'q->data' at all. The gcc.dg/predict-20.c testcase relies on broken behavior of profile estimation when trying to handle __builtin_expect values flowing into PHI nodes. I have opened PR109210 and removed the expected matching from the testcase. PR tree-optimization/109170 * gimple-range-op.cc (cfn_pass_through_arg1): New. (gimple_range_op_handler::maybe_builtin_call): Handle __builtin_expect and similar via cfn_pass_through_arg1 and inspecting the calls fnspec. * builtins.cc (builtin_fnspec): Handle BUILT_IN_EXPECT and BUILT_IN_EXPECT_WITH_PROBABILITY. * gcc.dg/Wuse-after-free-pr109170.c: New testcase. * gcc.dg/tree-ssa/ssa-lim-21.c: Adjust. * gcc.dg/predict-20.c: Likewise. --- gcc/builtins.cc | 2 ++ gcc/gimple-range-op.cc | 32 ++++++++++++++++++- .../gcc.dg/Wuse-after-free-pr109170.c | 15 +++++++++ gcc/testsuite/gcc.dg/predict-20.c | 3 +- gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-21.c | 7 ++-- 5 files changed, 54 insertions(+), 5 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/Wuse-after-free-pr109170.c diff --git a/gcc/builtins.cc b/gcc/builtins.cc index 90246e214d6..56545027297 100644 --- a/gcc/builtins.cc +++ b/gcc/builtins.cc @@ -11715,6 +11715,8 @@ builtin_fnspec (tree callee) case BUILT_IN_RETURN_ADDRESS: return ".c"; case BUILT_IN_ASSUME_ALIGNED: + case BUILT_IN_EXPECT: + case BUILT_IN_EXPECT_WITH_PROBABILITY: return "1cX "; /* But posix_memalign stores a pointer into the memory pointed to by its first argument. */ diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc index a5d625387e7..1a00f1690e5 100644 --- a/gcc/gimple-range-op.cc +++ b/gcc/gimple-range-op.cc @@ -43,6 +43,7 @@ along with GCC; see the file COPYING3. If not see #include "range.h" #include "value-query.h" #include "gimple-range.h" +#include "attr-fnspec.h" // Given stmt S, fill VEC, up to VEC_SIZE elements, with relevant ssa-names // on the statement. For efficiency, it is an error to not pass in enough @@ -309,6 +310,26 @@ public: } } op_cfn_constant_p; +// Implement range operator for integral/pointer functions returning +// the first argument. +class cfn_pass_through_arg1 : public range_operator +{ +public: + using range_operator::fold_range; + virtual bool fold_range (irange &r, tree, const irange &lh, + const irange &, relation_trio) const + { + r = lh; + return true; + } + virtual bool op1_range (irange &r, tree, const irange &lhs, + const irange &, relation_trio) const + { + r = lhs; + return true; + } +} op_cfn_pass_through_arg1; + // Implement range operator for CFN_BUILT_IN_SIGNBIT. class cfn_signbit : public range_operator_float { @@ -967,6 +988,15 @@ gimple_range_op_handler::maybe_builtin_call () break; default: - break; + { + unsigned arg; + if (gimple_call_fnspec (call).returns_arg (&arg) && arg == 0) + { + m_valid = true; + m_op1 = gimple_call_arg (call, 0); + m_int = &op_cfn_pass_through_arg1; + } + break; + } } } diff --git a/gcc/testsuite/gcc.dg/Wuse-after-free-pr109170.c b/gcc/testsuite/gcc.dg/Wuse-after-free-pr109170.c new file mode 100644 index 00000000000..fa7dc66d66c --- /dev/null +++ b/gcc/testsuite/gcc.dg/Wuse-after-free-pr109170.c @@ -0,0 +1,15 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -Wuse-after-free" } */ + +unsigned long bufmax = 0; +unsigned long __open_catalog_bufmax; +void *realloc(void *, __SIZE_TYPE__); +void free(void *); + +void __open_catalog(char *buf) +{ + char *old_buf = buf; + buf = realloc (buf, bufmax); + if (__builtin_expect ((buf == ((void *)0)), 0)) + free (old_buf); /* { dg-bogus "used after" } */ +} diff --git a/gcc/testsuite/gcc.dg/predict-20.c b/gcc/testsuite/gcc.dg/predict-20.c index 31d01835b80..7bb0d411f88 100644 --- a/gcc/testsuite/gcc.dg/predict-20.c +++ b/gcc/testsuite/gcc.dg/predict-20.c @@ -16,8 +16,9 @@ c () break; } int d = b < 0; + /* We fail to apply __builtin_expect heuristics here. Se PR109210. */ if (__builtin_expect (d, 0)) asm(""); } -/* { dg-final { scan-tree-dump-times "__builtin_expect heuristics of edge" 3 "profile_estimate"} } */ +/* { dg-final { scan-tree-dump-times "__builtin_expect heuristics of edge" 2 "profile_estimate" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-21.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-21.c index ffe6f8f699d..fe29e841f28 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-21.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-21.c @@ -17,7 +17,7 @@ void func (int m, int n, int k, struct obj *a) { struct obj *q = a; - for (int j = 0; j < m; j++) + for (int j = 0; j < n; j++) if (__builtin_expect (m, 0)) for (int i = 0; i < m; i++) { @@ -31,5 +31,6 @@ func (int m, int n, int k, struct obj *a) } } -/* { dg-final { scan-tree-dump-not "Executing store motion of" "lim2" } } */ - +/* { dg-final { scan-tree-dump "Executing store motion of count from loop 2" "lim2" } } */ +/* { dg-final { scan-tree-dump "Executing store motion of \[^ \]*data1 from loop 2" "lim2" } } */ +/* { dg-final { scan-tree-dump-times "Executing store motion of" 2 "lim2" } } */ -- 2.35.3