From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=nuMU=7M=suse.de=rguenther@sourceware.org>
Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29])
	by sourceware.org (Postfix) with ESMTPS id 32FB53858C2B
	for <gcc-patches@gcc.gnu.org>; Mon, 20 Mar 2023 08:21:45 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 32FB53858C2B
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=suse.de
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.de
Received: from relay2.suse.de (relay2.suse.de [149.44.160.134])
	by smtp-out2.suse.de (Postfix) with ESMTP id 323ED1F88C;
	Mon, 20 Mar 2023 08:21:43 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa;
	t=1679300503; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc:
	 mime-version:mime-version:content-type:content-type:
	 in-reply-to:in-reply-to:references:references;
	bh=+H1TTaQk7Ctm71Khu54GsbINnsioEQn1glLHzcM+3sc=;
	b=Qqd4Rvoto5kf+s4RI5VsFO8UG/A/FQ7fFqcqSjxxb4/p4tLzNrkz6yd2d7xt9YY8F+WaNn
	Y3DfvYHy1bW6mW/UhO21RKtZ+hheJL8s318kmzONMj5J5PB7/PfhPgNTmYwjMCwbjagZIj
	Yqo+TsnxrloruK/oHgyzv/6XqB4aLuk=
DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de;
	s=susede2_ed25519; t=1679300503;
	h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc:
	 mime-version:mime-version:content-type:content-type:
	 in-reply-to:in-reply-to:references:references;
	bh=+H1TTaQk7Ctm71Khu54GsbINnsioEQn1glLHzcM+3sc=;
	b=SliQmwGHKAED6GLL+9g6pZNdugjCtoi36mTNOGWSILZd3KCc2AUD+h9FqPoEpoYOwxWz2R
	pYu+5pBJlzuB4YCw==
Received: from wotan.suse.de (wotan.suse.de [10.160.0.1])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by relay2.suse.de (Postfix) with ESMTPS id 299A72C141;
	Mon, 20 Mar 2023 08:21:42 +0000 (UTC)
Date: Mon, 20 Mar 2023 08:21:42 +0000 (UTC)
From: Richard Biener <rguenther@suse.de>
To: Jakub Jelinek <jakub@redhat.com>
cc: gcc-patches@gcc.gnu.org, Jan Hubicka <hubicka@ucw.cz>
Subject: Re: [PATCH] tree-optimization/109170 - bogus use-after-free with
 __builtin_expect
In-Reply-To: <ZBR+ugrk6d2BLRWZ@tucnak>
Message-ID: <nycvar.YFH.7.77.849.2303200729540.18795@jbgna.fhfr.qr>
References: <20230317121833.16A961346F@imap2.suse-dmz.suse.de> <ZBRghPZG+wB1Xf6e@tucnak> <nycvar.YFH.7.77.849.2303171249341.18795@jbgna.fhfr.qr> <ZBRkFuZap8JIDMaG@tucnak> <nycvar.YFH.7.77.849.2303171328410.18795@jbgna.fhfr.qr> <ZBRzQljPnrVxbOQQ@tucnak>
 <nycvar.YFH.7.77.849.2303171405380.18795@jbgna.fhfr.qr> <ZBR+ugrk6d2BLRWZ@tucnak>
User-Agent: Alpine 2.22 (LSU 394 2020-01-19)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
X-Spam-Status: No, score=-5.0 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <gcc-patches.gcc.gnu.org>

On Fri, 17 Mar 2023, Jakub Jelinek wrote:

> On Fri, Mar 17, 2023 at 02:18:52PM +0000, Richard Biener wrote:
> > > And as you show on the testcases, it probably isn't a good idea for
> > > BUILT_IN_EXPECT* either.
> > > 
> > > So, perhaps use op_cfn_pass_through_arg1 for the ERF_RETURNS_ARG functions
> > > and BUILT_IN_EXPECT* ?
> > 
> > But that already causes the problems (I didn't yet finish testing
> > adding RET1 to BUILT_IN_EXPECT*).  Note FRE currently does not use
> > returns_arg but I have old patches that do - but those replace
> > uses after a RET1 function with the return value to also reduce
> > spilling around a call (they of course CSE equal calls).
> 
> I meant in your patch drop the builtins.cc hunk and add from your
> other patch
> > > +    case CFN_BUILT_IN_EXPECT:
> > > +    case CFN_BUILT_IN_EXPECT_WITH_PROBABILITY:
> > > +      m_valid = true;
> > > +      m_op1 = gimple_call_arg (call, 0);
> > > +      m_int = &op_cfn_pass_through_arg1;
> > > +      break;
> 
> hunk to gimple_range_op_handler::maybe_builtin_call.
> Does that already cause the problems?

Yes, that's basically what the first variant of the patch did and
the FAILs quoted are from testing that variant.

There are no extra fails from the second more generic patch also
touching builtins.cc

> I mean, if we e.g. see that a range of the argument is singleton,
> then it is fine to optimize the __builtin_expect away.

Yes, it's somewhat odd that we handle the - case but not the +:

@@ -204,6 +206,7 @@
   _2 = b.0_1 < 0;
   # RANGE [irange] long int [0, 1] NONZERO 0x1
   _3 = (long int) _2;
+  # RANGE [irange] long int [0, 1] NONZERO 0x1
   _4 = __builtin_expect (_3, 0);
   if (_4 != 0)
     goto <bb 5>; [10.00%]
...
@@ -224,13 +228,14 @@
   goto <bb 3>; [100.00%]
 
   <bb 5> [local count: 977105059]:
-  # _9 = PHI <_4(3), 0(4)>
+  # RANGE [irange] long int [0, 1] NONZERO 0x1
+  # _9 = PHI <1(3), 0(4)>
   if (_9 != 0)
-    goto <bb 6>; [10.00%]
+    goto <bb 6>; [50.00%]
   else
-    goto <bb 7>; [90.00%]
+    goto <bb 7>; [50.00%]


 Predictions for bb 5
-  first match heuristics: 10.00%
-  combined heuristics: 10.00%
-  __builtin_expect heuristics of edge 5->6: 10.00%
+  no prediction heuristics: 50.00%
+  combined heuristics: 50.00%

the dumps don't get any hints on where the first matchor
__builtin_expect heuristic came from, but it looks like we
run expr_expected_value_1 on _9 != 0 which recurses for _9
and runs into the PHI handling which then looks for
a common value into the PHI.  In this case _4 is said
to be zero by PRED_BUILTIN_EXPECT and probability 9000.
But that seems wrong - it looks simply at the __builtin_expect
def here, not taking into account the edge this flows through
(the unlikely edge).

If we look at the recorded edge predictions we instead see

$28 = {ep_next = 0x43e4c80, ep_edge = <edge 0x7ffff67e0f00 (3 -> 5)>, 
  ep_predictor = PRED_BUILTIN_EXPECT, ep_probability = 1000}

so that's the same edge but unlikely now.  Of course there's
no value recorded for this.  For the other edge into the PHI
we get

$31 = {ep_next = 0x43c1e90, ep_edge = <edge 0x7ffff67e0f60 (4 -> 5)>, 
  ep_predictor = PRED_BUILTIN_EXPECT, ep_probability = 9000}

so to me a more reasonable approach would be to record '0' from
the 2nd edge with a probability of 9000 (or for the '+' case IL
'1' with a probability of 1000).  There's possibly a way to
combine predictor + value (here we'd simply take the more
probable value, or the constant for a PHI).  I also see that
we handle any PHI this way, not just PHIs defined in the same
BB as the condition which op we're ultimatively interested in.

So my conclusion is that currently it's pure luck the testcase
"works", and thus adjusting it and opening a bug with the
above findings would be appropriate?

Honza?

As for the LIM testcase, I'll try massaging it further, possibly
turning it into a GIMPLE testcase with fixed counts will make it
work.

Richard.