From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id 50CAB385C6E0; Fri, 28 Jul 2023 07:22:22 +0000 (GMT)
From: "rguenther at suse dot de" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/106293] [13/14 Regression] 456.hmmer at
 -Ofast -march=native regressed by 19% on zen2 and zen3 in July 2022
Date: Fri, 28 Jul 2023 07:22:19 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 13.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenther at suse dot de
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 13.3
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-106293-4-UdlbHEaNfr@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-106293-4@http.gcc.gnu.org/bugzilla/>
References: <bug-106293-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106293
--- Comment #17 from rguenther at suse dot de <rguenther at suse dot de> ---
On Thu, 27 Jul 2023, hubicka at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106293
>
> --- Comment #15 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
>    if (bb_loop_depth (best_bb) == bb_loop_depth (early_bb)
>        /* If result of comparsion is unknown, prefer EARLY_BB.
>          Thus use !(...>=..) rather than (...<...)  */
> -      && !(best_bb->count * 100 >= early_bb->count * threshold))
> +      && !(best_bb->count * 100 > early_bb->count * threshold))
>      return best_bb;
>
> Comparing loop depths seems certainly odd.
> If we want to test best_bb and early_bb to be in same loop, we want to test
> loop_father.  What is a benefit of testing across loop nests?

This heuristic wants to catch

  <sink stmt>
  if (foo) abort ();
  <place to sink>

and avoid sinking "too far" across a path with "similar enough"
execution count (I think the original motivation was to fix some
spilling / register pressure issue).  The loop depth test
should be !(bb_loop_depth (best_bb) < bb_loop_depth (early_bb))
so that we don't limit sinking to a more outer nest.  As we rule
out > before, this becomes ==.
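
To make the shape of that heuristic concrete, here is a minimal
standalone sketch.  It is not the code from tree-ssa-sink: struct toy_bb
and toy_select_bb are hypothetical names, counts are plain integers
assumed to be known, and both the unconditional "outer nest wins" case
and the fall back to early_bb are my reading of the description above.

```c
#include <stdint.h>

/* Hypothetical stand-in for a basic block; not GCC's basic_block.  */
struct toy_bb
{
  int loop_depth;   /* loop nesting depth of the block */
  uint64_t count;   /* execution count, assumed known in this sketch */
};

/* Pick the block to sink to.  THRESHOLD is a percentage: BEST_BB is
   accepted at the same depth only if its count is below THRESHOLD
   percent of EARLY_BB's count.  Assumes BEST_BB is never in a deeper
   loop than EARLY_BB (that case is ruled out by the caller).  */
static const struct toy_bb *
toy_select_bb (const struct toy_bb *early_bb,
               const struct toy_bb *best_bb,
               unsigned threshold)
{
  /* Sinking to a more outer loop nest is never limited.  */
  if (best_bb->loop_depth < early_bb->loop_depth)
    return best_bb;

  /* Same depth: sink only when BEST_BB is clearly colder, so we do
     not move across a path with a "similar enough" count.  */
  if (!(best_bb->count * 100 >= early_bb->count * threshold))
    return best_bb;

  /* Otherwise leave the statement where it is.  */
  return early_bb;
}
```

With a threshold of 75, a same-depth candidate with count 10 against an
early_bb count of 100 is accepted, while one with count 90 is rejected;
the "if (foo) abort ();" case above is of the second kind.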

It looks tempting to sink to the earliest place with the same
execution count rather than the latest, but the above doesn't
really achieve that (it doesn't look "upwards" but simply fails).
With a guessed profile it's also going to be hard.

And it in no way implements register pressure / spilling sensitivity
(see also Ajit's attempts at producing a patch that avoids sinking
across a call).  All these are ultimately doomed unless we at least
consider a group of stmts together.