From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id 0AD3D385771F; Thu, 27 Jul 2023 18:01:56 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 0AD3D385771F
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1690480917;
	bh=iUA+ybcwZHONo9XHCPVaLxe8B/AOul/zDskFYVUKn7I=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=XWsbDoYEyvotXhsi9j5XefZtf+Lwh1d7OntUoE7YZcS6FUNuma6eEN8CFDPC/ZRYT
	 jmQ4Jyrc1vydDo23ZUOmLmQphCfwmNiIxbhGKHkAr0eHLRvxWKdA0I7P6CsHuzykr8
	 hSY5j8/jhUwJAZiBKx+zs7vGgVBedIbHfZL2qv9A=
From: "hubicka at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/106293] [13/14 Regression] 456.hmmer at
 -Ofast -march=native regressed by 19% on zen2 and zen3 in July 2022
Date: Thu, 27 Jul 2023 18:01:54 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 13.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: hubicka at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 13.3
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-106293-4-1ZP5o66wDt@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-106293-4@http.gcc.gnu.org/bugzilla/>
References: <bug-106293-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106293
--- Comment #15 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
   if (bb_loop_depth (best_bb) == bb_loop_depth (early_bb)
       /* If result of comparsion is unknown, prefer EARLY_BB.
         Thus use !(...>=..) rather than (...<...)  */
-      && !(best_bb->count * 100 >= early_bb->count * threshold))
+      && !(best_bb->count * 100 > early_bb->count * threshold))
     return best_bb;

Comparing loop depths certainly seems odd.
If we want to test that best_bb and early_bb are in the same loop, we should
compare loop_father.  What is the benefit of comparing across loop nests?
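
To make the distinction concrete, here is a minimal sketch in plain C.  It is
a toy model, not GCC's real data structures: struct toy_loop, struct toy_bb
and all helper names are hypothetical, and only the field name loop_father is
borrowed from the discussion above.  It shows two blocks in sibling loops that
have equal loop depth but different loop fathers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a node of the loop tree.  */
struct toy_loop
{
  struct toy_loop *outer;	/* Enclosing loop; NULL for the function root.  */
};

/* Hypothetical stand-in for a basic block.  */
struct toy_bb
{
  struct toy_loop *loop_father;	/* Innermost loop containing the block.  */
};

/* Number of real loops enclosing BB; the root pseudo-loop is depth 0.  */
unsigned
toy_loop_depth (const struct toy_bb *bb)
{
  unsigned depth = 0;
  for (const struct toy_loop *l = bb->loop_father; l && l->outer; l = l->outer)
    depth++;
  return depth;
}

/* The kind of test the patched condition performs: equal nesting depth.  */
bool
same_depth (const struct toy_bb *a, const struct toy_bb *b)
{
  return toy_loop_depth (a) == toy_loop_depth (b);
}

/* The stricter test suggested above: same innermost loop.  */
bool
same_loop (const struct toy_bb *a, const struct toy_bb *b)
{
  return a->loop_father == b->loop_father;
}

/* Two sibling loops directly under the function root.  */
struct toy_loop root = { NULL };
struct toy_loop loop1 = { &root };
struct toy_loop loop2 = { &root };
struct toy_bb bb_in_loop1 = { &loop1 };
struct toy_bb bb_in_loop2 = { &loop2 };

/* Blocks in sibling loops pass the depth test but not the same-loop test.  */
void
toy_demo (void)
{
  assert (same_depth (&bb_in_loop1, &bb_in_loop2));
  assert (!same_loop (&bb_in_loop1, &bb_in_loop2));
}
```

In this model the depth comparison accepts a pair of blocks from unrelated
loop nests, which is the case the question above is about.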

Profile report here claims:
dump id |static mismat|dynamic mismatch                                     |
        |in count     |in count                  |time                      |
lsplit  |      5    +5|   8151850567  +8151850567| 531506481006       +57.9%|
ldist   |      9    +4|  15345493501  +7193642934| 606848841056       +14.2%|
ifcvt   |     10    +1|  15487514871   +142021370| 689469797790       +13.6%|
vect    |     35   +25|  17558425961  +2070911090| 517375405715       -25.0%|
cunroll |     42    +7|  16898736178   -659689783| 452445796198        -4.9%|
loopdone|     33    -9|   2678017188 -14220718990| 330969127663             |
tracer  |     34    +1|   2678018710        +1522| 330613415364        +0.0%|
fre     |     33    -1|   2676980249     -1038461| 330465677073        -0.0%|
expand  |     28    -5|   2497468467   -179511782|--------------------------|

So it looks like loop splitting, loop distribution and the vectorizer disturb
the profile significantly.
(ifcvt does so by design and the damage is undone later.)
Not sure if that is the real problem, though.