From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
 id 7287E385802B; Mon, 29 Nov 2021 14:22:42 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 7287E385802B
From: "aldyh at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/103409] [12 Regression] 18% SPEC2017 WRF
 compile-time regression with -O2 -flto since
 r12-5228-gb7a23949b0dcc4205fcc2be6b84b91441faa384d
Date: Mon, 29 Nov 2021 14:22:42 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: compile-time-hog
X-Bugzilla-Severity: normal
X-Bugzilla-Who: aldyh at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 12.0
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-103409-4-DzwIZu09Kx@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-103409-4@http.gcc.gnu.org/bugzilla/>
References: <bug-103409-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list <gcc-bugs.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Mon, 29 Nov 2021 14:22:42 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103409
--- Comment #9 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
There's definitely something in the threader, but I'm not sure it's the cause
of all the regression.

For the record, I've reproduced on ppc64le with a spec .cfg file having:

OPTIMIZE    = -O2 -flto=100 -save-temps -ftime-report -v -fno-checking

The slow wrf_r.ltransNN.o files that dominate the compilation, each taking
more than 2-3 seconds, are 42, 76, and 24.  I've distilled the -ftime-report
output for VRP and jump threading, which usually go hand in hand now that VRP2
runs with ranger:

dumping.42: tree VRP                           :  13.70 (  3%)   0.08 (  2%)  13.73 (  3%)    45M (  4%)
dumping.42: backwards jump threading           :  26.68 (  5%)   0.00 (  0%)  26.72 (  5%)  3609k (  0%)
dumping.42: TOTAL                              : 524.00          3.31        527.30         1277M
dumping.76: tree VRP                           :  38.30 ( 13%)   0.03 (  2%)  38.31 ( 13%)    19M (  2%)
dumping.76: backwards jump threading           :  47.38 ( 17%)   0.01 (  1%)  47.37 ( 16%)  1671k (  0%)
dumping.76: TOTAL                              : 286.03          1.79        287.82         1173M
dumping.24: tree VRP                           :  87.43 (  8%)   0.07 (  2%)  87.53 (  8%)    58M (  3%)
dumping.24: backwards jump threading           : 129.81 ( 12%)   0.00 (  0%) 129.81 ( 12%)  8986k (  0%)
dumping.24: TOTAL                              :1042.37          3.58       1045.93         2325M
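
For anyone repeating the measurement, rows like these can be split into their
fields with a short script.  This is only a rough sketch, not the method used
for this comment: it assumes the usual -ftime-report column order (usr, sys,
wall, GGC), and the names ROW and parse_row are made up for illustration.

```python
import re

# Assumed row layout: "<name> : usr (p%) sys (p%) wall (p%) GGC (p%)".
# TOTAL rows carry no percentages, so every "(p%)" group is optional.
ROW = re.compile(
    r"^(?P<name>.*?)\s*:\s*"
    r"(?P<usr>\d+\.\d+)(?:\s*\(\s*\d+%\))?\s+"
    r"(?P<sys>\d+\.\d+)(?:\s*\(\s*\d+%\))?\s+"
    r"(?P<wall>\d+\.\d+)(?:\s*\(\s*\d+%\))?\s+"
    r"(?P<ggc>\d+[kMG]?)"
)

def parse_row(line):
    """Return the fields of one time-report row, or None if it does not match."""
    m = ROW.match(line)
    if m is None:
        return None
    return {
        "name": m["name"].strip(),
        "usr": float(m["usr"]),    # seconds
        "sys": float(m["sys"]),    # seconds
        "wall": float(m["wall"]),  # seconds
        "ggc": m["ggc"],           # memory, kept as text (e.g. "45M")
    }

sample = ("dumping.42: backwards jump threading           :"
          "  26.68 (  5%)   0.00 (  0%)  26.72 (  5%)  3609k (  0%)")
print(parse_row(sample))
```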

Threading is usually more expensive than VRP because it tries candidates over
and over, but it's not meant to be orders of magnitude slower.  Prior to the
bisected patch in r12-5228, we had:

dumping.42: tree VRP                           :  14.58 (  3%)   0.07 (  2%)  14.62 (  3%)    45M (  4%)
dumping.42: backwards jump threading           :  13.88 (  3%)   0.00 (  0%)  13.89 (  3%)  3609k (  0%)
dumping.42: TOTAL                              : 484.12          3.06        487.18         1277M
dumping.76: tree VRP                           :  37.68 ( 13%)   0.04 (  2%)  37.79 ( 13%)    19M (  2%)
dumping.76: backwards jump threading           :  45.50 ( 15%)   0.03 (  2%)  45.52 ( 15%)  1671k (  0%)
dumping.76: TOTAL                              : 293.74          1.81        295.55         1173M
dumping.24: tree VRP                           :  94.27 (  9%)   0.11 (  3%)  94.39 (  9%)    58M (  3%)
dumping.24: backwards jump threading           : 102.63 ( 10%)   0.02 (  0%) 102.67 ( 10%)  8986k (  0%)
dumping.24: TOTAL                              :1021.66          4.28       1025.92         2325M

So at least for ltrans42, there's a big slowdown with this patch.  Before,
threading was 4.80% faster than VRP, whereas now it's 94.7% slower.
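
Both percentages come from the first (usr) time column for ltrans42, with the
VRP time as the baseline in each case.  A minimal sketch of that arithmetic
(the helper name rel_diff is only for illustration):

```python
def rel_diff(threading, vrp):
    """Threading time relative to VRP time, in percent (negative = faster)."""
    return (threading - vrp) / vrp * 100.0

# usr seconds for ltrans42, copied from the two -ftime-report excerpts
before = rel_diff(threading=13.88, vrp=14.58)  # prior to r12-5228
after = rel_diff(threading=26.68, vrp=13.70)   # with r12-5228

print(f"before: {before:+.2f}%")  # -4.80%
print(f"after:  {after:+.1f}%")   # +94.7%
```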

I have a patch for the above slowdown, but I wouldn't characterize the above
difference as a "compile hog".  When I add up the 3 ltrans unit totals (which
are basically the entire compilation), the difference is a 3% slowdown.
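
The 3% figure can be checked by summing the usr-time TOTAL rows of the three
units.  This is a quick sanity check on the numbers quoted above, not a proper
benchmark:

```python
# usr-time TOTAL rows (seconds) for ltrans units 42, 76, and 24
with_patch = [524.00, 286.03, 1042.37]
without_patch = [484.12, 293.74, 1021.66]

slowdown = (sum(with_patch) / sum(without_patch) - 1.0) * 100.0
print(f"combined slowdown: {slowdown:.1f}%")  # 2.9%
```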

If this PR is for a larger than 3-4% slowdown, I think we should look
elsewhere.  I could be wrong though ;-).