From mboxrd@z Thu Jan  1 00:00:00 1970
From: "amacleod at redhat dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/102943] [12 Regression] Jump threader compile-time hog with 521.wrf_r
Date: Thu, 10 Mar 2022 14:01:20 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102943

--- Comment #41 from Andrew Macleod ---
>
> so it's still by far jump-threading/VRP dominating compile-times (I wonder
> if we should separate "old" and "new" [E]VRP timevars).  Given that VRP
> shows up as well it's more likely the underlying ranger infrastructure?

Yeah, I'd be tempted to just label them vrp1 (evrp), vrp2 (current vrp1) and
vrp3 (current vrp2) and track them separately.  I have noticed significant
behaviour differences between the code we see at VRP2 time and the code we
see at EVRP time.

>
> perf thrown on ltrans22 shows
>
> Samples: 302K of event 'cycles', Event count (approx.): 331301505627
>
>   Overhead  Samples  Command      Shared Object  Symbol
>    10.34%     31299  lto1-ltrans  lto1           [.] bitmap_get_aligned_chunk
>     7.44%     22540  lto1-ltrans  lto1           [.] bitmap_bit_p
>     3.17%      9593  lto1-ltrans  lto1           [.]
>
> callgraph info in perf is a mixed bag, but maybe it helps to pinpoint things:
>
> -   10.20%  10.18%  30364  lto1-ltrans  lto1  [.] bitmap_get_aligned_chunk
>    - 10.18% 0xffffffffffffffff
>       + 9.16% ranger_cache::propagate_cache
>       + 1.01% ranger_cache::fill_block_cache
>

I am currently looking at reworking the cache again so that the propagation
is limited only to actual changes.  It can still get out of hand in massive
CFGs, and that is already using the sparse representation.  There may be
some minor tweaks that can make a big difference here; I'll have a look over
the next couple of days.

It's probably safe to assume the threading performance is directly related
to this as well.
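
[Editorial note: the following is a minimal sketch of the kind of
change-limited cache propagation described above, assuming a plain worklist
over basic blocks.  All names and types here are illustrative only, not
ranger's actual cache code or API.]

  // Sketch: propagate cached ranges across a CFG, but only re-queue a
  // block's successors when the value cached for that block actually
  // changes, so quiescent regions of a large CFG are never re-walked.
  #include <vector>
  #include <deque>
  #include <algorithm>
  #include <cstdio>

  struct simple_range
  {
    long lo, hi;
    bool operator== (const simple_range &o) const
      { return lo == o.lo && hi == o.hi; }
  };

  // Union of two ranges, standing in for whatever merge the cache performs.
  static simple_range
  range_union (const simple_range &a, const simple_range &b)
  {
    return { std::min (a.lo, b.lo), std::max (a.hi, b.hi) };
  }

  struct block
  {
    std::vector<int> preds, succs;
    simple_range cache;
  };

  static void
  propagate_cache (std::vector<block> &cfg, int start_bb)
  {
    std::deque<int> worklist;
    std::vector<bool> on_list (cfg.size (), false);
    worklist.push_back (start_bb);
    on_list[start_bb] = true;

    while (!worklist.empty ())
      {
        int bb = worklist.front ();
        worklist.pop_front ();
        on_list[bb] = false;

        // Recompute this block's cached range from its predecessors.
        simple_range merged = cfg[bb].cache;
        for (int pred : cfg[bb].preds)
          merged = range_union (merged, cfg[pred].cache);

        // Only propagate further if the cached value actually changed.
        if (merged == cfg[bb].cache)
          continue;
        cfg[bb].cache = merged;

        for (int succ : cfg[bb].succs)
          if (!on_list[succ])
            {
              worklist.push_back (succ);
              on_list[succ] = true;
            }
      }
  }

  int
  main ()
  {
    // Tiny diamond CFG: 0 -> {1,2} -> 3.
    std::vector<block> cfg (4);
    cfg[0] = { {}, {1, 2}, { 0, 10 } };
    cfg[1] = { {0}, {3}, { 5, 5 } };
    cfg[2] = { {0}, {3}, { 7, 7 } };
    cfg[3] = { {1, 2}, {}, { 9, 9 } };

    propagate_cache (cfg, 1);
    printf ("bb3 cache: [%ld, %ld]\n", cfg[3].cache.lo, cfg[3].cache.hi);
    return 0;
  }

The point of the sketch is the early "continue": a block whose cached value
did not change never pushes its successors back onto the worklist, which is
what keeps the cost proportional to the number of real changes rather than
to the size of the CFG.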