From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id 525183858407; Mon, 22 Jan 2024 12:00:19 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 525183858407
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1705924819;
	bh=cp6VZPA2PhUTE8bZPx2XBOjqSvO2EwTruaff2mqQZZ0=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=oVWC3n4YY0etPCrZqa8Oi0N8aI98x0NrrC/3C+kUOu7slSyf9o/V+5QD81dpUf6HY
	 oRMlz1YjHnibmOgp5caQ9lrCupvtMwh6P2SOGm2yciiKmkdVXGGwtFMjk/Jsx188n6
	 m2HVruqx38FI5Bnr9wjpkaD3T0R7xPjDfAwPWASc=
From: "rguenth at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/113495] RISC-V: Time and memory awful
 consumption of SPEC2017 wrf benchmark
Date: Mon, 22 Jan 2024 12:00:18 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: rtl-optimization
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: compile-time-hog, memory-hog
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-113495-4-7gPuQ3Najb@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-113495-4@http.gcc.gnu.org/bugzilla/>
References: <bug-113495-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495
--- Comment #29 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to rguenther@suse.de from comment #26)
> On Fri, 19 Jan 2024, juzhe.zhong at rivai dot ai wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495
> >
> > --- Comment #22 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
> > (In reply to Richard Biener from comment #21)
> > > I once tried to avoid df_reorganize_refs and/or optimize this with the
> > > blocks involved but failed.
> >
> > I am considering whether we should disable LICM for RISC-V by default if vector
> > is enabled?
> > A 10x compile-time explosion is really horrible.
>
> I think that's a bad idea.  It only explodes for some degenerate cases.
> The best would be to fix invariant motion to keep DF up-to-date so
> it can stop using df_analyze_loop and instead analyze the whole function.
> Or maybe change it to use the rtl-ssa framework instead.
>
> There's already param_loop_invariant_max_bbs_in_loop:
>
>   /* Process the loops, innermost first.  */
>   for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
>     {
>       curr_loop = loop;
>       /* move_single_loop_invariants for very large loops is time consuming
>          and might need a lot of memory.  For -O1 only do loop invariant
>          motion for very small loops.  */
>       unsigned max_bbs = param_loop_invariant_max_bbs_in_loop;
>       if (optimize < 2)
>         max_bbs /= 10;
>       if (loop->num_nodes <= max_bbs)
>         move_single_loop_invariants (loop);
>     }
>
> it might be possible to restrict invariant motion to innermost loops
> when the overall number of loops is too large (with a new param
> for that).  And when the number of innermost loops also exceeds
> the limit avoid even that?  The above also misses an
> optimize_loop_for_speed_p (loop) check (probably doesn't make
> a difference, but you could try).

Ah, sorry - I was confusing LICM with invariant motion above.  Still,
invariant motion is the biggest offender (that might be due to DF checking,
if you enabled it).
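
To make the restriction suggested in the quoted comment #26 concrete, here
is a rough, self-contained sketch.  It is not GCC code: ToyLoop, Limits and
select_loops are made-up names, and max_loops stands for the hypothetical
new param mentioned above.  Only the per-loop block limit mirrors the
existing param_loop_invariant_max_bbs_in_loop check.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a loop; this is NOT GCC's class loop.
struct ToyLoop {
  unsigned num_nodes;  // number of basic blocks in the loop
  bool innermost;      // true if the loop contains no other loop
};

// Illustrative limits.  max_bbs_in_loop mirrors the existing param;
// max_loops is a hypothetical new param that does not exist in GCC.
struct Limits {
  unsigned max_bbs_in_loop;
  std::size_t max_loops;
};

// Return the indices of the loops that would still get invariant motion.
std::vector<std::size_t> select_loops(const std::vector<ToyLoop>& loops,
                                      const Limits& lim, int optimize) {
  assert(optimize >= 0);

  // Same scaling as the existing check: much smaller loops only at -O1.
  unsigned max_bbs = lim.max_bbs_in_loop;
  if (optimize < 2)
    max_bbs /= 10;

  std::size_t n_innermost = 0;
  for (const ToyLoop& l : loops)
    if (l.innermost)
      ++n_innermost;

  // Too many loops overall: only consider innermost loops.
  const bool too_many = loops.size() > lim.max_loops;
  // Even the innermost loops exceed the limit: skip the pass entirely.
  if (too_many && n_innermost > lim.max_loops)
    return {};

  std::vector<std::size_t> picked;
  for (std::size_t i = 0; i < loops.size(); ++i) {
    if (too_many && !loops[i].innermost)
      continue;
    if (loops[i].num_nodes <= max_bbs)
      picked.push_back(i);
  }
  return picked;
}
```

Whether the same limit should be reused for the innermost-loop count or a
second param is needed is left open here; the sketch reuses one value only
to keep it short.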

As for sbitmap vs. bitmap it's a difficult call.  When there are big
profile hits on individual bit operations (bitmap_bit_p, bitmap_set_bit)
it might pay off to use bitmap but with tree view.  There's also
sparseset but that requires even more memory.
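
To illustrate the trade-off, here is a small stand-alone sketch.  It does
not use GCC's sbitmap or bitmap implementations: DenseBits and SparseBits
are made-up classes, and std::map merely stands in for a tree-organized
sparse bitmap.  The dense form pays memory for the whole universe but
tests a bit in constant time; the sparse form only stores words that have
bits set and pays a tree lookup per bit operation.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Dense bit vector: memory is proportional to the universe size,
// individual bit tests and sets are O(1).
class DenseBits {
 public:
  explicit DenseBits(std::size_t n_bits) : words_((n_bits + 63) / 64, 0) {}
  void set(std::size_t i) { words_[i / 64] |= std::uint64_t{1} << (i % 64); }
  bool test(std::size_t i) const { return (words_[i / 64] >> (i % 64)) & 1; }
  std::size_t words_allocated() const { return words_.size(); }

 private:
  std::vector<std::uint64_t> words_;
};

// Sparse bit set: only words containing a set bit are stored, keyed by
// word index in an ordered tree, so each bit operation costs O(log n).
class SparseBits {
 public:
  void set(std::size_t i) { words_[i / 64] |= std::uint64_t{1} << (i % 64); }
  bool test(std::size_t i) const {
    auto it = words_.find(i / 64);
    return it != words_.end() && ((it->second >> (i % 64)) & 1);
  }
  std::size_t words_allocated() const { return words_.size(); }

 private:
  std::map<std::size_t, std::uint64_t> words_;
};
```

With a universe of a million bits and only two bits set, the dense form
allocates 15625 words while the sparse form allocates two, which is the
memory side of the call; the time side depends on how often single bits
are tested.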