From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/109491] [11/12 Regression] Segfault in tree-ssa-sccvn.cc:expressions_equal_p()
Date: Fri, 14 Apr 2023 07:17:23 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109491

--- Comment #13 from Richard Biener ---
(In reply to Chip Kerchner from comment #12)
> > having always_inline across a deep call stack can exponentially increase compile-time
>
> Do you think it would be worth requesting a feature to reduce the
> compilation times in situations like this? Ideally exponentially is not a
> good thing.
Well, suppose you have

static __attribute__((always_inline)) inline void large_leaf () { /* large */ }
static __attribute__((always_inline)) inline void inter1 () { large_leaf (); }
static __attribute__((always_inline)) inline void inter2 () { inter1 (); inter1 (); }
static __attribute__((always_inline)) inline void inter3 () { inter2 (); inter2 (); }
void final () { inter3 (); inter3 (); }

then of course you end up with 8 copies of large_leaf in 'final' (you asked
for it). Implementation-wise it gets worse, because we also fully materialize
the intermediates inter1, inter2 and inter3 with one, two and four copies.
That's "only" double the work, but if it's single call chains the relative
overhead is larger. There are specific cases where we could do better, and
IIRC some intermediate updating of the costs blows up here as well (we build
a "fat" callgraph with inlined edges and inlined node clones). In the end it
requires somebody to sit down and see where to improve things algorithmically
- eventually eschewing the simple topological processing of all inline
candidates in favor of first resolving always-inlines in the most optimal
way, taking advantage of the fact that in principle we do not need their
bodies anymore.

I wasn't able to find a bug tracking this very specific issue, so I have
opened PR109509 for it.