From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/109491] [11/12 Regression] Segfault in tree-ssa-sccvn.cc:expressions_equal_p()
Date: Fri, 14 Apr 2023 07:17:23 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109491

--- Comment #13 from Richard Biener ---
(In reply to Chip Kerchner from comment #12)
> > having always_inline across a deep call stack can exponentially increase compile-time
>
> Do you think it would be worth requesting a feature to reduce the
> compilation times in situations like this? Ideally exponentially is not a
> good thing.
Well, suppose you have

static __attribute__((always_inline)) inline void large_leaf () { /* large */ }
static __attribute__((always_inline)) inline void inter1 () { large_leaf (); }
static __attribute__((always_inline)) inline void inter2 () { inter1 (); inter1 (); }
static __attribute__((always_inline)) inline void inter3 () { inter2 (); inter2 (); }
void final () { inter3 (); inter3 (); }

then of course you end up with 8 copies of large_leaf in 'final' (you asked
for it). Implementation-wise it gets worse, because we also fully materialize
the intermediates inter1, inter2 and inter3 with one, two and four copies.
That's "only" double the work, but if it's single call chains the relative
overhead is larger. There are specific cases where we could do better, and
IIRC some intermediate updating of the costs blows up here as well (we build
a "fat" callgraph with inlined edges and inlined node clones). In the end it
requires somebody to sit down and see where to improve things algorithmically
- eventually eschewing the simple topological processing of all inline
candidates in favor of first resolving always-inlines in the most optimal
way, taking advantage of the fact that in principle we do not need their
bodies anymore.

I wasn't able to find a bug tracking this very specific issue, so I have
opened PR109509 for it.