From: "vmakarov at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c++/114480] g++: internal compiler error: Segmentation fault signal terminated program cc1plus
Date: Wed, 27 Mar 2024 19:49:50 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114480

--- Comment #11 from Vladimir Makarov ---
My finding is that register allocation (RA) is not a problem for GCC speed with -O1 and up. RA at -O0 really does consume a big portion of GCC compile time. The biggest part of the RA time at -O0 is actually spent in life analysis.
It is difficult to implement a modest RA without life analysis, as that would result in huge stack slot generation (not knowing the pseudos' live ranges basically means allocating a stack slot for each pseudo). The problem with the test is a huge number of pseudos (or IRA objects). This results in a big sparse set (which can hardly fit in L3 cache) and bad cache behaviour. I tried to use a bitmap instead of the sparse set, but GCC crashed after allocating 48GB of memory. Sbitmap works better and improves IRA time by 12%, but it works worse for other, more frequent use cases. So I don't think that RA behaviour can be improved for this case.
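
To make the cache remark concrete, here is a minimal sketch of the classic sparse-set layout (two arrays indexed by element value). It is an illustration only, not GCC's actual sparse set code: every name below (sparse_set, sparse_set_add, flat_bitmap_bytes, and so on) is hypothetical, and the flat bit vector used for the size comparison is an assumed stand-in rather than a description of GCC's bitmap or sbitmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Sparse set over the universe 0 .. universe-1.
   Hypothetical names; a sketch, not GCC's implementation.  */
typedef struct {
  unsigned *dense;   /* members, packed at positions 0 .. members-1 */
  unsigned *sparse;  /* sparse[e] = position of e in dense, if e is a member */
  unsigned members;  /* number of valid entries in dense */
  unsigned universe; /* exclusive upper bound on element values */
} sparse_set;

/* Allocate both arrays; assumes universe > 0.  calloc is used only to keep
   this sketch free of uninitialized reads.  */
bool sparse_set_init(sparse_set *s, unsigned universe) {
  s->dense = calloc(universe, sizeof *s->dense);
  s->sparse = calloc(universe, sizeof *s->sparse);
  s->members = 0;
  s->universe = universe;
  return s->dense != NULL && s->sparse != NULL;
}

void sparse_set_free(sparse_set *s) {
  free(s->dense);
  free(s->sparse);
  s->dense = s->sparse = NULL;
  s->members = s->universe = 0;
}

/* e is a member iff its recorded position is in range and points back at e. */
bool sparse_set_contains(const sparse_set *s, unsigned e) {
  assert(e < s->universe);
  unsigned i = s->sparse[e];
  return i < s->members && s->dense[i] == e;
}

/* O(1) insertion: append to dense and record the position in sparse. */
void sparse_set_add(sparse_set *s, unsigned e) {
  if (sparse_set_contains(s, e)) return;
  s->sparse[e] = s->members;
  s->dense[s->members++] = e;
}

/* O(1) removal: move the last member into the vacated position. */
void sparse_set_remove(sparse_set *s, unsigned e) {
  if (!sparse_set_contains(s, e)) return;
  unsigned i = s->sparse[e];
  unsigned last = s->dense[--s->members];
  s->dense[i] = last;
  s->sparse[last] = i;
}

/* O(1) clear, independent of the universe size. */
void sparse_set_clear(sparse_set *s) { s->members = 0; }

/* Storage for the two index arrays: grows with the universe size,
   not with the number of members.  */
size_t sparse_set_bytes(unsigned universe) {
  return 2 * (size_t)universe * sizeof(unsigned);
}

/* Storage for an assumed flat bit vector: one bit per possible element. */
size_t flat_bitmap_bytes(unsigned universe) {
  return ((size_t)universe + 7) / 8;
}
```

In this layout the storage is two words per possible element, however few elements are actually in the set, and membership tests jump between the two arrays at positions determined by the element values. With a huge universe of pseudos or IRA objects, that is one plausible way to read the "hardly fits in L3 cache" observation above.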