From: "hubicka at ucw dot cz"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug ipa/114531] Feature proposal for an `-finline-functions-aggressive` compiler option
Date: Tue, 25 Jun 2024 22:25:30 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: ipa
X-Bugzilla-Version: 14.0
X-Bugzilla-Severity: enhancement
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531

--- Comment #18 from Jan Hubicka ---
> different issue from the one that is raised in the PR. (Unless we think that
> -O2 and -O3 should always have the same inlining heuristics henceforward, but
> that seems unlikely.)

Yes, I think the point of -O3 is to let the compiler be more aggressive than
what seems desirable for average distro build defaults (which need to balance
speed and size).
>
> At the moment, -O3 is essentially -O2 + some -f options + some --param options.
> Users who want to pick & choose some of the -f options can do so, and can add
> them to stable build systems. Normally, obsolete -f options are turned into
> no-ops rather than removed. But users can't pick & choose the --params, and
> add them to stable build systems, because we reserve the right to remove
> --params without warning.

Moreover, those --params slowly change their meaning over time. I need to
retune the inliner whenever early inlining gets smarter.

>
> So IMO, we should have an -f option that represents “the inlining parameters
> enabled by -O3”, whatever they happen to be for a given release. It's OK if
> the set is empty.
>
> For such a change, it doesn't really matter whether the current --params are
> the right ones. It just matters that the --params are the ones that we
> currently use. If the --params are changed later, the -f option and -O3 will
> automatically stay in sync.

I am trying to understand how useful this is. I am basically worried about
two things:

1) We have other optimization passes that behave differently at -O2 and -O3
(vectorizer, unrolling etc.) and I think we may want to have more. We also
have -Os and -O1. So perhaps we want a more systematic solution. We already
have -fvect-cost-model, which is kind of the vectorizer version of the
proposed inliner option.

2) The inliner is already quite painful to tune, especially since one really
needs to benchmark packages significantly bigger than SPEC, which tend to be
a bit hard to set up and benchmark meaningfully. I usually do at least
Firefox and clang, where the first is always quite some work to get working
well with the latest GCC. On SUSE's LNT we also run "C++ benchmarks", which
were initially collected as a kind of inliner test with higher abstraction
penalty (tramp3d etc.).

For many years I benchmarked primarily -O3 and -O3 + profile feedback on
x86-64 only, with an occasional look at -O2 and -Os behaviour, which were
generally more stable. I also tested other targets (power and aarch64), but
just sporadically, which is not good. After GCC 5 I doubled the testing to
include both LTO and non-LTO variants. Since GCC 10, -O2 started to evolve
and needed re-testing too (LTO/non-LTO). One metric I know I ought to tune is
-O2 -flto and FDO, which used to be essentially -O3 before the optimization
level --params were introduced, but now -O2 + FDO inlining is more
conservative, which hurts, for example, profiledbootstrapped GCC.

So naturally I am a bit worried about introducing even more combinations that
need testing and maintenance. If we add a user-friendly way to tweak this, we
also make a promise to keep it sane.