From: "erich.keane at intel dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c++/94960] extern template prevents inlining of standard library objects
Date: Wed, 06 May 2020 13:01:28 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: c++
X-Bugzilla-Version: 9.1.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P3

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94960

--- Comment #6 from Erich Keane ---
(In reply to Jonathan Wakely from comment #5)
> (In reply to Erich Keane from comment #3)
> > As you know, "extern template" is a hint to the compiler that we don't need
> > to emit the template as a way
> > to save on compile time.
> >
> > Both GCC and clang will NOT instantiate these templates in O0 mode.
> > However, in O1+ modes, both will actually still instantiate the templates in
> > the frontend, BUT only for 'inline' functions. Basically, we're using
> > 'inline' as a heuristic that there is benefit in sending these functions to
> > the optimizer (basically, sacrificing the compile time gained by 'extern
> > template' in exchange for a better inlining experience).
>
> Hmm, I've seen different behaviours for clang and g++ in this respect, with
> clang inlining a lot more of std::string's members. So I'm surprised they
> use the same heuristic.
>
> Do they both instantiate the function templates marked 'inline' even at -O1?
> Presumably not at -O0.

My understanding of Clang is based on a brief debugging session. My
understanding of GCC's behavior here is based on a brief amount of time messing
around on godbolt. I could very well be incorrect.

>
> > In the submitter's case, the std::string constructor calls "_M_construct".
> > The constructor is inlined, but _M_construct is not, since it never gets to
> > the optimizer.
> >
> > libc++ uses an __init function to do the same thing as _M_construct, however
> > IT is marked inline, and thus doesn't have the problem.
> >
> > I believe the submitter wants to have you mark more of the functions in
> > extern-templated classes 'inline' so that it matches the heuristic better.
>
> And that's what I don't want to do. I think it's wrong for the human to say
> "inline this!" because humans are stupid (well, I am anyway). And I don't
> want to have to examine the GIMPLE/asm again for every new GCC release to
> decide whether 'inline' is still in the right places (and whether the answer
> should be different for every different version of Clang or ICC!)
>
> And when I say "I don't want to" I mean "I am never ever going to".
>
> > I don't think that there is a good way to change the compiler itself without
> > making 'extern template' absolutely meaningless.
>
> I absolutely disagree.
>
> It would still give a reduction in object file size for cases where the
> compiler decides not to inline, and still make compilation much faster for
> -O0 and -O1.

That is fair; I guess it would slightly reduce 'link' time because of that. I
doubt people would be willing to put up with the STL compiling that much slower
though (which seems to be the major user of this feature in my experience).

> One property of -O2 and -O3 is that we try to optimize aggressively even if
> that takes a long time to compile. So we could instantiate things that have
> an explicit instantiation declaration (thus doing "redundant" work) to see
> if inlining them would be beneficial. That would take longer to compile, but
> might produce faster code. If the heuristics decide the instantiation ends
> up too big to inline, it could just discard it (because we know there's a
> definition elsewhere).

That is essentially what the frontends DO, except only with the 'inline'
functions. If the inliner chooses to not inline it, it gets thrown out (since
we've marked it 'available externally').

> If the only way to get that is to mark every function as 'inline' (and then
> "trick" the compiler into doing all that extra work even at -O1?) then we
> might as well add 'inline' to every single function template in  and
> , , etc. so they're all potential candidates
> for inlining.
>
> And if we have to mark every single function as 'inline' then maybe the
> compiler shouldn't be using it as a hint.

I don't think the idea is to mark EVERY function 'inline', simply ones that are
pretty tiny and really good candidates for inlining.
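
For reference, a minimal sketch of the shape of code under discussion. The
names here (Buffer, push, construct_one, fill_and_count) are hypothetical
stand-ins, not libstdc++ or libc++ identifiers: push plays the role of the
inlined constructor and construct_one the role of _M_construct. If the
heuristic works the way I described above (which, again, I have only checked
briefly), push would be instantiated and handed to the optimizer at -O1 and
up, while construct_one would stay an out-of-line call into the explicit
instantiation.

```cpp
#include <cstddef>

// Hypothetical class template standing in for something like std::basic_string.
template <typename T>
struct Buffer {
    T data[8];
    std::size_t used = 0;

    // Defined in the class body, so it is implicitly 'inline'.
    void push(const T& v) { construct_one(v); }

    // Declared here, defined out of class WITHOUT 'inline'.
    void construct_one(const T& v);
};

template <typename T>
void Buffer<T>::construct_one(const T& v) {
    if (used < 8) data[used++] = v;
}

// Explicit instantiation declaration: tells this translation unit that
// Buffer<int> is instantiated somewhere else.
extern template struct Buffer<int>;

int fill_and_count() {
    Buffer<int> b{};
    b.push(1);  // calls an implicitly inline member
    b.push(2);  // which forwards to the non-inline construct_one
    return static_cast<int>(b.used);
}

// Explicit instantiation definition. It would normally live in a separate
// .cpp file (or the library itself); it is placed here only so the sketch
// links on its own. It must follow the declaration above.
template struct Buffer<int>;
```

Adding 'inline' to the out-of-class definition of construct_one is the kind of
change the submitter is asking for. Whether it actually changes the generated
code depends on the compiler and optimization level, so compare the output at
-O0 and -O2 rather than taking this sketch as proof.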