public inbox for gcc-bugs@sourceware.org
* [Bug libstdc++/94960] New: extern template prevents inlining of standard library objects
@ 2020-05-05 20:09 krzysio.kurek at wp dot pl
  2020-05-05 21:02 ` [Bug libstdc++/94960] " pinskia at gcc dot gnu.org
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: krzysio.kurek at wp dot pl @ 2020-05-05 20:09 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94960

            Bug ID: 94960
           Summary: extern template prevents inlining of standard library
                    objects
           Product: gcc
           Version: 9.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: krzysio.kurek at wp dot pl
  Target Milestone: ---

Consider this example
void foo()
{
  std::string(1, 0);
}
(https://godbolt.org/z/AlkBBJ)
This function creates a string using the `basic_string(size_t, CharT)`
constructor and then discards it. This particular constructor uses _M_construct
internally, which is declared as an out-of-line member function. Because of
this, and because the function isn't marked `inline`, once the compiler has
seen `extern template class basic_string<char>;` it forgoes instantiating
_M_construct in the current translation unit and emits a call to it instead.
As a result, foo() fully constructs a string object and then destroys it,
since the body of _M_construct is not available for inlining.

This problem applies to every member function that is defined out of line and
not marked `inline`, in any class template that has an `extern template`
declaration.
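
The pattern can be reproduced without the standard library. The sketch below
is a hypothetical stand-in, not libstdc++'s code: `Filled`, `fill` and
`make_and_discard` are invented names, with `fill` playing the role of
_M_construct. The explicit instantiation definition at the end would normally
live in a separate translation unit (for std::string, inside the library); it
is included here only so the example links on its own.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for basic_string; all names here are invented.
template <typename T>
class Filled {
public:
    // Defined in-class, so implicitly inline: still instantiated on use.
    Filled(std::size_t n, T value) { fill(n, value); }
    std::size_t size() const { return size_; }
    T front() const { return data_[0]; }

private:
    // Declared here, defined out of line below (the _M_construct role).
    void fill(std::size_t n, T value);

    static constexpr std::size_t capacity = 16;
    T data_[capacity] = {};
    std::size_t size_ = 0;
};

// Out-of-line definition that is NOT marked 'inline'.
template <typename T>
void Filled<T>::fill(std::size_t n, T value) {
    if (n > capacity) n = capacity;  // clamp to the fixed buffer
    for (std::size_t i = 0; i < n; ++i) data_[i] = value;
    size_ = n;
}

// Explicit instantiation declaration: suppresses implicit instantiation
// of the non-inline members, such as fill().
extern template class Filled<char>;

std::size_t make_and_discard() {
    Filled<char> f(1, 'x');
    assert(f.front() == 'x');
    return f.size();
}

// Explicit instantiation definition; assumed to live in another
// translation unit in a real library.
template class Filled<char>;
```

Compiling this with optimization enabled and inspecting the assembly for
make_and_discard() should show whether the call to fill() was inlined; the
result may differ between compilers and versions.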


* [Bug libstdc++/94960] extern template prevents inlining of standard library objects
From: pinskia at gcc dot gnu.org @ 2020-05-05 21:02 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94960

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
g:1a289fa36294627c252492e4c18d7877a7c80dc1 changed that.


* [Bug libstdc++/94960] extern template prevents inlining of standard library objects
From: redi at gcc dot gnu.org @ 2020-05-05 21:08 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94960

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2020-05-05
     Ever confirmed|0                           |1
           Keywords|                            |missed-optimization

--- Comment #2 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Please provide complete testcases, not just URLs, as required by
https://gcc.gnu.org/bugs

#include <string>

int main()
{
    std::string(size_t(0), 0);
}


I still think it's wrong for GCC to treat the 'inline' specifier as an inlining
hint. The compiler should be a better judge of inlining decisions than the
developer.

(In reply to Andrew Pinski from comment #1)
> g:1a289fa36294627c252492e4c18d7877a7c80dc1 changed that.

Well that commit just meant that the explicit instantiations are declared for
C++17 as well, where previously they were only declared for < C++17. It didn't
add the explicit instantiations.


* [Bug libstdc++/94960] extern template prevents inlining of standard library objects
From: erich.keane at intel dot com @ 2020-05-06  0:19 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94960

Erich Keane <erich.keane at intel dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |erich.keane at intel dot com

--- Comment #3 from Erich Keane <erich.keane at intel dot com> ---
(In reply to Jonathan Wakely from comment #2)
> Please provide complete testcases, not just URLs, as required by
> https://gcc.gnu.org/bugs
> 
> #include <string>
> 
> int main()
> {
>     std::string(size_t(0), 0);
> }
> 
> 
> I still think it's wrong for GCC to treat the 'inline' specifier as an
> inlining hint. The compiler should be a better judge of inlining decisions
> than the developer.
> 
> (In reply to Andrew Pinski from comment #1)
> > g:1a289fa36294627c252492e4c18d7877a7c80dc1 changed that.
> 
> Well that commit just meant that the explicit instantiations are declared
> for C++17 as well, where previously they were only declared for < C++17. It
> didn't add the explicit instantiations.

Hi Jon!
I helped the submitter in #llvm debug this a little, so I perhaps have a better
understanding of his issue:

As you know, "extern template" is a hint to the compiler that we don't need to
emit the template as a way to save on compile time.

Both GCC and clang will NOT instantiate these templates in O0 mode.  However,
in O1+ modes, both will actually still instantiate the templates in the
frontend, BUT only for 'inline' functions.  Basically, we're using 'inline' as
a heuristic that there is benefit in sending these functions to the optimizer
(basically, sacrificing the compile time gained by 'extern template' in
exchange for a better inlining experience).

In the submitter's case, the std::string constructor calls "_M_construct".  The
constructor is inlined, but _M_construct is not, since it never gets to the
optimizer.

libc++ uses an __init function to do the same thing as _M_construct, however IT
is marked inline, and thus doesn't have the problem.

I believe the submitter wants to have you mark more of the functions in
extern-templated classes 'inline' so that it matches the heuristic better.

I don't think that there is a good way to change the compiler itself without
making 'extern template' absolutely meaningless.
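
To illustrate the heuristic, here is a minimal sketch with two otherwise
identical out-of-line members, one of which is marked `inline` at its
definition. `Counter`, `bump`, `bump_inline` and `use_both` are hypothetical
names, and the explicit instantiation definition at the end is assumed to
live in another translation unit in real code. Under the behaviour described
above, only bump_inline() would be instantiated in the using translation unit
and reach the optimizer.

```cpp
#include <cassert>

// Hypothetical class template; names are invented for illustration.
template <typename T>
struct Counter {
    int bump(int x);         // out of line, not marked inline
    int bump_inline(int x);  // out of line, marked inline at the definition
};

template <typename T>
int Counter<T>::bump(int x) { return x + 1; }

template <typename T>
inline int Counter<T>::bump_inline(int x) { return x + 1; }

// After this declaration, bump() is not implicitly instantiated here,
// while bump_inline() still may be because it is an inline function.
extern template struct Counter<int>;

int use_both() {
    Counter<int> c;
    int result = c.bump(1) + c.bump_inline(1);
    assert(result == 4);
    return result;
}

// Provides the out-of-line copy of bump(); assumed to be in a library
// translation unit in practice.
template struct Counter<int>;
```

Whether either call is actually inlined depends on the compiler, its version
and the optimization level, so the generated assembly is the only reliable
check.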


* [Bug c++/94960] extern template prevents inlining of standard library objects
From: rguenth at gcc dot gnu.org @ 2020-05-06  7:05 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94960

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org,
                   |                            |jason at gcc dot gnu.org
          Component|libstdc++                   |c++

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
I guess the C++ FE could honor -finline-functions and consider all functions
having the 'inline' hint in that case.  I'm not sure how wide-spread
explicit instantiations are and what compile-time (and size?) hit we get
when doing the instantiations always.

That is, is the middle-end smart enough to not emit out-of-line instances
for the inline instantiated extern template parts?  Off the top of my head
I'm not aware of any middle-end flagging of this?


* [Bug c++/94960] extern template prevents inlining of standard library objects
From: redi at gcc dot gnu.org @ 2020-05-06  9:11 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94960

--- Comment #5 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to Erich Keane from comment #3)
> As you know, "extern template" is a hint to the compiler that we don't need
> to emit the template as a way to save on compile time.
> 
> Both GCC and clang will NOT instantiate these templates in O0 mode. 
> However, in O1+ modes, both will actually still instantiate the templates in
> the frontend, BUT only for 'inline' functions.  Basically, we're using
> 'inline' as a heuristic that there is benefit in sending these functions to
> the optimizer (basically, sacrificing the compile time gained by 'extern
> template' in exchange for a better inlining experience).

Hmm, I've seen different behaviours for clang and g++ in this respect, with
clang inlining a lot more of std::string's members. So I'm surprised they use
the same heuristic.

Do they both instantiate the function templates marked 'inline' even at -O1?
Presumably not at -O0.

> In the submitter's case, the std::string constructor calls "_M_construct". 
> The constructor is inlined, but _M_construct is not, since it never gets to
> the optimizer.
> 
> libc++ uses an __init function to do the same thing as _M_construct, however
> IT is marked inline, and thus doesn't have the problem.
> 
> I believe the submitter wants to have you mark more of the functions in
> extern-templated classes 'inline' so that it matches the heuristic better.

And that's what I don't want to do. I think it's wrong for the human to say
"inline this!" because humans are stupid (well, I am anyway). And I don't want
to have to examine the GIMPLE/asm again for every new GCC release to decide
whether 'inline' is still in the right places (and whether the answer should be
different for every different version of Clang or ICC!)

And when I say "I don't want to" I mean "I am never ever going to".

> I don't think that there is a good way to change the compiler itself without
> making 'extern template' absolutely meaningless.

I absolutely disagree.

It would still give a reduction in object file size for cases where the
compiler decides not to inline, and still make compilation much faster for -O0
and -O1.

One property of -O2 and -O3 is that we try to optimize aggressively even if
that takes a long time to compile. So we could instantiate things that have an
explicit instantiation declaration (thus doing "redundant" work) to see if
inlining them would be beneficial. That would take longer to compile, but might
produce faster code. If the heuristics decide the instantiation ends up too big
to inline, it could just discard it (because we know there's a definition
elsewhere).

If the only way to get that is to mark every function as 'inline' (and then
"trick" the compiler into doing all that extra work even at -O1?) then we might
as well add 'inline' to every single function template in <string> and
<istream>, <ostream>, <streambuf> etc. so they're all potential candidates for
inlining.

And if we have to mark every single function as 'inline' then maybe the
compiler shouldn't be using it as a hint.


* [Bug c++/94960] extern template prevents inlining of standard library objects
From: erich.keane at intel dot com @ 2020-05-06 13:01 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94960

--- Comment #6 from Erich Keane <erich.keane at intel dot com> ---
(In reply to Jonathan Wakely from comment #5)
> (In reply to Erich Keane from comment #3)
> > As you know, "extern template" is a hint to the compiler that we don't need
> > to emit the template as a way to save on compile time.
> > 
> > Both GCC and clang will NOT instantiate these templates in O0 mode. 
> > However, in O1+ modes, both will actually still instantiate the templates in
> > the frontend, BUT only for 'inline' functions.  Basically, we're using
> > 'inline' as a heuristic that there is benefit in sending these functions to
> > the optimizer (basically, sacrificing the compile time gained by 'extern
> > template' in exchange for a better inlining experience).
> 
> Hmm, I've seen different behaviours for clang and g++ in this respect, with
> clang inlining a lot more of std::string's members. So I'm surprised they
> use the same heuristic.
> 
> Do they both instantiate the function templates marked 'inline' even at -O1?
> Presumably not at -O0.

My understanding of Clang is based on a brief debugging session. My
understanding of GCC's behavior here is a brief amount of time messing around
on godbolt. I could very well be incorrect.


> 
> > In the submitter's case, the std::string constructor calls "_M_construct". 
> > The constructor is inlined, but _M_construct is not, since it never gets to
> > the optimizer.
> > 
> > libc++ uses an __init function to do the same thing as _M_construct, however
> > IT is marked inline, and thus doesn't have the problem.
> > 
> > I believe the submitter wants to have you mark more of the functions in
> > extern-templated classes 'inline' so that it matches the heuristic better.
> 
> And that's what I don't want to do. I think it's wrong for the human to say
> "inline this!" because humans are stupid (well, I am anyway). And I don't
> want to have to examine the GIMPLE/asm again for every new GCC release to
> decide whether 'inline' is still in the right places (and whether the answer
> should be different for every different version of Clang or ICC!)
> 
> And when I say "I don't want to" I mean "I am never ever going to".
> 
> > I don't think that there is a good way to change the compiler itself without
> > making 'extern template' absolutely meaningless.
> 
> I absolutely disagree.
> 
> It would still give a reduction in object file size for cases where the
> compiler decides not to inline, and still make compilation much faster for
> -O0 and -O1.

That is fair, I guess it would slightly reduce 'link' time because of that. I
doubt people would be willing to put up with the STL compiling that much slower
though (which seems to be the major user of this feature in my experience).

> One property of -O2 and -O3 is that we try to optimize aggressively even if
> that takes a long time to compile. So we could instantiate things that have
> an explicit instantiation declaration (thus doing "redundant" work) to see
> if inlining them would be beneficial. That would take longer to compile, but
> might produce faster code. If the heuristics decide the instantiation ends
> up too big to inline, it could just discard it (because we know there's a
> definition elsewhere).

That is essentially what the frontends DO, except only with the 'inline'
functions.  If the inliner chooses to not inline it, it gets thrown out (since
we've marked it 'available externally').

> If the only way to get that is to mark every function as 'inline' (and then
> "trick" the compiler into doing all that extra work even at -O1?) then we
> might as well add 'inline' to every single function template in <string> and
> <istream>, <ostream>, <streambuf> etc. so they're all potential candidates
> for inlining.
> 
> And if we have to mark every single function as 'inline' then maybe the
> compiler shouldn't be using it as a hint.

I don't think the idea is to mark EVERY function 'inline', simply ones that are
pretty tiny and really good candidates for inlining.


* [Bug c++/94960] extern template prevents inlining of standard library objects
From: jason at gcc dot gnu.org @ 2022-02-17 17:01 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94960

--- Comment #7 from Jason Merrill <jason at gcc dot gnu.org> ---
C++17 and below said, 

Except for inline functions and variables, declarations with types deduced from
their initializer or return value (10.1.7.4), const variables of literal types,
variables of reference types, and class template specializations, explicit
instantiation declarations have the effect of suppressing the implicit
instantiation of the entity to which they refer. [ Note: The intent is that an
inline function that is the subject of an explicit instantiation declaration
will still be implicitly instantiated when odr-used (6.2) so that the body can
be considered for inlining, but that no out-of-line copy of the inline function
would be generated in the translation unit. — end note ]

This wording was changed in C++20 by P1815, a modules paper, but I believe the
replacement wording still says that a function is not implicitly instantiated
after an explicit instantiation declaration unless it is inline or has a
deduced return type.  And if it isn't instantiated, it can't be inlined.

So if you want a function that is the subject of an *explicit instantiation
declaration* to still be considered for inlining, you need to declare it
inline.  Most templates don't need to be declared inline, only those that have
implicit instantiations suppressed by 'extern template'.
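
The deduced-return-type exception can be sketched as follows. `Box`, `get`,
`twice` and `use_box` are hypothetical names chosen for illustration, and the
closing explicit instantiation definition is assumed to sit in a different
translation unit in real code. Under the wording above, get() has its
implicit instantiation suppressed, while twice() must still be instantiated
at the call site so that its return type can be deduced.

```cpp
#include <cassert>

// Hypothetical aggregate; names are invented for illustration.
template <typename T>
struct Box {
    T value;
    T get() const;       // ordinary member: suppressed by 'extern template'
    auto twice() const;  // deduced return type: still implicitly instantiated
};

template <typename T>
T Box<T>::get() const { return value; }

template <typename T>
auto Box<T>::twice() const { return value + value; }

// Explicit instantiation declaration.
extern template struct Box<int>;

int use_box() {
    Box<int> b{21};
    int result = b.get() + b.twice();  // 21 + 42
    assert(result == 63);
    return result;
}

// Explicit instantiation definition, supplying the out-of-line get().
template struct Box<int>;
```

This needs C++14 or later for the deduced return type.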

So, I think the string case specifically is a library issue.


* [Bug c++/94960] extern template prevents inlining of standard library objects
From: redi at gcc dot gnu.org @ 2022-02-17 23:54 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94960

--- Comment #8 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to Erich Keane from comment #6)
> (In reply to Jonathan Wakely from comment #5)
> > And if we have to mark every single function as 'inline' then maybe the
> > compiler shouldn't be using it as a hint.
> 
> I don't think the idea is to mark EVERY function 'inline', simply ones that
> are pretty tiny and really good candidates for inlining.

But that's exactly what we do. _M_construct isn't tiny, it has two loops (and
until quite recently, a try-catch block, but that's been replaced). There are
some functions in <bits/basic_string.tcc> which are probably small enough to be
marked 'inline', so I should review those. Not for GCC 12 though.

But in C++20 every function is 'constexpr' now, so every function is inline
anyway, right? Even the large functions that aren't good candidates for
inlining (see also PR 93008). So the 'inline' keyword has lost all meaning in
<string> now.
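
The point about 'constexpr' can be sketched with a hypothetical class
template (`Acc`, `sum_to` and `sum_demo` are invented names, not library
code). A constexpr function is implicitly inline, so a member like sum_to()
is still instantiated after the explicit instantiation declaration even
though it is defined out of line and never spelled 'inline'. The
static_assert only compiles because that instantiation takes place.

```cpp
#include <cassert>

// Hypothetical class template; names are invented for illustration.
template <typename T>
struct Acc {
    constexpr T sum_to(T n) const;  // constexpr implies inline
};

// Out-of-line definition with a loop, so not a "tiny" function.
template <typename T>
constexpr T Acc<T>::sum_to(T n) const {
    T total{};
    for (T i{}; i <= n; ++i) total += i;
    return total;
}

// Explicit instantiation declaration.
extern template struct Acc<int>;

// Constant evaluation requires the definition to be instantiated here.
static_assert(Acc<int>{}.sum_to(4) == 10, "0+1+2+3+4 == 10");

int sum_demo() {
    int result = Acc<int>{}.sum_to(4);
    assert(result == 10);
    return result;
}

// Explicit instantiation definition, assumed to live in a library
// translation unit in practice.
template struct Acc<int>;
```

This needs C++14 or later because of the loop inside a constexpr function.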


* [Bug c++/94960] extern template prevents inlining of standard library objects
From: erich.keane at intel dot com @ 2022-02-18  0:53 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94960

--- Comment #9 from Erich Keane <erich.keane at intel dot com> ---
> But in C++20 every function is 'constexpr' now, so every function is inline
> anyway, right? Even the large functions that aren't good candidates for
> inlining (see also PR 93008). So the 'inline' keyword has lost all meaning
> in <string> now.

Do you mean 'every function in std::string'?  If so, you'd know better than I.
In a general case, every function is NOT 'constexpr', and that didn't pass EWG.

