public inbox for binutils@sourceware.org
* KDE: C++ and shared libraries
@ 2001-09-11 11:33 Lubos Lunak
  2001-09-11 14:40 ` Jakub Jelinek
  0 siblings, 1 reply; 6+ messages in thread
From: Lubos Lunak @ 2001-09-11 11:33 UTC (permalink / raw)
  To: gcc, binutils


 Hello,

 could you please read this ?

 http://dforce.sh.cvut.cz/~seli/en/linking2/

 I think I've discovered several inefficiencies in gcc and GNU ld that are 
causing serious memory overhead in KDE applications. The tests were mainly 
done with gcc-2.95.3 and gcc-2.96. I also tried some of them with gcc3, which 
made little difference and was sometimes even worse: for example, the gcc3 
version of the conflicts file referenced in the document is at 
http://dforce.sh.cvut.cz/~seli/en/linking2/conflicts_gcc3.txt.bz2 and the 
number of conflicts in it is doubled.

 I'm subscribed to both these mailing lists, and I'd like to know your 
opinion (and of course, I'd also like to get the problems fixed).

 Thank you

 Lubos Lunak
-- 
 l.lunak@email.cz ; l.lunak@kde.org
  http://dforce.sh.cvut.cz/~seli


* Re: KDE: C++ and shared libraries
  2001-09-11 11:33 KDE: C++ and shared libraries Lubos Lunak
@ 2001-09-11 14:40 ` Jakub Jelinek
  2001-09-13 12:46   ` Lubos Lunak
  0 siblings, 1 reply; 6+ messages in thread
From: Jakub Jelinek @ 2001-09-11 14:40 UTC (permalink / raw)
  To: Lubos Lunak; +Cc: gcc, binutils

On Tue, Sep 11, 2001 at 08:33:18PM +0200, Lubos Lunak wrote:
>  http://dforce.sh.cvut.cz/~seli/en/linking2/

a) prelink is still in development, and although it already does some
minimal optimizations to reduce the number of conflicts (e.g. it removes
conflicts in provably unused virtual tables or typeinfo structures), there
are still other optimizations I plan to implement in the future, see below.

b) the number of conflict lines in LD_DEBUG_PRELINKING output is not equal
to the number of conflicts, so it really does not matter how many lines it
prints for the same symbol - what matters is how many relocations against the
symbols mentioned in LD_DEBUG_PRELINKING are in the libs

c) concerning the .data section, apart from .gnu.linkonce.d stuff (virtual
tables, typeinfo, etc.) it could probably help if gcc differentiated things
into .data, .data.reloc, .data.local_reloc sections (the latter two for
things which are RW only because they contain relocations, the last one if
all the relocations are against static symbols only) if this happens often
(this would need some statistics). I write .data.* because then it means no
changes in the linker are necessary.

Optimizations I want to try are:
1) if prelink would run all programs with LD_TRACE_PRELINKING during collect
pass too, some statistics could be gathered about which symbols are often
conflicting and prelink could reorder virtual tables/typeinfo structures
(and eventually if libs are linked with --emit-relocs all other data
too) so that data which will need runtime fixups comes together and likewise
for data which won't need runtime fixups
2) similarly, if conflict statistics are gathered, some virtual table methods
(especially with conflicting symbols which happen in several vtables)
could be lazy bound (leave the first one with normal reloc, the rest
initially pointing to some stub which would copy the function pointer
resp. descriptor from the first one), which would mean the methods which are
never run would not have to be written into
3) implement
http://sources.redhat.com/ml/binutils/2001-07/msg00200.html
under some special option.
Jason is right in that this cannot be generally used because in C++ there is
no way (apart from symbol versioning which is not easy to do in C++) to
specify what is really the exported API of a library and what is just a side
effect of the implementation. But if
a) ld supported the .gnu.linkonce.* sections marked SHF_MERGE specially
   as described in the thread (ie. such thing is not only to be merged
   across one shared object or app, but can also be deleted entirely
   if one of its DT_NEEDED libs contains it). Perhaps ld would only support
   this under some special option
b) g++ emitted this if given some special switch, off by default
c) some tightly coupled packages could use this switch
This would make libs/apps compiled/linked with that switch smaller (because
some redundant linkonce stuff could be killed), thus they could contain
fewer relocations and, with prelinking, fewer conflicts.
I've played with nm, readelf etc. a little bit and the figure is
that this would save ~ 100KB of memory footprint in kmail application on IA-32
in old (2.95/2.96-RH) ABI, not counting the savings in SHT_REL* sections and
far smaller .gnu.conflict section. Dunno how much would it do with gcc3 yet.
The requirements for this would be that if some C++ library against which
something was compiled with this magic -flinkonce-accross-libs is changed,
it needs to ensure it exports at least the same set of
nm -D library | sed -n 's/^[^ ]* W \(.*\)$/\1/p' | LC_ALL=C sort
symbols as it exported before, or anything which was compiled/linked with
-flinkonce-accross-libs requires the newer package.
My understanding is that at least most of the common KDE stuff is usually
built together, in which case this core part could be built with
-flinkonce-accross-libs. If some linkonce symbol goes away from some lib,
it would mean all the stuff built with -flinkonce-accross-libs against such
library needs to be updated too. Programs/libs outside of this core part
(ie. not built with -flinkonce-accross-libs) don't need to care, for them
the core part acts as a blackbox which provides the linkonce symbols they
need.
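
The export rule above can be checked mechanically. The following is only a sketch, assuming the output of `nm -D` for the old and the new library has already been captured as text; the function names weak_exports and missing_exports are made up for illustration and are not part of any tool. It mirrors the sed expression by keeping lines of the form "<address> W <symbol>" and then reports symbols that were exported before but are gone now.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Collect weak dynamic symbols from `nm -D`-style text, mirroring
//   sed -n 's/^[^ ]* W \(.*\)$/\1/p'
std::set<std::string> weak_exports(const std::string& nm_output) {
    std::set<std::string> symbols;
    std::istringstream in(nm_output);
    std::string line;
    while (std::getline(in, line)) {
        const std::size_t sp = line.find(' ');           // end of address field
        if (sp == std::string::npos) continue;
        if (line.compare(sp, 3, " W ") != 0) continue;   // keep only type 'W'
        symbols.insert(line.substr(sp + 3));
    }
    return symbols;
}

// Symbols exported before but not after. The proposed rule holds only if
// this list is empty.
std::vector<std::string> missing_exports(const std::set<std::string>& before,
                                         const std::set<std::string>& after) {
    std::vector<std::string> gone;
    std::set_difference(before.begin(), before.end(),
                        after.begin(), after.end(),
                        std::back_inserter(gone));
    return gone;
}

// Usage sketch with made-up nm output (addresses are invented).
void weak_exports_example() {
    const auto before = weak_exports(
        "0001a2b0 W _ZN3FooIiE3barEv\n"
        "0001a300 W _ZTV3FooIiE\n");
    const auto after = weak_exports("0002b000 W _ZN3FooIiE3barEv\n");
    assert(missing_exports(before, after).size() == 1);  // the vtable went away
}
```

A non-empty result would mean that everything built with the proposed switch against the old library has to require the newer package.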

	Jakub


* Re: KDE: C++ and shared libraries
  2001-09-11 14:40 ` Jakub Jelinek
@ 2001-09-13 12:46   ` Lubos Lunak
  0 siblings, 0 replies; 6+ messages in thread
From: Lubos Lunak @ 2001-09-13 12:46 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc, binutils


On Tue, 11 September 2001 23:42, you wrote:
> On Tue, Sep 11, 2001 at 08:33:18PM +0200, Lubos Lunak wrote:
> >  http://dforce.sh.cvut.cz/~seli/en/linking2/
>
> a) prelink is still in development, and although it already does some
> minimal optimizations to minimize number of conflicts (like e.g. removes
> conflicts in provably unused virtual tables or typeinfo structures), there
> are still other optimizations I plan to implement in the future, see below.

 Great.

>
> b) the number of conflict lines in LD_DEBUG_PRELINKING output is not equal
> to the number of conflicts, so it really does not matter how many lines for
> the same symbol it prints - what matters is how many relocations against the
> symbols mentioned in LD_DEBUG_PRELINKING are in the libs
>
> c) concerning .data section, apart from .gnu.linkonce.d stuff (virtual
> tables, typeinfo, etc.) it could probably help if gcc differentiated things
> into .data, .data.reloc, .data.local_reloc sections (the latter two for
> things which are RW only because they contain relocations, the last one if
> all the relocations are against static symbols only) if this happens often
> (this would need some statistics). I write .data.* because then it means no
> changes in the linker are necessary.

 What exactly does 'relocations against static symbols' mean? Symbols in the 
same .o? If so, there are many such cases, e.g. in Qt3 moc-generated 
sources (lots of const char* const txt[] etc.).
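
 To make the kind of data in question concrete, here is a small sketch. It assumes that "static symbols" means symbols local to the object file, and the string contents are invented. The first table stores addresses, so in a shared library each element needs a load-time relocation and the page holding it gets written by the dynamic loader. The second form is a commonly used alternative that nobody in this thread proposed: one character block plus integer offsets, which contains no addresses at all.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Variant 1: an array of pointers. Although declared const, every element
// holds an address, so a position-independent shared library needs one
// relocation per element when it is loaded.
static const char* const names_ptr[] = { "clicked", "toggled", "destroyed" };

// Variant 2: the same strings as one character block plus offsets into it.
// Neither object contains an address, so no relocation is needed.
static const char names_blob[] = "clicked\0toggled\0destroyed";
static const unsigned short names_off[] = { 0, 8, 16 };

const char* name_via_pointer(std::size_t i) { return names_ptr[i]; }
const char* name_via_offset(std::size_t i)  { return names_blob + names_off[i]; }

// Usage sketch: both lookups yield the same strings.
void name_tables_example() {
    for (std::size_t i = 0; i < 3; ++i)
        assert(std::strcmp(name_via_pointer(i), name_via_offset(i)) == 0);
}
```

 Whether a compiler could place tables like the first one into a separate .data.local_reloc-style section is exactly the open question above; the sketch only shows what such data looks like.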

>
> Optimizations I want to try are:
> 1) if prelink would run all programs with LD_TRACE_PRELINKING during
> collect pass too, some statistics could be gathered about which symbols are
> often conflicting and prelink could reorder virtual tables/typeinfo
> structures (and eventually if libs are linked with --emit-relocs all other
> data too) so that data which will need runtime fixups comes together and
> likewise for data which won't need runtime fixups
> 2) similarly, if conflict statistics is gathered, some virtual table
> methods (especially with conflicting symbols which happen in several
> vtables) could be lazy bound (leave the first one with normal reloc, the
> rest initially pointing to some stub which would copy the function pointer
> resp. descriptor from the first one), which would mean the methods which
> are never run would not have to be written into
> 3) implement
> http://sources.redhat.com/ml/binutils/2001-07/msg00200.html
> under some special option.
> Jason is right in that this cannot be generally used because in C++ there
> is no way (apart from symbol versioning which is not easy to do in C++) to
> specify what is really the exported API of a library and what is just a
> side effect of the implementation. But if
> a) ld supported the .gnu.linkonce.* sections marked SHF_MERGE specially
>    as described in the thread (ie. such thing is not only to be merged
>    across one shared object or app, but can also be deleted entirely
>    if one of its DT_NEEDED libs contains it). Perhaps ld would only support
>    this under some special option
> b) g++ emitted this if given some special switch, off by default
> c) some tightly coupled packages could use this switch
> This would make libs/apps compiled/linked with that switch smaller (because
> some redundant linkonce stuff could be killed), thus they could contain
> fewer relocations and, with prelinking, fewer conflicts.
> I've played with nm, readelf etc. a little bit and the figure is
> that this would save ~ 100KB of memory footprint in kmail application on
> IA-32 in old (2.95/2.96-RH) ABI, not counting the savings in SHT_REL*
> sections and far smaller .gnu.conflict section. Dunno how much would it do
> with gcc3 yet. The requirements for this would be that if some C++ library
> against which something was compiled with this magic
> -flinkonce-accross-libs is changed, it needs to ensure it exports at least
> the same set of
> nm -D library | sed -n 's/^[^ ]* W \(.*\)$/\1/p' | LC_ALL=C sort
> symbols as it exported before, or anything which was compiled/linked with
> -flinkonce-accross-libs requires the newer package.
> My understanding is that at least most of the common KDE stuff is usually
> built together, in which case this core part could be built with
> -flinkonce-accross-libs. If some linkonce symbol goes away from some lib,
> it would mean all the stuff built with -flinkonce-accross-libs against such
> library needs to be updated too. Programs/libs outside of this core part
> (ie. not built with -flinkonce-accross-libs) don't need to care, for them
> the core part acts as a blackbox which provides the linkonce symbols they
> need.

 The nm line prints more than 1800 lines in libqt, 1300 in libkdeui (gcc3 
compiled). I'm not sure it wouldn't be too much work to maintain this. It 
would also depend on whether -O2 is used (i.e. inlining or no inlining).
On the other hand, many of the symbols are virtual thunks, out-of-line copies 
of virtual inline methods and such things that simply can't disappear from 
the libraries, so maybe if we switched to explicit template instantiation... 
I think we'll consider this option, especially if there's no better idea, and 
IMHO we should use explicit template instantiation anyway.

 But I think this linkonce-across-libs could be done always for some symbols, 
as some symbols simply cannot disappear from a library without breaking 
binary compatibility in the public API. E.g. if there's class A in a library, 
its vtable, typeinfo, thunks and out-of-line copies of virtual inline methods 
simply must be present in the library too (or am I wrong here?).

 Also, I think prelink could probably play the role here of a mysterious 
stranger who comes and saves everybody :). I think that, while prelinking 
the libs, it could do the reverse of what the dynamic loader does during 
symbol lookup. E.g. if you have this:

a.h:
template< typename T > class Foo { public: virtual void bar() {} };

a.cpp:
#include "a.h"
template class Foo< int >;  // explicit instantiation

b.cpp:
#include "a.h"
template class Foo< int >;  // explicit instantiation

 a.cpp is used to create libta; b.cpp is used to create libtb, which is linked 
with -lta. Among other things, both contain Foo< int >::bar(). Now when 
loading libta and libtb, all references to Foo< int >::bar() are made to 
point to the one created in b.o in libtb, which causes a conflict. However, 
both generated copies of Foo< int >::bar() should be the same, so if prelink 
made all references point to the one in a.o in libta instead, this 
would avoid the conflict without any functional change. Moreover, if I 
understand things correctly, if libta were replaced by a new version, 
prelinking would become invalid anyway, and therefore symbols possibly 
disappearing from libta shouldn't break anything. 
 The problem here would be if Foo< int >::bar() were intentionally 
different in libtb, e.g. in order to fix a broken one in libta. I don't 
think such a thing is done with templates anywhere in KDE, but it might 
happen (I think I'm starting to hate this 'but it might happen'). In this 
case the version in libtb would have to be marked by the programmer somehow, 
so that gcc would generate info telling prelink to use the version in libtb 
for this symbol. Does this make sense or would there be a problem with this? 
I think this variant of linkonce-across-libs behaviour would be simpler to use.
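
 The libta/libtb scenario can be sketched as a toy model. This is only an illustration under simplifying assumptions: an object is reduced to a name and the set of symbols it defines, lookup is plain first-match over an ordered search scope, and a "conflict" is any symbol whose binding differs between the scope used when prelinking a library and the scope used at run time. The types and function names are invented and do not correspond to prelink or ld.so internals.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy model: an object is a name plus the set of symbols it defines.
struct Object {
    std::string name;
    std::set<std::string> defines;
};

// First-match lookup: return the name of the first object in `scope` that
// defines `symbol`, or an empty string if none does.
std::string resolve(const std::vector<Object>& scope, const std::string& symbol) {
    for (const Object& obj : scope)
        if (obj.defines.count(symbol)) return obj.name;
    return {};
}

// Count referenced symbols whose binding differs between the two scopes.
// Each difference stands for one conflict in this simplified model.
std::size_t count_conflicts(const std::vector<Object>& prelink_scope,
                            const std::vector<Object>& runtime_scope,
                            const std::set<std::string>& referenced) {
    std::size_t n = 0;
    for (const std::string& sym : referenced)
        if (resolve(prelink_scope, sym) != resolve(runtime_scope, sym)) ++n;
    return n;
}

// Usage sketch for the libta/libtb example.
void lookup_order_example() {
    const Object ta{"libta", {"Foo<int>::bar()"}};
    const Object tb{"libtb", {"Foo<int>::bar()"}};
    const std::set<std::string> refs{"Foo<int>::bar()"};
    // libta prelinked on its own, but libtb is searched first at run time.
    assert(count_conflicts({ta}, {tb, ta}, refs) == 1);
    // Preferring the copy in the dependency (libta) removes the conflict.
    assert(count_conflicts({ta}, {ta, tb}, refs) == 0);
}
```

 In this model the suggestion amounts to searching the dependency before its users for symbols that are known to be identical copies.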

 Could somebody also please comment on reducing the number of multiple 
instances of things like vtables or thunks in gcc? Improving this could 
greatly reduce the number of conflicts.

 One more thing: does prelink also help with dlopen-ing? It's used 
extensively in KDE (KParts, Control Center modules, and sometimes just in 
order to reduce the high per-process memory overhead). I tried it; it seemed 
to be a bit faster, but it didn't seem to make much difference. I also don't 
know whether dlopen-ing can cause so much unshared memory because of conflicts.

 Thanks

 Lubos Lunak
-- 
 l.lunak@email.cz ; l.lunak@kde.org
 http://dforce.sh.cvut.cz/~seli


* Re: KDE: C++ and shared libraries
  2001-09-21  9:35 Leon Bottou
  2001-09-21 13:18 ` Lubos Lunak
@ 2001-09-21 13:36 ` Jakub Jelinek
  1 sibling, 0 replies; 6+ messages in thread
From: Jakub Jelinek @ 2001-09-21 13:36 UTC (permalink / raw)
  To: Leon Bottou; +Cc: binutils

On Fri, Sep 21, 2001 at 12:35:16PM -0400, Leon Bottou wrote:
> The correct solution would be to generate the shared objects 
> using  the ld option "--retain-symbols-file" to specify a list of exported symbols.

Or a local: *; default in a version script.
There are 2 problems with this: the linkonce objects are usually linkonce
because C++ semantics requires it, so making them private to shared
libraries may not be compatible with C++ (think e.g. about static variables
in inline functions etc.). Also, when you have the same linkonce routine in
a bunch of shared libraries, it not only wastes a lot of memory, but adds
runtime relocs too (think about the .got relocations for the function calls
it makes, etc.). The -flinkonce-accross-libs stuff would not be IMHO overly
hard to use: if you look at how often a typical large C++ library removes
linkonce symbols, you'll notice it is not that common (e.g. both libkdeui and
libqt lost one single linkonce symbol within a year).
I'll say more once I implement this and do some statistics.
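
The point about static variables in inline functions can be illustrated with a minimal sketch. The names are invented, and a single translation unit cannot reproduce the cross-library case; it only states the guarantee that would be at risk. The language requires every caller of an inline function to see the same function-local static object, which across shared libraries relies on the linkonce definition being visible to the dynamic linker.

```cpp
#include <cassert>

// An inline function with a function-local static. Every caller in the
// program must observe the same `calls` object.
inline int& call_counter() {
    static int calls = 0;
    return calls;
}

// Stand-ins for code that, in the scenario discussed, would live in two
// different shared libraries including the same header.
int use_from_lib_a() { return ++call_counter(); }
int use_from_lib_b() { return ++call_counter(); }

// Usage sketch: both callers increment one shared object.
void shared_counter_example() {
    const int before = call_counter();
    use_from_lib_a();
    use_from_lib_b();
    assert(call_counter() == before + 2);
}
```

If each shared library kept a private copy of call_counter(), the two callers would each increment their own counter and the check above would no longer describe the program's behaviour.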

> There are two ways to implement this 
> - The relevant linkonce section could bear a flag  (SHF_NO_EXPORT)
>   specifying that the global symbols defined in this section should
>   not be exported from shared objects.
> - There could be a new symbol binding type  STB_GLOBAL_NO_EXPORT
>   specifying that this symbol is global for linking purposes but should
>   not be exported from shared objects.

You mean STV_HIDDEN, right?

	Jakub


* Re: KDE: C++ and shared libraries
  2001-09-21  9:35 Leon Bottou
@ 2001-09-21 13:18 ` Lubos Lunak
  2001-09-21 13:36 ` Jakub Jelinek
  1 sibling, 0 replies; 6+ messages in thread
From: Lubos Lunak @ 2001-09-21 13:18 UTC (permalink / raw)
  To: binutils; +Cc: Jakub Jelinek, Leon Bottou


On Fri, 21 September 2001 18:35, you wrote:
> Jakub,
>
> Here is a simple suggestion regarding the linkonce issues described in your
> post http://gcc.gnu.org/ml/gcc/2001-09/msg00421.html
> and in the thread initiated by Jason
>  http://sources.redhat.com/ml/binutils/2001-07/msg00200.html .
>
> I believe that there is a way to solve the problem that does not
> require the programmer to manipulate complicated compiler switches.
>
> We all agree that the problem arises because there is no
> simple way to specify what is the exported API of a shared object.
>
> The correct solution would be to generate the shared objects
> using  the ld option "--retain-symbols-file" to specify a list of exported
> symbols. This list of symbols could be generated by processing the
> public header files, that is to say those header files that declare
> the exported API of the shared object.   I easily envision a gcc option
> such as "gcc --generate-exported-symbols-file  include/*.h".
>
> But this would not solve the problems caused by linkonce sections
> containing template instantiations (same for out-of-line copies of inline
> functions). Executables using this shared object often contain linkonce
> sections with conflicting copies of these template instantiations.
> Hence numerous prelinker conflicts that are useless
> since both codes are the same.
>
> The above mentioned thread envisions means to give linking
> priority to the symbol contained in the shared object.
> This reverses the usual linking priority and eliminates the conflict.
> But this is a complicated semantic change and must be turned off
> in certain cases.  This leads to complicated compiler options that
> very few people will use properly...

 Ok, I don't understand all the details, but why would this be so 
complicated? If prelink did this only for linkonce symbols (templates, 
inlines, vtables), it wouldn't affect the behaviour in almost all cases, as 
the copies would all be the same anyway. The only compiler switch needed 
would simply mark the library for prelink, saying that it's ok to do this 
optimization. Moreover, if there were a way to mark certain symbols 
that shouldn't be optimized this way because they are intentionally 
implemented again, it could even be done with all symbols except those 
marked.

>
> An alternate solution would be to decide that each shared object
> always uses its own copies of the template instantiations and
> never exports them.   This obviously will hide those that are
> not part of the exported API.   This will also hide those that are part
> of the exported API, but this has no consequence since
> these can be regenerated from the public header files.

 I'm afraid this won't always work. For example, you cannot have two copies 
of a static data member of one instantiation of a class template, as that 
could break things.

>
> There is some overhead in having duplicate copies of these instantiations.
> This overhead is limited because there is only one copy per
> shared object (and not one copy per object file).
> A simple way to manually control the overhead would be to
> export explicit template instantiations only.
>
> There are two ways to implement this
> - The relevant linkonce section could bear a flag  (SHF_NO_EXPORT)
>   specifying that the global symbols defined in this section should
>   not be exported from shared objects.
> - There could be a new symbol binding type  STB_GLOBAL_NO_EXPORT
>   specifying that this symbol is global for linking purposes but should
>   not be exported from shared objects.
>
> The first solution possibly has less impact on binutils.
> It would entail a change in binutils to understand the NO_EXPORT flag
> and a change in g++ to set the flag on the appropriate sections.
> Then everything would happen without the need for an additional
> compiler option.
>
> Comments ?
>


 Lubos Lunak
-- 
 llunak@suse.cz ; l.lunak@kde.org
 http://dforce.sh.cvut.cz/~seli


* Re: KDE: C++ and shared libraries
@ 2001-09-21  9:35 Leon Bottou
  2001-09-21 13:18 ` Lubos Lunak
  2001-09-21 13:36 ` Jakub Jelinek
  0 siblings, 2 replies; 6+ messages in thread
From: Leon Bottou @ 2001-09-21  9:35 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: binutils

Jakub,

Here is a simple suggestion regarding the linkonce issues described in your post
 http://gcc.gnu.org/ml/gcc/2001-09/msg00421.html
and in the thread initiated by Jason
 http://sources.redhat.com/ml/binutils/2001-07/msg00200.html .

I believe that there is a way to solve the problem that does not 
require the programmer to manipulate complicated compiler switches.

We all agree that the problem arises because there is no
simple way to specify what is the exported API of a shared object.

The correct solution would be to generate the shared objects 
using  the ld option "--retain-symbols-file" to specify a list of exported symbols.
This list of symbols could be generated by processing the
public header files, that is to say those header files that declare 
the exported API of the shared object.   I easily envision a gcc option 
such as "gcc --generate-exported-symbols-file  include/*.h". 

But this would not solve the problems caused by linkonce sections
containing template instantiations (same for out-of-line copies of inline functions).
Executables using this shared object often contain linkonce sections
with conflicting copies of these template instantiations.
Hence numerous prelinker conflicts that are useless
since both codes are the same.

The above mentioned thread envisions means to give linking
priority to the symbol contained in the shared object.
This reverses the usual linking priority and eliminates the conflict.
But this is a complicated semantic change and must be turned off 
in certain cases.  This leads to complicated compiler options that 
very few people will use properly...

An alternate solution would be to decide that each shared object 
always uses its own copies of the template instantiations and 
never exports them.   This obviously will hide those that are
not part of the exported API.   This will also hide those that are part
of the exported API, but this has no consequence since
these can be regenerated from the public header files.

There is some overhead in having duplicate copies of these instantiations.
This overhead is limited because there is only one copy per
shared object (and not one copy per object file).  
A simple way to manually control the overhead would be to 
export explicit template instantiations only.

There are two ways to implement this 
- The relevant linkonce section could bear a flag  (SHF_NO_EXPORT)
  specifying that the global symbols defined in this section should
  not be exported from shared objects.
- There could be a new symbol binding type  STB_GLOBAL_NO_EXPORT
  specifying that this symbol is global for linking purposes but should
  not be exported from shared objects.

The first solution possibly has less impact on binutils.
It would entail a change in binutils to understand the NO_EXPORT flag
and a change in g++ to set the flag on the appropriate sections.
Then everything would happen without the need for an additional
compiler option.

Comments ?

- Leon Bottou
  <leonb@research.att.com>

