public inbox for gcc@gcc.gnu.org
* KDE: C++ and shared libraries
@ 2001-09-11 11:33 Lubos Lunak
  2001-09-11 14:40 ` Jakub Jelinek
From: Lubos Lunak @ 2001-09-11 11:33 UTC
  To: gcc, binutils

 Hello,

 could you please read this?

 http://dforce.sh.cvut.cz/~seli/en/linking2/

 I think I've discovered several inefficiencies in gcc and GNU ld that are
causing serious memory overhead in KDE applications. The tests were mainly
done with gcc-2.95.3 and gcc-2.96. I also tried some of the tests with gcc3,
which made little difference and was sometimes even worse: e.g. the gcc3
version of the conflicts file referenced in the document is here
http://dforce.sh.cvut.cz/~seli/en/linking2/conflicts_gcc3.txt.bz2 and it
contains twice as many conflicts.

 I'm subscribed to both these mailing lists, and I'd like to know your 
opinion (and of course, I'd also like to get the problems fixed).

 Thank you

 Lubos Lunak
-- 
 l.lunak@email.cz ; l.lunak@kde.org
  http://dforce.sh.cvut.cz/~seli

* Re: KDE: C++ and shared libraries
  2001-09-11 11:33 KDE: C++ and shared libraries Lubos Lunak
@ 2001-09-11 14:40 ` Jakub Jelinek
  2001-09-13 12:46   ` Lubos Lunak
From: Jakub Jelinek @ 2001-09-11 14:40 UTC
  To: Lubos Lunak; +Cc: gcc, binutils

On Tue, Sep 11, 2001 at 08:33:18PM +0200, Lubos Lunak wrote:
>  http://dforce.sh.cvut.cz/~seli/en/linking2/

a) prelink is still in development. It already does some minimal
optimizations to reduce the number of conflicts (e.g. it removes conflicts
in provably unused virtual tables or typeinfo structures), and there are
other optimizations I plan to implement in the future, see below.

b) the number of conflict lines in LD_DEBUG_PRELINKING output is not equal
to the number of conflicts, so it really does not matter how many lines it
prints for the same symbol. What matters is how many relocations against the
symbols mentioned in LD_DEBUG_PRELINKING are in the libs.

c) concerning the .data section, apart from .gnu.linkonce.d stuff (virtual
tables, typeinfo, etc.) it could probably help if gcc split things into
.data, .data.reloc and .data.local_reloc sections. The latter two would hold
things which are RW only because they contain relocations, and the last one
would be used if all the relocations are against static symbols only. This
is worthwhile only if it happens often (this would need some statistics). I
write .data.* because then no changes in the linker are necessary.

Optimizations I want to try are:
1) if prelink ran all programs with LD_TRACE_PRELINKING during the collect
pass too, some statistics could be gathered about which symbols often
conflict, and prelink could reorder virtual tables/typeinfo structures
(and eventually, if libs are linked with --emit-relocs, all other data
too) so that data which will need runtime fixups comes together, and likewise
for data which won't need runtime fixups
2) similarly, if conflict statistics are gathered, some virtual table methods
(especially those with conflicting symbols which occur in several vtables)
could be lazy bound (leave the first one with a normal reloc, the rest
initially pointing to some stub which would copy the function pointer
resp. descriptor from the first one), which would mean the slots of methods
which are never run would not have to be written to
3) implement
http://sources.redhat.com/ml/binutils/2001-07/msg00200.html
under some special option.
Jason is right that this cannot be used generally, because in C++ there is
no way (apart from symbol versioning, which is not easy to do in C++) to
specify what is really the exported API of a library and what is just a side
effect of the implementation. But if
a) ld supported the .gnu.linkonce.* sections marked SHF_MERGE specially
   as described in the thread (i.e. such a thing is not only to be merged
   across one shared object or app, but can also be deleted entirely
   if one of its DT_NEEDED libs contains it). Perhaps ld would only support
   this under some special option
b) g++ emitted this if given some special switch, off by default
c) some tightly coupled packages could use this switch
then libs/apps compiled/linked with that switch would be smaller (because
some redundant linkonce stuff could be killed), thus they could contain
fewer relocations and, with prelinking, fewer conflicts.
I've played with nm, readelf etc. a little bit, and the figure I get is
that this would save ~100KB of memory footprint in the kmail application on
IA-32 in the old (2.95/2.96-RH) ABI, not counting the savings in SHT_REL*
sections and a far smaller .gnu.conflict section. I don't know yet how much
it would save with gcc3.
The requirements for this would be that if some C++ library against which
something was compiled with this magic -flinkonce-accross-libs is changed,
it needs to ensure it exports at least the same set of
nm -D library | sed -n 's/^[^ ]* W \(.*\)$/\1/p' | LC_ALL=C sort
symbols as it exported before, or anything which was compiled/linked with
-flinkonce-accross-libs requires the newer package.
My understanding is that at least most of the common KDE stuff is usually
built together, in which case this core part could be built with
-flinkonce-accross-libs. If some linkonce symbol goes away from some lib,
it would mean all the stuff built with -flinkonce-accross-libs against such
library needs to be updated too. Programs/libs outside of this core part
(ie. not built with -flinkonce-accross-libs) don't need to care, for them
the core part acts as a blackbox which provides the linkonce symbols they
need.
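
The export requirement above amounts to a superset check on the weak dynamic
symbols of two library versions. A minimal sketch of that check follows,
assuming the text output of `nm -D` for the old and new library has already
been captured into strings. The function names weak_exports and
missing_exports are made up for illustration and are not part of any tool.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Collect the names on `nm -D` lines whose type letter is 'W', mirroring
//   sed -n 's/^[^ ]* W \(.*\)$/\1/p'
// (hypothetical helper, not part of binutils or prelink).
std::set<std::string> weak_exports(const std::string& nm_output) {
    std::set<std::string> names;
    std::istringstream in(nm_output);
    std::string line;
    while (std::getline(in, line)) {
        // The address field ends at the first space; " W " must follow it.
        std::string::size_type sp = line.find(' ');
        if (sp == std::string::npos) continue;
        if (line.compare(sp, 3, " W ") != 0) continue;
        names.insert(line.substr(sp + 3));
    }
    return names;
}

// Symbols exported before but no longer exported now. An empty result means
// the new library still provides everything the old one did.
std::vector<std::string> missing_exports(const std::set<std::string>& before,
                                         const std::set<std::string>& after) {
    std::vector<std::string> gone;
    std::set_difference(before.begin(), before.end(),
                        after.begin(), after.end(),
                        std::back_inserter(gone));
    return gone;
}
```

If missing_exports returns anything, everything built with the switch against
the old library would have to require the newer package.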

	Jakub

* Re: KDE: C++ and shared libraries
  2001-09-11 14:40 ` Jakub Jelinek
@ 2001-09-13 12:46   ` Lubos Lunak
From: Lubos Lunak @ 2001-09-13 12:46 UTC
  To: Jakub Jelinek; +Cc: gcc, binutils

On Tue, 11 Sep 2001 at 23:42 you wrote:
> On Tue, Sep 11, 2001 at 08:33:18PM +0200, Lubos Lunak wrote:
> >  http://dforce.sh.cvut.cz/~seli/en/linking2/
>
> a) prelink is still in development. It already does some minimal
> optimizations to reduce the number of conflicts (e.g. it removes conflicts
> in provably unused virtual tables or typeinfo structures), and there are
> other optimizations I plan to implement in the future, see below.

 Great.

>
> b) the number of conflict lines in LD_DEBUG_PRELINKING output is not equal
> to the number of conflicts, so it really does not matter how many lines it
> prints for the same symbol. What matters is how many relocations against
> the symbols mentioned in LD_DEBUG_PRELINKING are in the libs.
>
> c) concerning the .data section, apart from .gnu.linkonce.d stuff (virtual
> tables, typeinfo, etc.) it could probably help if gcc split things into
> .data, .data.reloc and .data.local_reloc sections. The latter two would
> hold things which are RW only because they contain relocations, and the
> last one would be used if all the relocations are against static symbols
> only. This is worthwhile only if it happens often (this would need some
> statistics). I write .data.* because then no changes in the linker are
> necessary.

 What exactly does 'relocations against static symbols' mean? Symbols in the
same .o? If so, there are many such cases, e.g. in Qt3 moc-generated
sources (lots of const char* const txt[] etc.).

>
> Optimizations I want to try are:
> 1) if prelink ran all programs with LD_TRACE_PRELINKING during the collect
> pass too, some statistics could be gathered about which symbols often
> conflict, and prelink could reorder virtual tables/typeinfo structures
> (and eventually, if libs are linked with --emit-relocs, all other data
> too) so that data which will need runtime fixups comes together, and
> likewise for data which won't need runtime fixups
> 2) similarly, if conflict statistics are gathered, some virtual table
> methods (especially those with conflicting symbols which occur in several
> vtables) could be lazy bound (leave the first one with a normal reloc, the
> rest initially pointing to some stub which would copy the function pointer
> resp. descriptor from the first one), which would mean the slots of
> methods which are never run would not have to be written to
> 3) implement
> http://sources.redhat.com/ml/binutils/2001-07/msg00200.html
> under some special option.
> Jason is right that this cannot be used generally, because in C++ there
> is no way (apart from symbol versioning, which is not easy to do in C++) to
> specify what is really the exported API of a library and what is just a
> side effect of the implementation. But if
> a) ld supported the .gnu.linkonce.* sections marked SHF_MERGE specially
>    as described in the thread (i.e. such a thing is not only to be merged
>    across one shared object or app, but can also be deleted entirely
>    if one of its DT_NEEDED libs contains it). Perhaps ld would only support
>    this under some special option
> b) g++ emitted this if given some special switch, off by default
> c) some tightly coupled packages could use this switch
> then libs/apps compiled/linked with that switch would be smaller (because
> some redundant linkonce stuff could be killed), thus they could contain
> fewer relocations and, with prelinking, fewer conflicts.
> I've played with nm, readelf etc. a little bit, and the figure I get is
> that this would save ~100KB of memory footprint in the kmail application on
> IA-32 in the old (2.95/2.96-RH) ABI, not counting the savings in SHT_REL*
> sections and a far smaller .gnu.conflict section. I don't know yet how much
> it would save with gcc3. The requirements for this would be that if some
> C++ library
> against which something was compiled with this magic
> -flinkonce-accross-libs is changed, it needs to ensure it exports at least
> the same set of
> nm -D library | sed -n 's/^[^ ]* W \(.*\)$/\1/p' | LC_ALL=C sort
> symbols as it exported before, or anything which was compiled/linked with
> -flinkonce-accross-libs requires the newer package.
> My understanding is that at least most of the common KDE stuff is usually
> built together, in which case this core part could be built with
> -flinkonce-accross-libs. If some linkonce symbol goes away from some lib,
> it would mean all the stuff built with -flinkonce-accross-libs against such
> library needs to be updated too. Programs/libs outside of this core part
> (ie. not built with -flinkonce-accross-libs) don't need to care, for them
> the core part acts as a blackbox which provides the linkonce symbols they
> need.

 The nm line prints more than 1800 lines for libqt and 1300 for libkdeui
(gcc3 compiled). I'm not sure whether maintaining this would be too much
work. The set would also depend on whether -O2 is used (i.e. inlining or no
inlining). On the other hand, many of the symbols are virtual thunks,
out-of-line copies of virtual inline methods and similar things that simply
can't disappear from the libraries, so maybe if we switched to explicit
template instantiation... I think we'll consider this option, especially if
there's no better idea, and IMHO we should use explicit template
instantiation anyway.
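
 As a rough sketch of what explicit template instantiation could look like:
one library owns the instantiation and everybody else only declares it. The
file split is shown as comments inside a single block so that it compiles as
is. Note that the `extern template` form used here was standardized only in
C++11; treating it as available (e.g. as a compiler extension) is an
assumption, and call_bar plus the int return value are added purely for
illustration.

```cpp
// --- a.h: shared header ---------------------------------------------------
template <typename T>
class Foo {
public:
    virtual ~Foo() {}
    virtual int bar() { return 42; }
};

// Explicit instantiation declaration: users of the header are told that
// Foo<int> is instantiated elsewhere and should not emit their own copy of
// its members.
extern template class Foo<int>;

// --- a.cpp: the one library that owns the instantiation ---------------------
// Explicit instantiation definition: Foo<int> and its members are emitted
// here, and only here.
template class Foo<int>;

// --- user code in another library -------------------------------------------
int call_bar() {
    Foo<int> f;
    return f.bar();
}
```

With this arrangement the set of template-related symbols a library exports
is decided by the programmer rather than by what the compiler happened to
instantiate or inline.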

 But I think this linkonce-across-libs could be done always for some symbols, 
as some symbols simply cannot disappear from a library without breaking 
binary compatibility in the public API. E.g. if there's class A in a library, 
its vtable, typeinfo, thunks and out-of-line copies of virtual inline methods 
simply must be present in the library too (or am I wrong here?).

 Also, I think prelink could probably play the role here of a mysterious
stranger who comes and saves everybody :). While prelinking the libs, it
could do the reverse of what the dynamic loader does during symbol lookup.
E.g. if you have this:

a.h:
template< typename T > class Foo { public: virtual void bar() {} };

a.cpp:
#include "a.h"
template class Foo< int >;

b.cpp:
#include "a.h"
template class Foo< int >;

 a.cpp is used to create libta; b.cpp is used to create libtb, which is
linked with -lta. Among other things, both contain Foo< int >::bar(). Now,
when libta and libtb are loaded, all references to Foo< int >::bar() are made
to point to the one created in b.o in libtb, which causes a conflict. However,
both generated copies of Foo< int >::bar() should be the same, so if prelink
made all references point to the one in a.o in libta instead, this would
avoid the conflict without any functional change. Moreover, if I understand
things correctly, if libta were replaced by a new version, the prelinking
would become invalid, and therefore symbols possibly disappearing from libta
shouldn't break anything.
 The problem here would be if Foo< int >::bar() were intentionally
different in libtb, e.g. in order to fix a broken one in libta. I don't
think such a thing is done with templates anywhere in KDE, but it might
happen (I think I'm starting to hate this 'but it might happen'). In this
case the version in libtb would have to be marked by the programmer somehow,
so that gcc would generate info telling prelink to use the version in libtb
for this symbol. Does this make sense, or would there be a problem with
this? I think this variant of the linkonce-across-libs behaviour would be
simpler to use.
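
 The libta/libtb case can be modelled in a few lines. This is only a toy
model of "first definition in the search order wins", not how ld.so or
prelink are implemented; the Lib struct, the scope vectors and the function
names are all made up for illustration. A conflict here means that the
binding computed for a library on its own differs from the binding seen in
the application's search order.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// A library reduced to its name and the symbols it defines (toy model).
struct Lib {
    std::string name;
    std::vector<std::string> defines;
};

// Return the name of the first library in the search order that defines
// the symbol, or an empty string if none does.
std::string resolve(const std::vector<Lib>& scope, const std::string& sym) {
    for (std::size_t i = 0; i < scope.size(); ++i)
        for (std::size_t j = 0; j < scope[i].defines.size(); ++j)
            if (scope[i].defines[j] == sym) return scope[i].name;
    return "";
}

// A symbol conflicts when the library's standalone binding differs from
// the binding in the application's search order.
bool conflicts(const std::vector<Lib>& lib_scope,
               const std::vector<Lib>& app_scope,
               const std::string& sym) {
    return resolve(lib_scope, sym) != resolve(app_scope, sym);
}

// Helper to build a library that defines a single symbol.
Lib make_lib(const std::string& name, const std::string& sym) {
    Lib lib;
    lib.name = name;
    lib.defines.push_back(sym);
    return lib;
}
```

 In this model, with the order libtb, libta the symbol binds to libtb and
libta's own binding conflicts; if the copy in libta were preferred, the
conflict would disappear, which is the effect proposed above.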

 Could somebody also please comment on reducing the number of multiple
instances of things like vtables or thunks in gcc? Improving this could
greatly reduce the number of conflicts.

 One more thing: does prelink also help with dlopen-ing? It's used
extensively in KDE (KParts, Control Center modules, and sometimes just in
order to reduce the high per-process memory overhead). I tried it; it seemed
to be a bit faster, but it didn't seem to make much difference. I also don't
know whether dlopen-ing can cause so much unshared memory because of
conflicts.

 Thanks

 Lubos Lunak
-- 
 l.lunak@email.cz ; l.lunak@kde.org
 http://dforce.sh.cvut.cz/~seli
