public inbox for gcc@gcc.gnu.org
* [Slightly OT] Linking speed
@ 2001-05-09  6:59 biswapesh.chattopadhyay
  2001-05-09  7:28 ` Paolo Carlini
  2001-05-09  7:36 ` Andreas Jaeger
  0 siblings, 2 replies; 54+ messages in thread
From: biswapesh.chattopadhyay @ 2001-05-09  6:59 UTC (permalink / raw)
  To: gcc

Hi

Just wondering if this would be useful for GCC developers. This was written
by Waldo Bastian, one of the key KDE developers, and contains some interesting
statistics about the GNU linker, etc.

http://www.suse.de/~bastian/Export/linking.txt

Thanks.
Biswa.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Slightly OT] Linking speed
  2001-05-09  6:59 [Slightly OT] Linking speed biswapesh.chattopadhyay
@ 2001-05-09  7:28 ` Paolo Carlini
  2001-05-09  7:36 ` Andreas Jaeger
  1 sibling, 0 replies; 54+ messages in thread
From: Paolo Carlini @ 2001-05-09  7:28 UTC (permalink / raw)
  To: biswapesh.chattopadhyay, gcc

Hi,

yes, perhaps slightly OT, but definitely very interesting (even for non-GCC
developers, like me!)
Thanks!

Why don't you CC your message to the binutils (i.e., ld) discussion lists?

    http://sources.redhat.com/binutils/


P.



* Re: [Slightly OT] Linking speed
  2001-05-09  6:59 [Slightly OT] Linking speed biswapesh.chattopadhyay
  2001-05-09  7:28 ` Paolo Carlini
@ 2001-05-09  7:36 ` Andreas Jaeger
  2001-05-09  7:47   ` Paolo Carlini
  2001-05-09  9:20   ` [I don't think it's off-topic at all] Linking speed for C++ Joe Buck
  1 sibling, 2 replies; 54+ messages in thread
From: Andreas Jaeger @ 2001-05-09  7:36 UTC (permalink / raw)
  To: biswapesh.chattopadhyay; +Cc: gcc, bastian

biswapesh.chattopadhyay@bt.com writes:

> Hi
> 
> Just wondering if this would be useful for GCC developers. This is written
> by Waldo Bastian, one of the key KDE developers and contains some interesting
> statistics about the GNU linker, etc.
> 
> http://www.suse.de/~bastian/Export/linking.txt

It has been discussed already on the libc-alpha list, check the
archives via http://sources.redhat.com/glibc

Andreas
-- 
 Andreas Jaeger
  SuSE Labs aj@suse.de
   private aj@arthur.inka.de
    http://www.suse.de/~aj


* Re: [Slightly OT] Linking speed
  2001-05-09  7:36 ` Andreas Jaeger
@ 2001-05-09  7:47   ` Paolo Carlini
  2001-05-09  9:20   ` [I don't think it's off-topic at all] Linking speed for C++ Joe Buck
  1 sibling, 0 replies; 54+ messages in thread
From: Paolo Carlini @ 2001-05-09  7:47 UTC (permalink / raw)
  To: Andreas Jaeger, gcc

Hi,

Andreas Jaeger wrote:

> It has been discussed already on the libc-alpha list, check the
> archives via http://sources.redhat.com/glibc

Thanks!
For the benefit of interested people it is:

    http://sources.redhat.com/ml/libc-alpha/2001-05/msg00025.html

P.



* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-09  7:36 ` Andreas Jaeger
  2001-05-09  7:47   ` Paolo Carlini
@ 2001-05-09  9:20   ` Joe Buck
  2001-05-09 11:09     ` Richard Henderson
  1 sibling, 1 reply; 54+ messages in thread
From: Joe Buck @ 2001-05-09  9:20 UTC (permalink / raw)
  To: Andreas Jaeger; +Cc: biswapesh.chattopadhyay, gcc, bastian

> > Just wondering if this would be useful for GCC developers. This is written
> > by Waldo Bastian, one of the key KDE developers and contains some interesting
> > statistics about the GNU linker, etc.
> > 
> > http://www.suse.de/~bastian/Export/linking.txt
> 
> It has been discussed already on the libc-alpha list, check the
> archives via http://sources.redhat.com/glibc

To be more direct, go to

http://sources.redhat.com/ml/libc-alpha/2001-05/threads.html

and read the thread entitled "Prelinking of shared libraries"

I don't think that this is off-topic at all.  The need for lots of
relocations in C++ code is a consequence of gcc's design (actually,
gcc is just doing what other C++ compilers do, but still ...), so if it isn't
on topic here, what is?  Working around the problem is interesting, but
can we attack the problem more directly?

I've been thinking about this and I think it might be worth experimenting
with a modified vtable implementation.  I AM NOT PROPOSING TO MODIFY THE ABI.
Rather, users could designate that the modified implementation be used
in specific cases.

The idea would be that an attribute, say pic_vtable, could be added to a
base class.  It would specify that the vtable for this class and all
derived classes would use the modified vtable implementation.  It would
be an error to attempt multiple inheritance from classes with different
values of this attribute.

If pic_vtable is specified, the function pointers in the virtual function
table would be replaced by (pointer - vtable_address).  This means that
they would now be PIC.  The penalty is that a virtual function call now
has to do an extra arithmetic operation:

before: assume register 0 has a pointer to the object
	register1 <- *(register0)	   ; get vtable address
	register2 <- *(offset + register1) ; get function address 
	call *register2

after:
	register1 <- *(register0)		
	register2 <- *(offset + register1) + register1
	call *register2

On the x86 it might be possible to do this without an extra instruction,
since the add can take a memory operand, but I haven't tried it.
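
As a rough illustration (not part of any real GCC feature), the offset
scheme can be modeled by hand in C++.  The names Object, g_table,
build_table and call_virtual are made up for this sketch.  The table is
filled in at run time only because portable C++ cannot express
(function address - table address) as a link-time constant; the proposal
assumes the toolchain would emit those constants.  The pointer/integer
casts assume a mainstream platform where code addresses fit in
std::uintptr_t.

```cpp
#include <cassert>
#include <cstdint>

struct Object;
using Method = int (*)(Object*);

// Hand-rolled stand-in for an object with a "pic_vtable"-style table.
struct Object {
    const std::uintptr_t* vtable;  // slots hold (function address - table address)
    int value;
};

int get_value(Object* self)  { return self->value; }
int get_double(Object* self) { return 2 * self->value; }

// Hypothetical table; filled at run time here, link-time constants in the proposal.
std::uintptr_t g_table[2];

void build_table() {
    const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(&g_table[0]);
    // Unsigned arithmetic wraps, so the later addition restores the address exactly.
    g_table[0] = reinterpret_cast<std::uintptr_t>(&get_value) - base;
    g_table[1] = reinterpret_cast<std::uintptr_t>(&get_double) - base;
}

int call_virtual(Object* obj, int slot) {
    assert(slot >= 0 && slot < 2);
    // register1 <- *(register0)
    const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(obj->vtable);
    // register2 <- *(offset + register1) + register1
    const std::uintptr_t target = obj->vtable[slot] + base;
    // call *register2
    Method fn = reinterpret_cast<Method>(target);
    return fn(obj);
}
```

After one call to build_table(), call_virtual(&obj, 1) on an Object whose
vtable member points at g_table dispatches to get_double.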

Thoughts?  Flames?

I haven't dealt with exception handling here; that is also a source of
non-PIC code.






* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-09  9:20   ` [I don't think it's off-topic at all] Linking speed for C++ Joe Buck
@ 2001-05-09 11:09     ` Richard Henderson
  2001-05-09 11:39       ` Joe Buck
  0 siblings, 1 reply; 54+ messages in thread
From: Richard Henderson @ 2001-05-09 11:09 UTC (permalink / raw)
  To: Joe Buck; +Cc: Andreas Jaeger, biswapesh.chattopadhyay, gcc, bastian

On Wed, May 09, 2001 at 09:19:59AM -0700, Joe Buck wrote:
> If pic_vtable is specified, the function pointers in the virtual function
> table would be replaced by (pointer - vtable_address).  This means that
> they would now be PIC.

No it doesn't.  If pointer references a globally visible symbol,
then the vtable is _still_ subject to dynamic relocation.

What you need is a pc-relative relocation to a PLT entry.  Few
targets support this kind of relocation.  Though x86 and Sparc
do, so it's not entirely without merit.

> 	register1 <- *(register0)		
> 	register2 <- *(offset + register1) + register1
> 	call *register2

You'd wind up with

	r1 = *r0
	r2 = r1 + offset
	r3 = *r2 + r2
	call *r3

with the relocations that are available on x86 and sparc.



r~


* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-09 11:09     ` Richard Henderson
@ 2001-05-09 11:39       ` Joe Buck
  2001-05-09 11:45         ` Richard Henderson
  0 siblings, 1 reply; 54+ messages in thread
From: Joe Buck @ 2001-05-09 11:39 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Joe Buck, Andreas Jaeger, biswapesh.chattopadhyay, gcc, bastian

> On Wed, May 09, 2001 at 09:19:59AM -0700, Joe Buck wrote:
> > If pic_vtable is specified, the function pointers in the virtual function
> > table would be replaced by (pointer - vtable_address).  This means that
> > they would now be PIC.
> 
> No it doesn't.  If pointer references a globally visible symbol,
> then the vtable is _still_ subject to dynamic relocation.

You are far more expert on these matters than I am.  Still:

Clearly, if the pointer points to a global symbol that is defined
externally, we still need a relocation.  But it will be common to refer
to a function whose definition comes from the same object file, or the
same .so (meaning that relocations can be eliminated when building the
shared library).  In this case, if we put the vtable in the text section,
pointer - vtable_address is simply a constant, on any platform.  This
would be true for some vtable entries but not others (in some cases, a
separate .so would define a class that is derived from a class whose
implementation is in another .so).

So it seems to me that at least some (though not all) relocations can be
made to disappear on all platforms.

> What you need is a pc-relative relocation to a PLT entry.  Few
> targets support this kind of relocation.  Though x86 and Sparc
> do, so it's not entirely without merit.
> 
> > 	register1 <- *(register0)		
> > 	register2 <- *(offset + register1) + register1
> > 	call *register2
> 
> You'd wind up with
> 
> 	r1 = *r0
> 	r2 = r1 + offset
> 	r3 = *r2 + r2
> 	call *r3
> 
> with the relocations that are available on x86 and sparc.

This would require an extra instruction on the sparc but not the x86
(though there could still be a penalty on the x86 because use of the
ALU affects ILP).


* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-09 11:39       ` Joe Buck
@ 2001-05-09 11:45         ` Richard Henderson
  2001-05-09 12:34           ` Waldo Bastian
  2001-05-09 13:54           ` Joe Buck
  0 siblings, 2 replies; 54+ messages in thread
From: Richard Henderson @ 2001-05-09 11:45 UTC (permalink / raw)
  To: Joe Buck; +Cc: Andreas Jaeger, biswapesh.chattopadhyay, gcc, bastian

On Wed, May 09, 2001 at 11:39:04AM -0700, Joe Buck wrote:
> Clearly, if the pointer points to a global symbol that is defined
> externally, we still need a relocation.  But it will be common to refer
> to a function whose definition comes from the same object file, or the
> same .so (meaning that relocations can be eliminated when building the
> shared library).

No, you misunderstand.  If the symbol is _visible_ externally, then a
dynamic relocation is required.  Even if the actual definition comes
from the current dso.  This is just the way ELF dynamic linking works.

> So it seems to me that at least some (though not all) relocations can be
> made to disappear on all platforms.

If you can't get rid of them all, then you can't make them read-only.

> This would require an extra instruction on the sparc but not the x86
> (though there could still be a penalty on the x86 because use of the
> ALU affects ILP).

Yes.  Oh well.  Whaddaya want, something for nothing?  ;-)


r~


* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-09 11:45         ` Richard Henderson
@ 2001-05-09 12:34           ` Waldo Bastian
  2001-05-09 12:48             ` Richard Henderson
  2001-05-09 13:54           ` Joe Buck
  1 sibling, 1 reply; 54+ messages in thread
From: Waldo Bastian @ 2001-05-09 12:34 UTC (permalink / raw)
  To: Richard Henderson, Joe Buck
  Cc: Andreas Jaeger, biswapesh.chattopadhyay, gcc, bastian

On Wednesday 09 May 2001 11:45, Richard Henderson wrote:
> On Wed, May 09, 2001 at 11:39:04AM -0700, Joe Buck wrote:
> > Clearly, if the pointer points to a global symbol that is defined
> > externally, we still need a relocation.  But it will be common to
> > refer to a function whose definition comes from the same object file, or
> > the same .so (meaning that relocations can be eliminated when building
> > the shared library).
>
> No, you misunderstand.  If the symbol is _visible_ externally, then a
> dynamic relocation is required.  Even if the actual definition comes
> from the current dso.  This is just the way ELF dynamic linking works.

What might help, but correct me if I have missed something, is if virtual
function calls went through the PLT, just like normal function calls.
That would reduce the number of relocations needed (because a function that
is present in two vtables currently requires two relocations, and would then
require only one), and it would make it possible to use lazy binding for them.

Someone mentioned that using "-fvtable-thunks=3" (in gcc 2.95.3?) would 
actually do something like that but I haven't been able to verify that.

Cheers,
Waldo
-- 
bastian@kde.org | SuSE Labs KDE Developer | bastian@suse.com


* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-09 12:34           ` Waldo Bastian
@ 2001-05-09 12:48             ` Richard Henderson
  2001-05-09 14:06               ` Joe Buck
  0 siblings, 1 reply; 54+ messages in thread
From: Richard Henderson @ 2001-05-09 12:48 UTC (permalink / raw)
  To: Waldo Bastian
  Cc: Joe Buck, Andreas Jaeger, biswapesh.chattopadhyay, gcc, bastian

On Wed, May 09, 2001 at 12:29:53PM -0700, Waldo Bastian wrote:
> What might help, but correct me if I have missed something, is if virtual 
> function calls would go through the PLT, just like normal function calls. 
> That would reduce the number of relocations needed (cause if one function is 
> present in two vtables it now requires two relocations, and then only one), 
> and it would make it possible to use lazy binding for them.

Not quite.  It would help, since the vtable would then use a
RELATIVE relocation, which does not require a symbol lookup.

The problem on x86 is that the PIC PLT entry requires that 
%ebx already be set up properly _for that dso_.  Which means
that you can't use the PLT entry for a different dso, which is
what you'd be doing if you put the address of the PLT entry
in the vtable.

It would be possible to generate stub functions along with
the vtable instead.  I've not got a clear idea what that 
would do to program size and performance.

> Someone mentioned that using "-fvtable-thunks=3" (in gcc 2.95.3?) would 
> actually do something like that but I haven't been able to verify that.

Incorrect.


r~


* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-09 11:45         ` Richard Henderson
  2001-05-09 12:34           ` Waldo Bastian
@ 2001-05-09 13:54           ` Joe Buck
  2001-05-09 14:21             ` Geoff Keating
                               ` (3 more replies)
  1 sibling, 4 replies; 54+ messages in thread
From: Joe Buck @ 2001-05-09 13:54 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Joe Buck, Andreas Jaeger, biswapesh.chattopadhyay, gcc, bastian

On Wed, May 09, 2001 at 11:39:04AM -0700, Joe Buck wrote:
> > Clearly, if the pointer points to a global symbol that is defined
> > externally, we still need a relocation.  But it will be common to refer
> > to a function whose definition comes from the same object file, or the
> > same .so (meaning that relocations can be eliminated when building the
> > shared library).
> 
> No, you misunderstand.  If the symbol is _visible_ externally, then a
> dynamic relocation is required.  Even if the actual definition comes
> from the current dso.  This is just the way ELF dynamic linking works.

It seems that this should be fixable by making the linker smarter.  After
all, for the case of an offset to the same dso, what we have is a
constant.  What will this required dynamic relocation do?  There is
nothing to relocate, nothing to compute!  When we do gcc -shared, we wind
up with an .so that has some relocation expressions in which the
relocations cancel.  Why can't they be constant-folded, so when this .so
is linked to at runtime, no relocation is left?

> > So it seems to me that at least some (though not all) relocations can be
> > made to disappear on all platforms.
> 
> If you can't get rid of them all, then you can't make them read-only.

True, but this started from a discussion that showed that startup time
was proportional to the number of relocations that must be performed
at startup time.  Startup speed increases if the number of needed
relocations decreases.  We don't have to go to zero to see a benefit.

> > This would require an extra instruction on the sparc but not the x86
> > (though there could still be a penalty on the x86 because use of the
> > ALU affects ILP).
> 
> Yes.  Oh well.  Whaddaya want, something for nothing?  ;-)

Well, in this case (for the x86) it seems we can come close, as the
beginning of the called function is going to have mostly moves, not ALU use,
for the first few instructions, so the extra addition might well be free
in most cases.




* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-09 12:48             ` Richard Henderson
@ 2001-05-09 14:06               ` Joe Buck
  0 siblings, 0 replies; 54+ messages in thread
From: Joe Buck @ 2001-05-09 14:06 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Waldo Bastian, Joe Buck, Andreas Jaeger, biswapesh.chattopadhyay,
	gcc, bastian

> The problem on x86 is that the PIC PLT entry requires that 
> %ebx already be set up properly _for that dso_.  Which means
> that you can't use the PLT entry for a different dso, which is
> what you'd be doing if you put the address of the PLT entry
> in the vtable.
> 
> It would be possible to generate stub functions along with
> the vtable instead.  I've not got a clear idea what that 
> would do to program size and performance.

This approach would mean that if we have class Base in one dso and class
Derived : public Base in another, for each virtual method func of Base
that is not overridden in class Derived, we'd add a stub Derived::func that
would belong to the same dso as Derived, and would do a direct PIC call to
Base::func.

If this were done, then it seems we can now have purely relocatable
vtables using the offset method: every offset is between a function
defined in the current dso and a vtable also defined in the current
dso.  The penalty is an extra call and an extra stub function, though
these stubs will be very small.

The size penalty of the stubs would be traded off against the size
improvement because the vtables can now be shared.  The number of
stubs needed will depend on shared library organization but could be
large if there are many dsos that use the same C++ classes.

Since we can do the extra addition in the virtual function call for
free in most cases on the x86, it seems that this could be a win for
the KDE folks.



* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-09 13:54           ` Joe Buck
@ 2001-05-09 14:21             ` Geoff Keating
  2001-05-21 16:53               ` Marc Espie
  2001-05-09 14:26             ` Jakub Jelinek
                               ` (2 subsequent siblings)
  3 siblings, 1 reply; 54+ messages in thread
From: Geoff Keating @ 2001-05-09 14:21 UTC (permalink / raw)
  To: Joe Buck; +Cc: gcc

Joe Buck <jbuck@synopsys.COM> writes:

> On Wed, May 09, 2001 at 11:39:04AM -0700, Joe Buck wrote:
> > > Clearly, if the pointer points to a global symbol that is defined
> > > externally, we still need a relocation.  But it will be common to refer
> > > to a function whose definition comes from the same object file, or the
> > > same .so (meaning that relocations can be eliminated when building the
> > > shared library).
> > 
> > No, you misunderstand.  If the symbol is _visible_ externally, then a
> > dynamic relocation is required.  Even if the actual definition comes
> > from the current dso.  This is just the way ELF dynamic linking works.
> 
> It seems that this should be fixable by making the linker smarter.  After
> all, for the case of an offset to the same dso, what we have is a
> constant.  What will this required dynamic relocation do?

There's no way to know, for an externally visible symbol, at link
time, which dso it is in, even if there is a definition in the same
dso as it is used, because it could be overridden elsewhere.

-- 
- Geoffrey Keating <geoffk@geoffk.org>


* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-09 13:54           ` Joe Buck
  2001-05-09 14:21             ` Geoff Keating
@ 2001-05-09 14:26             ` Jakub Jelinek
  2001-05-09 14:38             ` Jeff Sturm
  2001-05-09 15:49             ` Richard Henderson
  3 siblings, 0 replies; 54+ messages in thread
From: Jakub Jelinek @ 2001-05-09 14:26 UTC (permalink / raw)
  To: Joe Buck
  Cc: Richard Henderson, Andreas Jaeger, biswapesh.chattopadhyay, gcc, bastian

On Wed, May 09, 2001 at 01:53:01PM -0700, Joe Buck wrote:
> It seems that this should be fixable by making the linker smarter.  After
> all, for the case of an offset to the same dso, what we have is a
> constant.  What will this required dynamic relocation do?  There is
> nothing to relocate, nothing to compute!  When we do gcc -shared, we wind
> up with an .so that has some relocation expressions in which the
> relocations cancel.  Why can't they be constant-folded, so when this .so
> is linked to at runtime, no relocation is left?

They are cancelled if you ask the linker to do it (e.g. through symbol
versioning and making some symbols local). If you don't ask for it this way,
the linker cannot do it for you (because then it is e.g. possible to
override that symbol in some other DSO which will come earlier in the search
list).

	Jakub


* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-09 13:54           ` Joe Buck
  2001-05-09 14:21             ` Geoff Keating
  2001-05-09 14:26             ` Jakub Jelinek
@ 2001-05-09 14:38             ` Jeff Sturm
  2001-05-09 15:04               ` Geoff Keating
  2001-05-09 15:49             ` Richard Henderson
  3 siblings, 1 reply; 54+ messages in thread
From: Jeff Sturm @ 2001-05-09 14:38 UTC (permalink / raw)
  To: Joe Buck; +Cc: gcc

On Wed, 9 May 2001, Joe Buck wrote:
> It seems that this should be fixable by making the linker smarter.  After
> all, for the case of an offset to the same dso, what we have is a
> constant.  What will this required dynamic relocation do?  There is
> nothing to relocate, nothing to compute!  When we do gcc -shared, we wind
> up with an .so that has some relocation expressions in which the
> relocations cancel.  Why can't they be constant-folded, so when this .so
> is linked to at runtime, no relocation is left?

As I understand it (and I am not an expert), the ELF dynamic linker
searches the list of dso's to resolve a global symbol.  So there is no
constant offset known at link-time (unless you link with -Bsymbolic, and
I'm not quite sure just what that does).  This is the way ELF works.  Any
global symbol can be overridden in another shared object, to provide for
backlinking.

There was a recent thread on the gcj list on this very topic.  The gcj
runtime, libgcj.so, currently has about 75,000 relocations on GNU/Linux.
As the java language is not C++, there may be some alternatives for
gcj such as preparing vtables at runtime.

For C++ I'm not sure what you can do, especially in the constraints of an
ABI.  Perhaps the vtable entries could initially point to a private
function that would lazily find the appropriate method and write the
address to the vtable slot?  I have no idea if that would be of any
benefit.
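
To make the idea concrete, here is a toy model of such a lazily patched
slot, using a plain table of function pointers rather than a real vtable.
All names (g_slots, lazy_square, real_square, call_slot) are invented for
the sketch, and the "lookup" is hard-wired where a real implementation
would have to search for the method.

```cpp
#include <cassert>

using Slot = int (*)(int);

int real_square(int x) { return x * x; }

int lazy_square(int x);             // resolver, defined below

// Hypothetical one-entry table; the slot starts out pointing at the resolver.
Slot g_slots[1] = { lazy_square };
int g_resolve_count = 0;            // how many times the resolver has run

int lazy_square(int x) {
    ++g_resolve_count;
    g_slots[0] = real_square;       // patch the slot; this is a write to the table
    return real_square(x);          // then finish the call that triggered the lookup
}

int call_slot(int index, int x) {
    assert(index == 0);
    return g_slots[index](x);
}
```

Only the first call_slot(0, ...) goes through lazy_square; later calls
reach real_square directly.  The value stored in the slot changes between
the first and second call, and the table has to live in writable memory.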

Jeff


* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-09 14:38             ` Jeff Sturm
@ 2001-05-09 15:04               ` Geoff Keating
  2001-05-09 17:05                 ` Joe Buck
  0 siblings, 1 reply; 54+ messages in thread
From: Geoff Keating @ 2001-05-09 15:04 UTC (permalink / raw)
  To: Jeff Sturm; +Cc: gcc

Jeff Sturm <jsturm@one-point.com> writes:

> For C++ I'm not sure what you can do, especially in the constraints of an
> ABI.  Perhaps the vtable entries could initially point to a private
> function that would lazily find the appropriate method and write the
> address to the vtable slot?  I have no idea if that would be of any
> benefit.

That's likely to cause problems with comparing pointers to methods.

-- 
- Geoffrey Keating <geoffk@geoffk.org>


* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-09 13:54           ` Joe Buck
                               ` (2 preceding siblings ...)
  2001-05-09 14:38             ` Jeff Sturm
@ 2001-05-09 15:49             ` Richard Henderson
  2001-05-09 17:01               ` Joe Buck
  3 siblings, 1 reply; 54+ messages in thread
From: Richard Henderson @ 2001-05-09 15:49 UTC (permalink / raw)
  To: Joe Buck; +Cc: Andreas Jaeger, biswapesh.chattopadhyay, gcc, bastian

On Wed, May 09, 2001 at 01:53:01PM -0700, Joe Buck wrote:
> It seems that this should be fixable by making the linker smarter.

No.

> What will this required dynamic relocation do?

Allow the symbol to be overridden from another DSO earlier in
the symbol resolution search path.


r~


* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-09 15:49             ` Richard Henderson
@ 2001-05-09 17:01               ` Joe Buck
  2001-05-10 10:14                 ` Richard Henderson
  0 siblings, 1 reply; 54+ messages in thread
From: Joe Buck @ 2001-05-09 17:01 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Joe Buck, Andreas Jaeger, biswapesh.chattopadhyay, gcc, bastian

> > What will this required dynamic relocation do?
> 
> Allow the symbol to be overridden from another DSO earlier in
> the symbol resolution search path.

OK, OK (and thanks to the other 5-6 people who told me this as well).
This feature, unfortunately, is quite expensive.

It seems the only way out that preserves the ability to override symbols,
but that gets rid of the relocations, is to have stubs for all calls.
That is, the function pointer can be replaced by

(stub_address - vtable_address) 

and the stub can, in turn, be a normal PIC call (or, depending on the
details of PIC on the particular processor, just a jump).  This would seem
to get rid of all the relocations for the vtable, but at a substantial
cost.  Some alternate redesign of the vtable would probably be better.

(Yes, this kind of thing will break the standard ABI).

Or, of course, just break the ability to override symbols for virtual
functions. :-)  The user waiting for his desktop to start up may not
care.




* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-09 15:04               ` Geoff Keating
@ 2001-05-09 17:05                 ` Joe Buck
  0 siblings, 0 replies; 54+ messages in thread
From: Joe Buck @ 2001-05-09 17:05 UTC (permalink / raw)
  To: Geoff Keating; +Cc: Jeff Sturm, gcc

> Jeff Sturm <jsturm@one-point.com> writes:
> 
> > For C++ I'm not sure what you can do, especially in the constraints of an
> > ABI.  Perhaps the vtable entries could initially point to a private
> > function that would lazily find the appropriate method and write the
> > address to the vtable slot?  I have no idea if that would be of any
> > benefit.

Geoff Keating writes:
> That's likely to cause problems with comparing pointers to methods.

Besides, if the pages get modified on the fly they can't be shared.
The KDE user is going to have maybe 20 processes all using -lqt and the
basic KDE libraries, with lots of vtables in them, probably hundreds
of K worth.  Currently the runtime relocations prevent them from being
shared, but if they can be shared, that recovers a substantial amount
of memory (which then could be used up again with stubs or something
else).



* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-09 17:01               ` Joe Buck
@ 2001-05-10 10:14                 ` Richard Henderson
  2001-05-10 10:45                   ` Tom Tromey
  2001-05-10 11:17                   ` Joe Buck
  0 siblings, 2 replies; 54+ messages in thread
From: Richard Henderson @ 2001-05-10 10:14 UTC (permalink / raw)
  To: Joe Buck; +Cc: Andreas Jaeger, biswapesh.chattopadhyay, gcc, bastian

On Wed, May 09, 2001 at 05:01:18PM -0700, Joe Buck wrote:
> Or, of course, just break the ability to override symbols for virtual
> functions. :-)  The user waiting for his desktop to start up may not
> care.

Recently some symbol visibility flags were added to ELF.  It
would be possible to change the compiler such that

  class __attribute__((visibility(protected))) Foo
  {
  ...
  };

marked each member symbol of Foo with STV_PROTECTED, which
prevents this symbol from being overridden by another DSO.

Judicious use of this feature might reduce the number of
symbol relocations (vs relative relocations) substantially.
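
For anyone wanting to try this: GCC's visibility attribute, as documented
in later releases, takes the visibility as a string literal.  The
following is a minimal sketch under that assumption (GCC or Clang
targeting ELF).  The member answer() and the helpers call_answer() and
check_foo() are invented for illustration.  Whether the dynamic
relocation count actually drops depends on the toolchain and would have
to be checked, e.g. with readelf -r on the resulting DSO.

```cpp
#include <cassert>

// Assumption: the compiler accepts the string-literal spelling of the attribute.
// Symbols of Foo are marked protected, so another DSO cannot preempt them.
class __attribute__((visibility("protected"))) Foo {
public:
    virtual ~Foo() {}
    virtual int answer() const;
};

// Out-of-line definition, so the vtable is emitted in this translation unit.
int Foo::answer() const { return 42; }

// Virtual call through a reference, as client code would make it.
int call_answer(const Foo& f) { return f.answer(); }

// Hypothetical self-check showing intended use.
void check_foo() {
    Foo f;
    assert(call_answer(f) == 42);
}
```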



r~


* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-10 10:14                 ` Richard Henderson
@ 2001-05-10 10:45                   ` Tom Tromey
  2001-05-10 10:49                     ` Richard Henderson
  2001-05-10 11:17                   ` Joe Buck
  1 sibling, 1 reply; 54+ messages in thread
From: Tom Tromey @ 2001-05-10 10:45 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Joe Buck, Andreas Jaeger, biswapesh.chattopadhyay, gcc, bastian

>>>>> "Richard" == Richard Henderson <rth@redhat.com> writes:

Richard> Recently some symbol visibility flags were added to ELF.  It
Richard> would be possible to change the compiler such that
Richard>   class __attribute__((visibility(protected))) Foo
Richard>   {
Richard>   ...
Richard>   };

When would we not want to do this?

In Java there are no attributes.  I'm wondering if it would make sense
to just have the compiler always do this.  I don't see a big value in
being able to override methods and the like from another shared
library.

Tom


* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-10 10:45                   ` Tom Tromey
@ 2001-05-10 10:49                     ` Richard Henderson
  0 siblings, 0 replies; 54+ messages in thread
From: Richard Henderson @ 2001-05-10 10:49 UTC (permalink / raw)
  To: Tom Tromey
  Cc: Joe Buck, Andreas Jaeger, biswapesh.chattopadhyay, gcc, bastian

On Thu, May 10, 2001 at 11:58:45AM -0600, Tom Tromey wrote:
> When would we not want to do this?

When bits wind up in header files, such that bits get compiled into
multiple object files in different dsos.  You may rely on one or the
other definition actually being used, but not both.  This is particularly
important for data members.

I don't know, but suspect, that it's easy to get in trouble here if
you aren't careful.  Particularly with templates.


r~


* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-10 10:14                 ` Richard Henderson
  2001-05-10 10:45                   ` Tom Tromey
@ 2001-05-10 11:17                   ` Joe Buck
  2001-05-10 16:23                     ` Richard Henderson
  1 sibling, 1 reply; 54+ messages in thread
From: Joe Buck @ 2001-05-10 11:17 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Joe Buck, Andreas Jaeger, biswapesh.chattopadhyay, gcc, bastian

I wrote:
> > Or, of course, just break the ability to override symbols for virtual
> > functions. :-)  The user waiting for his desktop to start up may not
> > care.

Richard Henderson writes:
> Recently some symbol visibility flags were added to ELF.  It
> would be possible to change the compiler such that
> 
>   class __attribute__((visibility(protected))) Foo
>   {
>   ...
>   };
> 
> marked each member symbol of Foo with STV_PROTECTED, which
> prevents this symbol from being overridden by another DSO.

OK, I'll flesh out my original proposal.  We add two attributes,
which I will temporarily call vtable_pic and vtable_stubs until
I think of better names.

The vtable_pic attribute may be applied to a base class (that is, a class
that is not derived from another class).  If present, the vtable is
generated in offset form (function address minus vtable address);
furthermore, the above mentioned attribute, visibility(protected), is
applied to all virtual member functions of the current class and any
derived class.  Multiple inheritance from a vtable_pic class is only
allowed if all base classes that have virtual functions define vtable_pic,
so the compiler always knows how to generate virtual function calls.

The vtable_stubs attribute may be applied to a derived class,
provided that some base class has the vtable_pic attribute.  This
attribute is intended for use when the definitions of the member
functions of the derived class are in a different dso than the
functions of the base class.  What it does is to create a small
stub function for each member function of the base class that is
not overridden in the derived class.  The effect is as if we had
virtual function definitions of the form

// sloppy pseudocode
returntype Derived::method(arglist)
{
	return Base::method(arglist);
}

for all non-overridden functions.  Further-derived classes would have
their vtable entries pointing to the stub function, not the original.
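
The effect of the stubs can be approximated by hand in ordinary C++.
What follows is only a sketch under assumptions: Base, Derived and the
helper functions are hypothetical names, and the forwarding override is
written out manually where the proposed attribute would have the
compiler generate it.

```cpp
#include <cassert>

// Hypothetical classes for illustration; not taken from Qt or KDE.
struct Base {                        // imagine this is defined in DSO 1
    virtual ~Base() {}
    virtual int method(int x) { return x + 1; }
    virtual int other(int x) { return x * 2; }
};

struct Derived : Base {              // imagine this is defined in DSO 2
    // Genuinely overridden in the derived class.
    int other(int x) override { return x * 3; }
    // Hand-written equivalent of the stub: the function is not really
    // overridden, it only forwards to the base class version, so the
    // vtable slot refers to a function in the same DSO as the vtable.
    int method(int x) override { return Base::method(x); }
};

// Virtual calls through a base reference behave exactly as before.
int call_method(Base& b, int x) { return b.method(x); }
int call_other(Base& b, int x) { return b.other(x); }

// Small self-check of the observable behaviour.
void check_stub_behaviour() {
    Derived d;
    assert(call_method(d, 1) == 2);   // forwarded to Base::method
    assert(call_other(d, 2) == 6);    // Derived::other
}
```

The observable behaviour is unchanged; the only difference is which
function each vtable slot refers to.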

By applying these two attributes correctly, the library designer can wind
up with completely PIC virtual function tables, requiring no relocations
at all.  The vtables can be placed in the text section.

The price is an extra addition in a virtual function call
(which will be free in most cases on x86), plus extra stub functions
for classes defined in one dso that are derived from classes defined
in another dso.
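
The extra addition can be pictured with a hand-rolled table.  This is
only a sketch under assumptions: offset_table, fill_offset_table and
call_slot are made-up names, and the offsets are computed at run time
here purely to show the arithmetic, whereas the proposal would have the
compiler emit them as link-time constants that need no relocation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical illustration: a table that stores offsets
// (function address minus table address) instead of absolute addresses.
typedef int (*Method)(int);

static int twice(int x) { return 2 * x; }
static int negate(int x) { return -x; }

static std::uintptr_t offset_table[2];

// Fill the table with "function address minus table address".
// Unsigned arithmetic wraps, so base + (fn - base) always gives fn back.
void fill_offset_table() {
    const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(offset_table);
    offset_table[0] = reinterpret_cast<std::uintptr_t>(&twice) - base;
    offset_table[1] = reinterpret_cast<std::uintptr_t>(&negate) - base;
}

// The call site pays one extra addition: table address + stored offset.
int call_slot(std::size_t slot, int arg) {
    const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(offset_table);
    Method m = reinterpret_cast<Method>(base + offset_table[slot]);
    return m(arg);
}

// Small self-check of the round trip.
void check_offset_table() {
    fill_offset_table();
    assert(call_slot(0, 21) == 42);
    assert(call_slot(1, 5) == -5);
}
```

Because the stored values do not depend on where the library is loaded,
such a table could in principle live in a read-only, shareable section.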

Exception handling is also non-PIC; sorry, I don't know enough about
the current implementation to have any suggestions for reducing the
number of relocations.  Maybe Mike Stump has ideas.  In any case,
the KDE folks already use -fno-exceptions for this reason.



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-10 11:17                   ` Joe Buck
@ 2001-05-10 16:23                     ` Richard Henderson
  2001-05-10 16:42                       ` Joe Buck
  0 siblings, 1 reply; 54+ messages in thread
From: Richard Henderson @ 2001-05-10 16:23 UTC (permalink / raw)
  To: Joe Buck; +Cc: Andreas Jaeger, biswapesh.chattopadhyay, gcc, bastian

On Thu, May 10, 2001 at 11:16:39AM -0700, Joe Buck wrote:
> The vtable_pic attribute may be applied to a base class (that is, a class
> that is not derived from another class).  If present, the vtable is
> generated in offset form (function address minus vtable address);

Seems ok.

> The vtable_stubs attribute may be applied to a derived class,
> provided that some base class has the vtable_pic attribute.  This
> attribute is intended for use when the definitions of the member
> functions of the derived class are in a different dso than the
> functions of the base class.

I suspect that this will be hard for people to use in practice.

> Exception handling is also non-PIC; sorry, I don't know enough about
> the current implementation to have any suggestions for reducing the
> number of relocations.

This is being addressed in the new eh implementation.


r~

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-10 16:23                     ` Richard Henderson
@ 2001-05-10 16:42                       ` Joe Buck
  0 siblings, 0 replies; 54+ messages in thread
From: Joe Buck @ 2001-05-10 16:42 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Joe Buck, Andreas Jaeger, biswapesh.chattopadhyay, gcc, bastian

I wrote:
> > The vtable_pic attribute may be applied to a base class (that is, a class
> > that is not derived from another class).  If present, the vtable is
> > generated in offset form (function address minus vtable address);

Richard H. writes:
> Seems ok.
> 
> > The vtable_stubs attribute may be applied to a derived class,
> > provided that some base class has the vtable_pic attribute.  This
> > attribute is intended for use when the definitions of the member
> > functions of the derived class are in a different dso than the
> > functions of the base class.
> 
> I suspect that this will be hard for people to use in practice.

I don't think it's so bad.  Let's take an oversimplified example,
where we have -lqt (the QT library), -lkdecore (the core of KDE),
and -lkdeui (higher-level KDE stuff).  Classes in -lqt use vtable_pic.
For any class whose implementation is in kdecore that is derived
from a class in -lqt, use vtable_stubs.  Likewise for any class in
-lkdeui that is derived from a class in one of the other libraries.

An alternative would be to unify the two attributes with something like
vtable_dso("name"), where you give the name of the dso you want the class
to land in, as a string.  The only purpose of the argument is that if the
derived class's dso name differs from that of the base class, stubs are
generated.  So in the above example, we might write the attribute as
vtable_dso("qt") or vtable_dso("kdecore").  The attribute is inherited
from base classes unless overridden.

> > Exception handling is also non-PIC; sorry, I don't know enough about
> > the current implementation to have any suggestions for reducing the
> > number of relocations.
> 
> This is being addressed in the new eh implementation.

Great.


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-09 14:21             ` Geoff Keating
@ 2001-05-21 16:53               ` Marc Espie
  2001-05-21 17:23                 ` Richard Henderson
  0 siblings, 1 reply; 54+ messages in thread
From: Marc Espie @ 2001-05-21 16:53 UTC (permalink / raw)
  To: gcc

In article < jmbsp26wwu.fsf@geoffk.org > you write:
>There's no way to know, for an externally visible symbol, at link
>time, which dso it is in, even if there is a definition in the same
>dso as it is used, because it could be overridden elsewhere.

Even though this is the way ELF is supposed to do things, this looks to
me like an utterly stupid design decision.  What this does is ensure that
dynamic linking time is going to go waaay up when the size of libraries
increases.  Obviously, kde is already stumbling into that barrier, as it's
the first majorly large project that uses dynamic linking extensively.

Isn't there at least a simple way to tell ELF to stop being dumb and
just resolve the symbol here & now ?

I mean, ELF has all these magic thingies that provide more than enough rope
to hang oneself, and it would be missing such a useful practical feature ?

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-21 16:53               ` Marc Espie
@ 2001-05-21 17:23                 ` Richard Henderson
  2001-05-22  2:41                   ` Fergus Henderson
  0 siblings, 1 reply; 54+ messages in thread
From: Richard Henderson @ 2001-05-21 17:23 UTC (permalink / raw)
  To: Marc Espie; +Cc: gcc

On Tue, May 22, 2001 at 01:52:54AM +0200, Marc Espie wrote:
> Isn't there at least a simple way to tell ELF to stop being dumb and
> just resolve the symbol here & now ?

Three ways:

  -Bsymbolic

  symbol version scripts

  STV_HIDDEN/STV_PROTECTED (which, admittedly, would do with some
  better support in the compiler proper).


r~

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-21 17:23                 ` Richard Henderson
@ 2001-05-22  2:41                   ` Fergus Henderson
  2001-05-22 12:00                     ` Joe Buck
  0 siblings, 1 reply; 54+ messages in thread
From: Fergus Henderson @ 2001-05-22  2:41 UTC (permalink / raw)
  To: Richard Henderson, Marc Espie, gcc

On 21-May-2001, Richard Henderson <rth@redhat.com> wrote:
> On Tue, May 22, 2001 at 01:52:54AM +0200, Marc Espie wrote:
> > Isn't there at least a simple way to tell ELF to stop being dumb and
> > just resolve the symbol here & now ?
> 
> Three ways:
> 
>   -Bsymbolic
[...]

So is there some reason why the KDE shared libs can't be linked with
`-Bsymbolic'?  Would linking them with `-Bsymbolic' improve startup times?
Has anyone tried this?

-- 
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
                                    |  of excellence is a lethal habit"
WWW: < http://www.cs.mu.oz.au/~fjh >  |     -- the last words of T. S. Garp.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-22  2:41                   ` Fergus Henderson
@ 2001-05-22 12:00                     ` Joe Buck
  2001-05-22 13:12                       ` Richard Henderson
  0 siblings, 1 reply; 54+ messages in thread
From: Joe Buck @ 2001-05-22 12:00 UTC (permalink / raw)
  To: Fergus Henderson; +Cc: Richard Henderson, Marc Espie, gcc

> On 21-May-2001, Richard Henderson <rth@redhat.com> wrote:
> > On Tue, May 22, 2001 at 01:52:54AM +0200, Marc Espie wrote:
> > > Isn't there at least a simple way to tell ELF to stop being dumb and
> > > just resolve the symbol here & now ?
> > 
> > Three ways:
> > 
> >   -Bsymbolic
> [...]
> 
> So is there some reason why the KDE shared libs can't be linked with
> `-Bsymbolic'?  Would linking them with `-Bsymbolic' improve startup times?
> Has anyone tried this?

It won't suffice because a relocation will still be needed: the vtable
entries point to absolute addresses.  That's why I made a proposal for
an optional modified vtable format, which would incorporate the equivalent
of -Bsymbolic.  See

http://gcc.gnu.org/ml/gcc/2001-05/msg00419.html


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-22 12:00                     ` Joe Buck
@ 2001-05-22 13:12                       ` Richard Henderson
  2001-05-22 13:34                         ` Joe Buck
  0 siblings, 1 reply; 54+ messages in thread
From: Richard Henderson @ 2001-05-22 13:12 UTC (permalink / raw)
  To: Joe Buck; +Cc: Fergus Henderson, Marc Espie, gcc

On Tue, May 22, 2001 at 11:54:01AM -0700, Joe Buck wrote:
> > So is there some reason why the KDE shared libs can't be linked with
> > `-Bsymbolic'?  Would linking them with `-Bsymbolic' improve startup times?
> > Has anyone tried this?
> 
> It won't suffice because a relocation will still be needed: the vtable
> entries point to absolute addresses.

While it's true that it does not completely eliminate the
dynamic relocation, it does (or should) eliminate a symbol
lookup -- eg an R_386_32 relocation should be replaced by
an R_386_RELATIVE relocation which is considerably simpler
to process.
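
The difference in cost can be sketched with a toy model of what a
dynamic loader does for each relocation type.  This is an assumption-laden
simplification, not glibc's code: Reloc, SymbolTable and apply are
hypothetical names, and a std::map stands in for the real hashed symbol
search across all loaded objects.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Stand-ins for R_386_RELATIVE and R_386_32.
enum RelocKind { RELATIVE, SYMBOLIC };

struct Reloc {
    RelocKind kind;
    std::uint32_t addend;   // value already stored at the relocated word
    std::string symbol;     // only meaningful for SYMBOLIC
};

typedef std::map<std::string, std::uint32_t> SymbolTable;

// Computes the final word value and counts how many symbol lookups
// were needed along the way.
std::uint32_t apply(const Reloc& r, std::uint32_t load_base,
                    const SymbolTable& global_scope, int& lookups) {
    if (r.kind == RELATIVE)
        return load_base + r.addend;              // B + A: one addition
    ++lookups;                                    // name search in real life
    return global_scope.at(r.symbol) + r.addend;  // S + A
}

// Small self-check using addresses in the style of the thread's dumps.
void check_relocs() {
    SymbolTable scope;
    scope["init__1A"] = 0x400187e0u;
    int lookups = 0;
    Reloc rel = {RELATIVE, 0x7e0u, ""};
    assert(apply(rel, 0x40018000u, scope, lookups) == 0x400187e0u);
    assert(lookups == 0);
    Reloc sym = {SYMBOLIC, 0u, "init__1A"};
    assert(apply(sym, 0x40018000u, scope, lookups) == 0x400187e0u);
    assert(lookups == 1);
}
```

Both paths arrive at the same address; only the symbolic one has to
search for a name first.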

> That's why I made a proposal for an optional modified vtable
> format, which would incorporate the equivalent of -Bsymbolic.

Also worth pursuing in the 3.1 timeframe...



r~

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-22 13:12                       ` Richard Henderson
@ 2001-05-22 13:34                         ` Joe Buck
  2001-05-29 13:58                           ` Michael Matz
  0 siblings, 1 reply; 54+ messages in thread
From: Joe Buck @ 2001-05-22 13:34 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Joe Buck, Fergus Henderson, Marc Espie, gcc

> > > So is there some reason why the KDE shared libs can't be linked with
> > > `-Bsymbolic'?  Would linking them with `-Bsymbolic' improve startup times?
> > > Has anyone tried this?

I wrote:
> > It won't suffice because a relocation will still be needed: the vtable
> > entries point to absolute addresses.

Richard Henderson wrote:
> While it's true that it does not completely eliminate the
> dynamic relocation, it does (or should) eliminate a symbol
> lookup -- eg an R_386_32 relocation should be replaced by
> an R_386_RELATIVE relocation which is considerably simpler
> to process.

Well, in that case some interested KDE developer might want to try using
-Bsymbolic and see if it improves startup times ...


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-22 13:34                         ` Joe Buck
@ 2001-05-29 13:58                           ` Michael Matz
  2001-05-29 14:41                             ` Richard Henderson
  0 siblings, 1 reply; 54+ messages in thread
From: Michael Matz @ 2001-05-29 13:58 UTC (permalink / raw)
  To: Joe Buck; +Cc: Richard Henderson, gcc

Hi,

On Tue, 22 May 2001, Joe Buck wrote:

> Richard Henderson wrote:
> > While it's true that it does not completely eliminate the
> > dynamic relocation, it does (or should) eliminate a symbol
> > lookup -- eg an R_386_32 relocation should be replaced by
> > an R_386_RELATIVE relocation which is considerably simpler
> > to process.
>
> Well, in that case some interested KDE developer might want to try using
> -Bsymbolic and see if it improves startup times ...

Well, if it only worked ;)  After building some of our bigger libraries by
hand for a test (with -Bsymbolic) the reduction in symbol-based
relocations is really looking good:
libkio.so.3.0.0 : 2346 - 1177
libksycoca.so.3.0.0 : 2090 - 1014
libkdecore.so.3.0.0 : 3902 - 1920
libkdeui.so.3.0.0 : 20059 - 14226
libkhtml.so.3.0.0 : 16583 - 2368

First number normal build, second number with -Bsymbolic.  The numbers
themselves are the output of "objdump -R .so | grep -v '\*ABS\*' | wc -l".

But programs linked against those libs fail sometimes.  After quite some
time I've reduced it to this:
----- lib.h -----
class A {
public:
  A();
  static A* get_the_A() { return TheA; }
private:
  void init();
  static A* TheA;
};
----- lib1.cpp -----
#include <lib.h>
A* A::TheA = 0;

A::A() { init (); };

void A::init ()
{ TheA = this; }
----- prog.cpp -----
#include <stdlib.h>
#include <lib.h>

int main()
{ A a;
  if (!a.get_the_A())
    abort();
  return 0;
}
--------------------

prog.cpp needs to be built with optimization (so the get_the_A() call is
inlined).  The lib1 can be built in any way.  If prog is dynamically
linked to lib1 and the found lib1 was compiled with -Bsymbolic prog
abort()'s.  This is due to lib1's version of init() setting another TheA
than the one read from in main().  I don't know why exactly this is
happening.  The relevant objdump -R output (shortened) for the normal lib1:
OFFSET   TYPE              VALUE
000007cf R_386_PC32        init__1A
000007d8 R_386_PC32        init__1A
000007f7 R_386_32          _1A.TheA
0000188c R_386_RELATIVE    *ABS*
000018c4 R_386_GLOB_DAT    ___brk_addr

and for -Bsymbolic lib1:
OFFSET   TYPE              VALUE
000007e7 R_386_RELATIVE    *ABS*
0000187c R_386_RELATIVE    *ABS*
000018b4 R_386_GLOB_DAT    ___brk_addr

objdump -d for normal lib (only init(), the call is correctly done):
000007f0 <init__1A>:
 7f0:   55                      push   %ebp
 7f1:   89 e5                   mov    %esp,%ebp
 7f3:   8b 45 08                mov    0x8(%ebp),%eax  # eax <--- this
 7f6:   a3 00 00 00 00          mov    %eax,0x0        # TheA <--- eax
 7fb:   89 ec                   mov    %ebp,%esp
 7fd:   5d                      pop    %ebp
 7fe:   c3                      ret

objdump -d for -Bsymbolic (again only init() ):
000007e0 <init__1A>:
 7e0:   55                      push   %ebp
 7e1:   89 e5                   mov    %esp,%ebp
 7e3:   8b 45 08                mov    0x8(%ebp),%eax  # eax <--- this
 7e6:   a3 84 18 00 00          mov    %eax,0x1884     # TheA <--- eax
 7eb:   89 ec                   mov    %ebp,%esp
 7ed:   5d                      pop    %ebp
 7ee:   c3                      ret

The program itself has for TheA the following relocation:
0804970c R_386_COPY        _1A.TheA
And the testing insn in main() which aborts is this:
 8048595:       83 3d 0c 97 04 08 00    cmpl   $0x0,0x804970c

If I load all this into gdb I then indeed see references to two different
locations (with -Bsymbolic libs):  init()
Dump of assembler code for function init__1A:
0x400187e0 <init__1A>:  push   %ebp
0x400187e1 <init__1A+1>:        mov    %esp,%ebp
0x400187e3 <init__1A+3>:        mov    0x8(%ebp),%eax
0x400187e6 <init__1A+6>:        mov    %eax,0x40019884  # TheA <--- this
...
and main():
(gdb) disassemble
Dump of assembler code for function main:
0x8048580 <main>:       push   %ebp
0x8048581 <main+1>:     mov    %esp,%ebp
0x8048583 <main+3>:     sub    $0x18,%esp
0x8048586 <main+6>:     add    $0xfffffff4,%esp
0x8048589 <main+9>:     lea    0xffffffff(%ebp),%eax
0x804858c <main+12>:    push   %eax
0x804858d <main+13>:    call   0x8048454 <__1A>
0x8048592 <main+18>:    add    $0x10,%esp
0x8048595 <main+21>:    cmpl   $0x0,0x804970c         # TheA == 0

And here I'm finally at a loss.  Is ld.so wrong, or ld, or as, or gcc?
Anyone?
system: g++ 2.95.2, ld 2.10.91 (with BFD 2.10.0.33), ld.so-2.2,
i386-linux (stock SuSE 7.1)


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-29 13:58                           ` Michael Matz
@ 2001-05-29 14:41                             ` Richard Henderson
  2001-05-29 14:46                               ` Geoff Keating
                                                 ` (2 more replies)
  0 siblings, 3 replies; 54+ messages in thread
From: Richard Henderson @ 2001-05-29 14:41 UTC (permalink / raw)
  To: Michael Matz; +Cc: Joe Buck, gcc

On Tue, May 29, 2001 at 10:57:39PM +0200, Michael Matz wrote:
> This is due to lib1's version of init() setting another TheA
> than the one read from in main().  I don't know why exactly
> this is happening.

Welcome to Dynamic Linking 304, in which we discuss the peculiarities
of data symbols as seen on common 32-bit architectures.

Exhibit A:

> And the testing insn in main() which aborts is this:
>  8048595:       83 3d 0c 97 04 08 00    cmpl   $0x0,0x804970c

Applications are typically built non-pic.  Every address is absolute,
which means that we must know the address of every object at link time
and not runtime.  Yet if the symbol in question comes from a shared
library, how can this be?

For functions, the answer is simple: the known fixed address is the
address of a PLT thunk which branches to the actual address in the
shared library.  But what about data?  The instruction above is not
prepared for indirection.

Exhibit B:

> The program itself has for TheA the following relocation:
> 0804970c R_386_COPY        _1A.TheA

For every data object referenced by the main application, the linker
allocates space in the application's .bss section (sometimes referred
to as the .dynbss).  The above relocation copies the in-file contents
of the actual data in the shared library (recall that this occurs
before any constructors are run) to the application's .dynbss.

Under normal conditions, the symbol resolution rules search the main
application first, and so _everyone_ uses the copy in the .dynbss,
and everyone is happy.

But look what happens when we change the rules, as with -Bsymbolic:
the shared library searches itself first (optimized at link time in
this instance -- you'd get the same net result from ld.so if you
somehow manually added DT_SYMBOLIC to _DYNAMIC in the normal lib1),
the symbol gets resolved locally, and the application's copy is not
used.  Which then causes the application to fail.

Conclusion: symbol resolution rule changes cannot be made in the
presence of shared (in the multiple users sense) data.
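
The failure can be modelled without any linker at all.  The following is
a sketch under assumptions: Process, lib_view, app_view and
main_would_abort are hypothetical names, and two plain pointers stand in
for the two storage locations of A::TheA (the library's own definition
and the COPY-relocated slot in the application).

```cpp
#include <cassert>

// Toy model of one running process with the test library loaded.
struct Process {
    const void* lib_definition;  // storage for TheA inside the shared library
    const void* app_copy;        // COPY-relocated storage in the application
    bool lib_is_symbolic;        // was the library linked with -Bsymbolic?

    explicit Process(bool symbolic)
        : lib_definition(nullptr), app_copy(nullptr),
          lib_is_symbolic(symbolic) {}

    // Storage the library's code is bound to: its own definition with
    // -Bsymbolic, otherwise the application's copy (found first in the
    // normal search order).
    const void*& lib_view() {
        return lib_is_symbolic ? lib_definition : app_copy;
    }
    // The non-PIC application always addresses its own copy.
    const void*& app_view() { return app_copy; }
};

// Mirrors the test case: init() stores `this`, main() then reads TheA
// through the inlined accessor.
bool main_would_abort(bool symbolic) {
    Process p(symbolic);
    int object = 0;              // stands in for `A a;`
    p.lib_view() = &object;      // A::init(): TheA = this
    return p.app_view() == nullptr;
}

// Small self-check of both link modes.
void check_copy_reloc_model() {
    assert(!main_would_abort(false));  // normal link: one shared location
    assert(main_would_abort(true));    // -Bsymbolic: two locations, abort
}
```

With the normal search order both views name the same storage; with
-Bsymbolic the write and the read land in different places.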

There is something else to watch here.  Note that the linker had
to *reserve* space in the application.  This implies that the size
of the data is known at link time.  Which implies that that size
is constant.  Which implies that the size of the data is part of
the ABI, even if the structure is opaque at the language level.

An example of this last point appears in traditional stdio:

  struct FILE;
  extern FILE _iob[3];
  #define stdin  (&_iob[0])
  #define stdout (&_iob[1])
  #define stderr (&_iob[2])

Note that the entire _iob array will be copied into the main
application if the program references any of the standard files.
Which means that you cannot change the size of struct FILE even
though the application code thinks of it as opaque.
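
A small sketch of why the size leaks into the ABI, assuming two made-up
layouts FileV1 and FileV2 (real stdio structures differ) and assuming
the application was linked against the first while a later library
ships the second:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical layouts for illustration only.
struct FileV1 { int fd; int flags; };                 // library at link time
struct FileV2 { int fd; int flags; char extra[8]; };  // library after upgrade

// Space the linker reserved in the application for a three-entry array.
const std::size_t reserved = 3 * sizeof(FileV1);
// Space the upgraded library's three-entry array actually needs.
const std::size_t needed = 3 * sizeof(FileV2);

// How many whole upgraded entries fit in the space reserved earlier.
std::size_t entries_that_fit() { return reserved / sizeof(FileV2); }

// Small self-check: the reserved space is too small for all three entries.
void check_reserved_space() {
    assert(needed > reserved);
    assert(entries_that_fit() < 3);
}
```

The application never looks inside the structure, yet its binary has the
old size baked in.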

Conclusion: extreme caution should be employed with shared data.

> Is ld.so wrong, or ld, or as, or gcc?

None of the above, I'm afraid.

Of course, there is another way to avoid these problems that 
hasn't been mentioned yet -- build the application PIC as well.


r~

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-29 14:41                             ` Richard Henderson
@ 2001-05-29 14:46                               ` Geoff Keating
  2001-05-29 15:02                                 ` Richard Henderson
  2001-05-29 14:53                               ` Joe Buck
  2001-05-29 15:21                               ` Michael Matz
  2 siblings, 1 reply; 54+ messages in thread
From: Geoff Keating @ 2001-05-29 14:46 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc

Richard Henderson <rth@redhat.com> writes:

> Of course, there is another way to avoid these problems that 
> hasn't been mentioned yet -- build the application PIC as well.

On some platforms, this doesn't work the way you'd like.  Instead, the
PIC code is fully resolved at link time... and still generates PLT
entries or COPY relocs.

-- 
- Geoffrey Keating <geoffk@geoffk.org>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-29 14:41                             ` Richard Henderson
  2001-05-29 14:46                               ` Geoff Keating
@ 2001-05-29 14:53                               ` Joe Buck
  2001-05-29 15:09                                 ` Richard Henderson
  2001-05-29 15:21                               ` Michael Matz
  2 siblings, 1 reply; 54+ messages in thread
From: Joe Buck @ 2001-05-29 14:53 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Michael Matz, Joe Buck, gcc

Michael Matz wrote:
[ KDE -Bsymbolic experiment ] 
> > This is due to lib1's version of init() setting another TheA
> > than the one read from in main().  I don't know why exactly
> > this is happening.

Richard Henderson writes:
> Welcome to Dynamic Linking 304, in which we discuss the peculiarities
> of data symbols as seen on common 32-bit architectures.
> ...
> For every data object referenced by the main application, the linker
> allocates space in the application's .bss section (sometimes referred
> to as the .dynbss).  The above relocation copies the in-file contents
> of the actual data in the shared library (recall that this occurs
> before any constructors are run) to the application's .dynbss.
> 
> Under normal conditions, the symbol resolution rules search the main
> application first, and so _everyone_ uses the copy in the .dynbss,
> and everyone is happy.
> 
> But look what happens when we change the rules, as with -Bsymbolic:
> the shared library searches itself first (optimized at link time in
> this instance -- you'd get the same net result from ld.so if you
> somehow manually added DT_SYMBOLIC to _DYNAMIC in the normal lib1),
> the symbol gets resolved locally, and the application's copy is not
> used.  Which then causes the application to fail.
>
> Conclusion: symbol resolution rule changes cannot be made in the
> presence of shared (in the multiple users sense) data.

OK, it seems that we can't use -Bsymbolic for all symbols, because it
will break for global or file-scope data objects with constructors
or destructors.

So it seems that the next thing to try is to use -Bsymbolic for everything
except such global objects, either by marking them with attributes,
or to move their definitions into separate .cpp files that get compiled
without -Bsymbolic.

Richard, do you think that this will work?

> Of course, there is another way to avoid these problems that 
> hasn't been mentioned yet -- build the application PIC as well.

Users of the KDE libraries are likely to forget to do this, but of course
it could be tried as an experiment.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-29 14:46                               ` Geoff Keating
@ 2001-05-29 15:02                                 ` Richard Henderson
  2001-05-29 15:24                                   ` Geoff Keating
  0 siblings, 1 reply; 54+ messages in thread
From: Richard Henderson @ 2001-05-29 15:02 UTC (permalink / raw)
  To: Geoff Keating; +Cc: gcc

On Tue, May 29, 2001 at 02:45:03PM -0700, Geoff Keating wrote:
> On some platforms, this doesn't work the way you'd like.  Instead, the
> PIC code is fully resolved at link time... and still generates PLT
> entries or COPY relocs.

Hmm.  Are you sure?  It shouldn't in the case of data symbols. 
For code symbols that is fine -- we'd be branching to the PLT
in any case.

I'd class this as a linker bug if it happens.


r~

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-29 14:53                               ` Joe Buck
@ 2001-05-29 15:09                                 ` Richard Henderson
  0 siblings, 0 replies; 54+ messages in thread
From: Richard Henderson @ 2001-05-29 15:09 UTC (permalink / raw)
  To: Joe Buck; +Cc: Michael Matz, gcc

On Tue, May 29, 2001 at 02:51:21PM -0700, Joe Buck wrote:
> So it seems that the next thing to try is to use -Bsymbolic for everything
> except such global objects, either by marking them with attributes,
> or to move their definitions into separate .cpp files that get compiled
> without -Bsymbolic.
> 
> Richard, do you think that this will work?

No.  -Bsymbolic is only relevant at link time.  It does not 
affect the compiler proper.

If you want finer resolution, you have to start playing with
ELF symbol visibility modifiers.  And if you want to use this
on symbols that still need to be seen outside the library
(i.e. most functions), you'll need glibc 2.2.2 or later as
this feature is relatively new.

Moreover, for C++ you'll really need more help from the 
compiler as well, since putting

  asm (".protected _Z6mumble1V");

all over the place with mangled symbols really isn't going
to cut the mustard.
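
For comparison, later GCC releases accept a visibility attribute on
declarations, which is roughly the kind of compiler help asked for
above.  A minimal sketch, assuming GCC or Clang on an ELF target (on
other targets the macros expand to nothing); Mumble, helper and the
API_* macros are hypothetical names:

```cpp
#include <cassert>

// Assumption: GCC or Clang targeting ELF; otherwise these are no-ops.
#if defined(__GNUC__) && defined(__ELF__)
#  define API_PROTECTED __attribute__((visibility("protected")))
#  define API_HIDDEN    __attribute__((visibility("hidden")))
#else
#  define API_PROTECTED
#  define API_HIDDEN
#endif

// Hypothetical library class: its member symbols stay visible to other
// modules but references from inside the defining module bind locally.
class API_PROTECTED Mumble {
public:
    int value() const;
};

// Internal helper that is not exported from the module at all.
API_HIDDEN int helper() { return 41; }

int Mumble::value() const { return helper() + 1; }

// Small self-check: visibility does not change behaviour inside the module.
void check_visibility_example() {
    Mumble m;
    assert(m.value() == 42);
}
```

No mangled names appear in the source; the compiler attaches the
visibility to whatever symbols it generates for the declarations.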


r~

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-29 14:41                             ` Richard Henderson
  2001-05-29 14:46                               ` Geoff Keating
  2001-05-29 14:53                               ` Joe Buck
@ 2001-05-29 15:21                               ` Michael Matz
  2001-05-29 16:14                                 ` Richard Henderson
  2001-05-29 17:15                                 ` Daniel Berlin
  2 siblings, 2 replies; 54+ messages in thread
From: Michael Matz @ 2001-05-29 15:21 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Joe Buck, gcc

Hi,

On Tue, 29 May 2001, Richard Henderson wrote:
> On Tue, May 29, 2001 at 10:57:39PM +0200, Michael Matz wrote:
> > This is due to lib1's version of init() setting another TheA
> > than the one read from in main().  I don't know why exactly
> > this is happening.
>
> Welcome to Dynamic Linking 304, in which we discuss the peculiarities
> of data symbols as seen on common 32-bit architectures.

;-)  wonderful, educational, funny.  This is how education should be.  How
about you sit down every two weeks and write something equally
interesting and entertaining on a random topic?  I at least enjoyed reading
it very much. :)

> Exhibit B:
>
> > The program itself has for TheA the following relocation:
> > 0804970c R_386_COPY        _1A.TheA
>
> For every data object referenced by the main application, the linker
> allocates space in the application's .bss section (sometimes referred
> to as the .dynbss).

I already wondered why in prog there were symbols _1A.TheA pointing to
.bss.

> Conclusion: symbol resolution rule changes cannot be made in the
> presence of shared (in the multiple users sense) data.

I have the feeling that -Bsymbolic should only be applied to functions
(or other things for which indirection already exists), and not to data.

(btw. what's sad is that TheA wouldn't need to be shared data; after all
it's private (C++-wise), and only accessible through accessors.  That they
get inlined is the only reason why everything falls apart ;)  Not that I
want to turn off inlining for this)

> There is something else to watch here.  Note that the linker had
> to *reserve* space in the application.  This implies that the size
> of the data is known at link time.  Which implies that that size
> is constant.  Which implies that the size of the data is part of
> the ABI, even if the structure is opaque at the language level.

Nice one.  Another line on my blackboard: "beware shared data".

> Of course, there is another way to avoid these problems that
> hasn't been mentioned yet -- build the application PIC as well.

Well, that would be the only way to keep using -Bsymbolic if it can't be
made to leave data alone.  At least the small test program worked with
-fPIC, but I'm not sure I want to take the penalty in KDE (not that it
matters very much; most of our code outside of large applications is
already PIC anyway).  Hmm.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-29 15:02                                 ` Richard Henderson
@ 2001-05-29 15:24                                   ` Geoff Keating
  2001-05-29 15:36                                     ` Richard Henderson
  0 siblings, 1 reply; 54+ messages in thread
From: Geoff Keating @ 2001-05-29 15:24 UTC (permalink / raw)
  To: rth; +Cc: gcc

> Date: Tue, 29 May 2001 15:02:12 -0700
> From: Richard Henderson <rth@redhat.com>
> Cc: gcc@gcc.gnu.org
> Mail-Followup-To: Richard Henderson <rth@redhat.com>,
> 	Geoff Keating <geoffk@geoffk.org>, gcc@gcc.gnu.org
> Content-Disposition: inline
> User-Agent: Mutt/1.2.5i
> 
> On Tue, May 29, 2001 at 02:45:03PM -0700, Geoff Keating wrote:
> > On some platforms, this doesn't work the way you'd like.  Instead, the
> > PIC code is fully resolved at link time... and still generates PLT
> > entries or COPY relocs.
> 
> Hmm.  Are you sure?  It shouldn't in the case of data symbols. 
> For code symbols that is fine -- we'd be branching to the PLT
> in any case.

Consider the following example program on powerpc-linux:

#include <signal.h>
#include <stdio.h>
int main(void)
{
  printf ("%p\n", sys_siglist);
  return 0;
}

when linked normally, it has:

10010630 R_PPC_COPY        sys_siglist

When linked with -fpic, it has:

1001052c R_PPC_GLOB_DAT    sys_siglist
10010650 R_PPC_COPY        sys_siglist

(OK, _that_ is probably a linker bug.)
which comes from

        lwz 4,sys_siglist@got(30)

But when linked with -fPIC, it has:

10010650 R_PPC_COPY        sys_siglist

because the linker can't tell that the reloc, which is just

        .section        ".got2","aw"
.LCTOC1 = .+32768
.LC1 = .-.LCTOC1
        .long .LC0
.LC2 = .-.LCTOC1
        .long sys_siglist

is related to PIC-ness, or if it's just a plain old data reloc in an
oddly-named section.

> I'd class this as a linker bug if it happens.
> 
> 
> r~
> 


-- 
- Geoffrey Keating <geoffk@geoffk.org>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-29 15:24                                   ` Geoff Keating
@ 2001-05-29 15:36                                     ` Richard Henderson
  2001-05-29 15:44                                       ` Geoff Keating
  2001-05-29 16:33                                       ` Michael Meissner
  0 siblings, 2 replies; 54+ messages in thread
From: Richard Henderson @ 2001-05-29 15:36 UTC (permalink / raw)
  To: Geoff Keating; +Cc: gcc

On Tue, May 29, 2001 at 03:23:40PM -0700, Geoff Keating wrote:
> But when linked with -fPIC ... the linker can't tell that
> the reloc ... is related to PIC-ness, or if it's just a plain
> old data reloc in an oddly-named section.

Hum.  Ok, that is ugly.  IMO you'd be much better off with your
linker knowing how to create multiple GOT pools like we do on Alpha.

On second thought, why aren't you using the R_PPC_GOT16_HA and
R_PPC_GOT16_LO relocations for -fPIC?


r~

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-29 15:36                                     ` Richard Henderson
@ 2001-05-29 15:44                                       ` Geoff Keating
  2001-05-29 16:33                                       ` Michael Meissner
  1 sibling, 0 replies; 54+ messages in thread
From: Geoff Keating @ 2001-05-29 15:44 UTC (permalink / raw)
  To: rth; +Cc: gcc

> Date: Tue, 29 May 2001 15:36:07 -0700
> From: Richard Henderson <rth@redhat.com>
> 
> On Tue, May 29, 2001 at 03:23:40PM -0700, Geoff Keating wrote:
> > But when linked with -fPIC ... the linker can't tell that
> > the reloc ... is related to PIC-ness, or if it's just a plain
> > old data reloc in an oddly-named section.
> 
> Hum.  Ok, that is ugly.  IMO you'd be much better off with your
> linker knowing how to create multiple GOT pools like we do on Alpha.

I agree.  I'm just waiting for Uli to finish his ELF linker so I can
add this feature :-).

> On second thought, why aren't you using the R_PPC_GOT16_HA and
> R_PPC_GOT16_LO relocations for -fPIC?

Speed and code size.

-- 
- Geoffrey Keating <geoffk@geoffk.org>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-29 15:21                               ` Michael Matz
@ 2001-05-29 16:14                                 ` Richard Henderson
  2001-05-29 16:58                                   ` Michael Matz
  2001-05-29 17:15                                 ` Daniel Berlin
  1 sibling, 1 reply; 54+ messages in thread
From: Richard Henderson @ 2001-05-29 16:14 UTC (permalink / raw)
  To: Michael Matz; +Cc: Joe Buck, gcc

On Wed, May 30, 2001 at 12:16:11AM +0200, Michael Matz wrote:
> I have the feeling, that -Bsymbolic should only be applied to functions
> (or other things for which indirection already exists), and not to data.

The rules can't be changed now.  In any case, there are situations
in which you _want_ -Bsymbolic to apply to data.  Namely, when the
data is private and not used between shared objects.

> (btw. what's sad is, that TheA wouldn't need to be shared data, after all
> it's private (C++-wise), and only accessible through accessors.  Only that
> they get inlined is the reason why everything falls apart ;)  Not that I
> want to turn off inlining for this)

Which highlights a point I've always considered a failing of C++ -- it
is exceedingly tedious to write classes that are resistant to ABI change.
You have to write stuff like

  class Foo
  {
    private:
      struct FooData *data;
    public:
      // no inline members, constructors or destructors.
  };

in one (public) header, and put FooData and whatnot in some other
(internal) header.  Which is more or less exactly what you'd use
in plain C, but this is an OO language and people forget and are
lured by the siren song of "private".
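
A minimal sketch of that split, for illustration only.  The public
header, the internal header and the implementation file are collapsed
into one listing here; the members value()/set_value() and the single
field of FooData are hypothetical, and the deleted copy operations
assume a C++11 compiler.

```cpp
#include <cassert>

// --- foo.h (public header): clients see only a forward declaration ---
struct FooData;                 // layout is hidden from users of Foo

class Foo
{
  private:
    FooData *data;              // the only member that is part of the ABI
  public:
    // No inline members, constructors or destructors: all are only
    // declared here and defined inside the library.
    Foo();
    ~Foo();
    Foo(const Foo &) = delete;            // keep the sketch simple
    Foo &operator=(const Foo &) = delete;
    int value() const;
    void set_value(int v);
};

// --- foo_internal.h (internal header): the real layout ---
struct FooData
{
    int value;                  // fields can be added without changing sizeof(Foo)
};

// --- foo.cxx (compiled into the library) ---
Foo::Foo() : data(new FooData()) { data->value = 0; }
Foo::~Foo() { delete data; }
int Foo::value() const { return data->value; }
void Foo::set_value(int v) { data->value = v; }
```

Because client code never sees FooData, growing or reordering it only
requires rebuilding the library, not the programs linked against it.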

Templates of course make the situation worse; fortunately for my
sanity, the last time I had to use C++ for real, templates didn't
actually work in real compilers so there was no temptation to be
burned by them.  ;-)


r~

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-29 15:36                                     ` Richard Henderson
  2001-05-29 15:44                                       ` Geoff Keating
@ 2001-05-29 16:33                                       ` Michael Meissner
  1 sibling, 0 replies; 54+ messages in thread
From: Michael Meissner @ 2001-05-29 16:33 UTC (permalink / raw)
  To: Richard Henderson, Geoff Keating, gcc

On Tue, May 29, 2001 at 03:36:07PM -0700, Richard Henderson wrote:
> On Tue, May 29, 2001 at 03:23:40PM -0700, Geoff Keating wrote:
> > But when linked with -fPIC ... the linker can't tell that
> > the reloc ... is related to PIC-ness, or if it's just a plain
> > old data reloc in an oddly-named section.
> 
> Hum.  Ok, that is ugly.  IMO you'd be much better off with your
> linker knowing how to create multiple GOT pools like we do on Alpha.
> 
> On second thought, why aren't you using the R_PPC_GOT16_HA and
> R_PPC_GOT16_LO relocations for -fPIC?

Because for better or for worse, -fPIC is implemented as -mrelocatable.  At the
time I implemented it, PowerPC Linux was barely on the radar screen, so rather
than creating 3 different methods for shared libraries, I just used the
existing support.

-- 
Michael Meissner, Red Hat, Inc.  (GCC group)
PMB 198, 174 Littleton Road #3, Westford, Massachusetts 01886, USA
Work:	  meissner@redhat.com		phone: +1 978-486-9304
Non-work: meissner@spectacle-pond.org	fax:   +1 978-692-4482

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-29 16:14                                 ` Richard Henderson
@ 2001-05-29 16:58                                   ` Michael Matz
  0 siblings, 0 replies; 54+ messages in thread
From: Michael Matz @ 2001-05-29 16:58 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Joe Buck, gcc

Ho,

On Tue, 29 May 2001, Richard Henderson wrote:
> On Wed, May 30, 2001 at 12:16:11AM +0200, Michael Matz wrote:
> > I have the feeling, that -Bsymbolic should only be applied to functions
> > (or other things for which indirection already exists), and not to data.
>
> The rules can't be changed now.  In any case, there are situations
> in which you _want_ -Bsymbolic to apply to data.  Namely, when the
> data is private and not used between shared objects.

Of course, of course, but is there some inherent requirement which would
rule out a -Bsymbolic_for_fun(c), i.e. is it somehow required that, if
code symbols are resolved at link time within the DSO, data symbols must
be as well?  I guess not, as otherwise the more fine-grained ELF
visibility stuff wouldn't be possible (I guess DT_DYNAMIC can't be set as
that really means also to resolve data).  I quickly browsed through bfd
for the use of the .symbolic flag, and it seems to be used in only a few
distinct places, so a .but_only_for_funcs flag might be feasible.  Or?


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-29 15:21                               ` Michael Matz
  2001-05-29 16:14                                 ` Richard Henderson
@ 2001-05-29 17:15                                 ` Daniel Berlin
  2001-05-29 20:55                                   ` H . J . Lu
  1 sibling, 1 reply; 54+ messages in thread
From: Daniel Berlin @ 2001-05-29 17:15 UTC (permalink / raw)
  To: Michael Matz; +Cc: Richard Henderson, Joe Buck, gcc

Michael Matz <matz@kde.org> writes:


> I have the feeling, that -Bsymbolic should only be applied to functions
> (or other things for which indirection already exists), and not to
> data.

You might want to take a looksie at the BeOS config files for LD.

On BeOS, everything is linked -Bsymbolic, by default.

And it does C++ just fine, even across shared libraries.

I think everything is -fPIC too, IIRC.
It's been a while since I touched LD's BeOS related config files.

> 
> Ciao,
> Michael.

-- 
"Last time I went to the movies I was thrown out for bringing my
own food.  My argument was that the concession stand prices are
outrageous.  Besides, I haven't had a Bar-B-Que in a long time.
"-Steven Wright

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-29 17:15                                 ` Daniel Berlin
@ 2001-05-29 20:55                                   ` H . J . Lu
  2001-05-29 21:57                                     ` Daniel Berlin
  0 siblings, 1 reply; 54+ messages in thread
From: H . J . Lu @ 2001-05-29 20:55 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Michael Matz, Richard Henderson, Joe Buck, gcc

On Tue, May 29, 2001 at 08:13:13PM -0400, Daniel Berlin wrote:
> You might want to takes a looksie at the BeOS config files for LD.
> 
> On BeOS, everything is linked -Bsymbolic, by default.
> 
> And it does C++ just fine, even across shared libraries.
> 

I am not very sure if C++ works ok without any restrictions on BeOS.
That was my impression last time when I worked on a BeOS related
bfd problem.


H.J.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-29 20:55                                   ` H . J . Lu
@ 2001-05-29 21:57                                     ` Daniel Berlin
  2001-05-29 22:20                                       ` H . J . Lu
  0 siblings, 1 reply; 54+ messages in thread
From: Daniel Berlin @ 2001-05-29 21:57 UTC (permalink / raw)
  To: H . J . Lu; +Cc: Daniel Berlin, Michael Matz, Richard Henderson, Joe Buck, gcc

"H . J . Lu" <hjl@lucon.org> writes:

> On Tue, May 29, 2001 at 08:13:13PM -0400, Daniel Berlin wrote:
> > You might want to takes a looksie at the BeOS config files for LD.
> > 
> > On BeOS, everything is linked -Bsymbolic, by default.
> > 
> > And it does C++ just fine, even across shared libraries.
> > 
> 
> I am not very sure if C++ works ok without any restrictions on BeOS.
> That was my impression last time when I worked on a BeOS related
> bfd problem.

I'm pretty darn positive.
I've written (with 2 others) a large, heavily multithreaded, graphical development
environment, which has shared libraries it's linked to on its own,
plus tons of dynamically loaded plugins.

It uses templates, RTTI, exceptions, etc, extensively, across shared
library boundaries and whatnot.  

No problemo.

Remember, the whole Be API is in C++ itself (Excluding the hacked
glibc port and the kernel).

Though that's not really a good example, I guess, in this case,
because they restrict what they use and do things like padding classes
so they can add functions without breaking compatibility (the dynamic
linker can rename symbols according to patch files and whatnot in
support of this).

--Dan
> 
> 
> H.J.

-- 
"I was going 70 miles an hour and got stopped by a cop who said,
"Do you know the speed limit is 55 miles per hour?"  "Yes,
officer, but I wasn't going to be out that long..."
"-Steven Wright

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-29 21:57                                     ` Daniel Berlin
@ 2001-05-29 22:20                                       ` H . J . Lu
  2001-05-30  9:27                                         ` Joe Buck
  0 siblings, 1 reply; 54+ messages in thread
From: H . J . Lu @ 2001-05-29 22:20 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Michael Matz, Richard Henderson, Joe Buck, gcc

On Wed, May 30, 2001 at 12:55:26AM -0400, Daniel Berlin wrote:
> "H . J . Lu" <hjl@lucon.org> writes:
> 
> > On Tue, May 29, 2001 at 08:13:13PM -0400, Daniel Berlin wrote:
> > > You might want to takes a looksie at the BeOS config files for LD.
> > > 
> > > On BeOS, everything is linked -Bsymbolic, by default.
> > > 
> > > And it does C++ just fine, even across shared libraries.
> > > 
> > 
> > I am not very sure if C++ works ok without any restrictions on BeOS.
> > That was my impression last time when I worked on a BeOS related
> > bfd problem.
> 
> I'm pretty darn positive.
> I've written (with 2 others) a large, heavily multithreaded, graphical development
> environment, which has shared libraries it's linked to on it's own,
> plus tons of dynamically loaded plugins.

That doesn't mean much.

> 
> It uses templates, RTTI, exceptions, etc, extensively, across shared
> library boundaries and whatnot.  
> 
> No problemo.
> 
> Remember, the whole Be API is in C++ itself (Excluding the hacked
> glibc port and the kernel).

You can write an ABI in C++ in such a way that -Bsymbolic will work
just fine. BTW, as for those C++ programs which don't work, they
don't conform to the ABI :-).

Remember we only have to deal with the shared libgcc when we want
better support for C++. The static libgcc is fine with C.


H.J.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-29 22:20                                       ` H . J . Lu
@ 2001-05-30  9:27                                         ` Joe Buck
  2001-05-31  4:02                                           ` Michael Matz
  0 siblings, 1 reply; 54+ messages in thread
From: Joe Buck @ 2001-05-30  9:27 UTC (permalink / raw)
  To: H . J . Lu; +Cc: Daniel Berlin, Michael Matz, Richard Henderson, Joe Buck, gcc

H.J. Lu writes:

> You can write an ABI in C++ in such a way that -Bsymbolic will work
> just fine.

This echoes a thought that I had last night.  With a minor change, Qt/KDE
could use -Bsymbolic.

The reason that the first attempt failed is that we have a static data
member that is accessed both in the shared library and in code that links
to this shared library.  Since Qt and KDE are reasonably designed
libraries, they don't give public access to important data members, which
means that there must be inline functions that access those members.

So the next thing to try (Michael Matz, or another interested party) is
to see if you can make any private data members effectively a secret
of the library at object code level.  This means that any inline
functions that access those objects should be moved to a .cxx file
that belongs to the same shared library as the definition of the object.

Then, I believe that -Bsymbolic will work (well, until you find the
next problem).
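
A minimal sketch of that change, for illustration only.  The class App
and its members are hypothetical stand-ins (this is not the real
KApplication code), and the header and the .cxx file are collapsed into
one listing.

```cpp
#include <cassert>

// --- app.h (public header shipped with the library) ---
class App
{
  public:
    App();
    ~App();
    // Before the change this was an inline accessor:
    //     static App *instance() { return self_; }
    // so every program including the header referenced App::self_
    // directly from its own object code.  Now it is only declared.
    static App *instance();
  private:
    static App *self_;          // the library's secret at object-code level
};

// --- app.cxx (belongs to the same shared library as App::self_) ---
App *App::self_ = 0;

App::App()  { self_ = this; }
App::~App() { if (self_ == this) self_ = 0; }

// Out-of-line accessor: the only code that touches App::self_ lives
// inside the library, so clients need no data reloc against it.
App *App::instance() { return self_; }
```

The cost is one function call per access instead of an inlined load.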


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-30  9:27                                         ` Joe Buck
@ 2001-05-31  4:02                                           ` Michael Matz
  2001-05-31  4:45                                             ` Fergus Henderson
  2001-05-31  5:47                                             ` Hildo.Biersma
  0 siblings, 2 replies; 54+ messages in thread
From: Michael Matz @ 2001-05-31  4:02 UTC (permalink / raw)
  To: Joe Buck; +Cc: H . J . Lu, Richard Henderson, gcc

Hi,

On Wed, 30 May 2001, Joe Buck wrote:
>
> So the next thing to try (Michael Matz, or another interested party) is
> to see if you can make any private data members effectively a secret
> of the library at object code level.  This means that any inline
> functions that access those objects should be moved to a .cxx file
> that belongs to the same shared library as the definition of the object.

Yep, like I already wrote, simply don't allow inline accessors.  OK, I've
done that for kdecore and konsole as a test program (there it's only the
KApplication::KApp variable, the static instance of the application
object, which causes such problems and from which I constructed my last
example).  Still no go.  The problem now seems to be more involved (and
there seem to be several different problems).  I couldn't make a small
example (I'll try further), so I'll describe the circumstances:

one problem is our template KStaticDeleter, which serves as a type-safe
delete mechanism for objects which are allocated in such a way:
void f() {
  static Object* o = 0;
  if (!o) o = new Object;
  ...
}

Without further means o would be a program-exit memleak, so we have a
template (for type-safety) which collects all such objects in a list,
which then is cleaned at program exit.  It's used similar to this:
static Deleter<Object> del;
void f() {
  static Object *o = 0;
  if (!o) o = del.setObject(new Object);
  ...
}
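
A self-contained sketch of such a deleter list, for illustration only.
It is not the KDE code: DeleterRegistry and its add/drop/deleteAll/count
functions are hypothetical stand-ins for the library-side bookkeeping,
and the array case is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Common base so deleters for different types can sit in one list.
class DeleterBase
{
  public:
    virtual void destructObject() = 0;
    virtual ~DeleterBase() {}
};

// Hypothetical registry; only declared in the header, so template code
// instantiated in an application never touches the list itself.
class DeleterRegistry
{
  public:
    static void add(DeleterBase *d);
    static void drop(DeleterBase *d);
    static void deleteAll();            // to be called at program exit
    static std::size_t count();
};

template <class T> class Deleter : public DeleterBase
{
  public:
    Deleter() : deleteit(0) {}
    // Remember obj for later deletion and hand it straight back.
    T *setObject(T *obj)
    {
        deleteit = obj;
        if (obj) DeleterRegistry::add(this);
        else     DeleterRegistry::drop(this);
        return obj;
    }
    virtual void destructObject() { delete deleteit; deleteit = 0; }
    virtual ~Deleter() { DeleterRegistry::drop(this); destructObject(); }
  private:
    T *deleteit;
};

// --- library side: the list and the non-inline functions using it ---
namespace { std::vector<DeleterBase *> the_list; }

void DeleterRegistry::add(DeleterBase *d) { drop(d); the_list.push_back(d); }

void DeleterRegistry::drop(DeleterBase *d)
{
    for (std::size_t i = 0; i < the_list.size(); ++i)
        if (the_list[i] == d) { the_list.erase(the_list.begin() + i); return; }
}

void DeleterRegistry::deleteAll()
{
    // Empty the global list first, then delete every registered object.
    std::vector<DeleterBase *> pending;
    pending.swap(the_list);
    for (std::size_t i = 0; i < pending.size(); ++i)
        pending[i]->destructObject();
}

std::size_t DeleterRegistry::count() { return the_list.size(); }
```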

We make use of this template in all libraries, and konsole segfaults as
soon as kdecore is -Bsymbolic (kdecore is the place of
deleteStaticDeleter, which traverses the appropriate list and deletes all
these objects).  This segfault doesn't go away whether or not the other
libraries are -Bsymbolic, nor when everything (including the program
konsole) is -fPIC.  OK, I simply disabled the routine to see if there are
other problems lurking, and as long as libkdeui is _not_ -Bsymbolic at
least konsole seems to work.  I haven't measured any timing thoroughly,
but the overall effect is that 10 runs of "konsole -e exit" take 0.7
seconds less (lib{qt,DCOP,kdeui,kssl} were not -Bsymbolic, all other
relevant KDE libs were).

I tried to come up with a reduced example implemented along these lines,
but it worked.

Another data point suggesting that shared data is not the problem above
(besides the fact that -fPIC for the program didn't help) is that konsole
(without -fPIC) doesn't show any R_386_COPY relocations for symbols in
kdecore (at test time the only -Bsymbolic lib), _besides_ type info nodes
and vtables.  The latter leads me to another problem:

When kdeui is also -Bsymbolic (and the staticdeleter routine is
deactivated), I have a funny crash in some dynamic_cast connected
function:
#0  0x40db346d in __class_type_info::dcast (this=0x8087394,
    desired=@0x403ab378,
    is_public=1, objptr=0x810c288, sub=0x40c3b788, subptr=0x810c288)
    from /usr/lib/libfam.so.0
#1  0x40db3c06 in __dynamic_cast (from=0x807d7d0 <Konsole type_info
    function>,
    to=0x40373410 <KActionCollection type_info function>,
    require_public=1,
    address=0x810c288, sub=0x805684c <QObject type_info function>,
    subptr=0x810c288) from /usr/lib/libfam.so.0
#2  0x402f5d55 in KAction::parentCollection () from
    /opt/kde2/lib/libkdeui.so.3

Besides the fact that the dynamic_cast functions come from libfam (it's
the static libgcc, or libstdc++, as libfam is partly C++, which has led
to more problems in the past), which is not a problem, I can only guess
that this crash results from the non-sharing of RTTI and virtual-table
information, although it would be shared without -Bsymbolic.  I believe
that this info is read-only, so what could be the cause of any
observable differences in RTTI with -Bsymbolic?  (I can only imagine
pointer comparisons of the RTTI nodes themselves, which differ in-library
and out-of-library with -Bsymbolic).  I haven't tried yet to come up with a
smaller example demonstrating this.

The last thing seems to be a severe problem, and unavoidable.  Even if I
eliminated all inline accessors to eliminate shared data, there are
still RTTI and vtable nodes, over which I have no control, and if that is
really the cause of the second crash above, -Bsymbolic as it is now seems
not to be applicable.  Hmm.  I want a -Bsymbolic_for_func. ;)


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-31  4:02                                           ` Michael Matz
@ 2001-05-31  4:45                                             ` Fergus Henderson
  2001-05-31  7:14                                               ` Michael Matz
  2001-05-31  5:47                                             ` Hildo.Biersma
  1 sibling, 1 reply; 54+ messages in thread
From: Fergus Henderson @ 2001-05-31  4:45 UTC (permalink / raw)
  To: Michael Matz; +Cc: gcc

On 31-May-2001, Michael Matz <matz@kde.org> wrote:
> one problem is our template KStaticDeleter, which serves as a type-safe
> delete mechanism for objects which are allocated in such a way:
> void f() {
>   static Object* o = 0;
>   if (!o) o = new Object;
>   ...
> }
> 
> Without further means o would be a program-exit memleak, so we have a
> template (for type-safety) which collects all such objects in a list,
> which then is cleaned at program exit.  It's used similar to this:
> static Deleter<Object> del;
> void f() {
>   static Object *o = 0;
>   if (!o) o = del.setObject(new Object);
>   ...
> }

I haven't seen the code for KStaticDeleter, but I'll bet that it
refers to a global or static member variable which contains the list of
objects to be deleted.

The point here is that not only must you not refer to global/static member
variables from inline functions, you must also not refer to them from
template functions, since template functions are instantiated in the
application's executable file.

If the global or static member variable in question is not a member of a class
template, then you can work around this by defining a non-inline access
function which returns a reference to the variable, and then using the
access function in place of the variable.
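
A minimal sketch of that workaround, for illustration only.  Registry,
items() and remember() are hypothetical names, and the header and the
library source file are collapsed into one listing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// --- registry.h (public header) ---
class Registry
{
  public:
    // Non-inline access function: declared here, defined in the library.
    static std::vector<const void *> &items();
};

// A template function is instantiated in whatever object file uses it,
// possibly the application's executable.  It reaches the list only
// through the access function, never by naming the variable itself.
template <class T>
void remember(const T *obj)
{
    Registry::items().push_back(obj);
}

// --- registry.cxx (compiled into the shared library) ---
namespace
{
    std::vector<const void *> the_items;   // invisible outside this file
}

std::vector<const void *> &Registry::items()
{
    return the_items;
}
```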

For static members of template classes, things may get more complicated,
especially if the template class is instantiated with the same type from
two different shared object files.

> Hmm.  I want a -Bsymbolic_for_func. ;)

That certainly sounds like it would be easier in the long run.

-- 
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
                                    |  of excellence is a lethal habit"
WWW: < http://www.cs.mu.oz.au/~fjh >  |     -- the last words of T. S. Garp.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-31  4:02                                           ` Michael Matz
  2001-05-31  4:45                                             ` Fergus Henderson
@ 2001-05-31  5:47                                             ` Hildo.Biersma
  2001-05-31  6:55                                               ` Michael Matz
  1 sibling, 1 reply; 54+ messages in thread
From: Hildo.Biersma @ 2001-05-31  5:47 UTC (permalink / raw)
  To: Michael Matz; +Cc: gcc

>>>>> "Michael" == Michael Matz <matz@kde.org> writes:

Michael> Hi,
Michael> On Wed, 30 May 2001, Joe Buck wrote:
>> 
>> So the next thing to try (Michael Matz, or another interested party) is
>> to see if you can make any private data members effectively a secret
>> of the library at object code level.  This means that any inline
>> functions that access those objects should be moved to a .cxx file
>> that belongs to the same shared library as the definition of the object.

Michael> Yep, like I already wrote, simply don't allow inline
Michael> accessors.  OK, I've done that for kdecore and konsole as a
Michael> test-program (there it's only the KApplication::KApp
Michael> variable, the static instance of the application object,
Michael> which makes such problems and from which I constructed my
Michael> last example).  Still no go.  The problem now seems to be
Michael> more involved (and it seems to be different problems), I
Michael> couldn't make a small example (I'll try further), so I give
Michael> the circumstances:

Michael> one problem is our template KStaticDeleter, which serves as a
Michael> type-safe delete mechanism for objects which are allocated in
Michael> such a way:

Michael> void f() {
Michael>   static Object* o = 0;
Michael>   if (!o) o = new Object;
Michael>   ...
Michael> }

Michael> Without further means o would be a program-exit memleak, so
Michael> we have a template (for type-safety) which collects all such
Michael> objects in a list, which then is cleaned at program exit.

Why do you store static pointers to objects, instead of static
objects?  As far as I understand the standard, static objects within a
function are created the first time the function is invoked, and
destroyed in reverse order of creation - allowing such objects to
depend on global variables in their constructors.

At first sight, that approach would let you get rid of the template
without introducing a memory leak.  Or am I missing something here?
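
A minimal sketch of the construction-on-first-call rule, for
illustration only.  The trace() helper and the letters it records are
hypothetical; the sketch says nothing about how this behaves inside
dlopen'ed libraries.

```cpp
#include <cassert>
#include <string>

// Records the order of events so it can be inspected afterwards.
std::string &trace()
{
    static std::string log;
    return log;
}

struct Object
{
    Object() { trace() += "C"; }    // "C" marks a constructor run
};

void f()
{
    trace() += "f";                 // "f" marks entry into the function
    static Object o;                // constructed on the first call only
    (void)o;
}
```

After two calls to f() the trace holds one "C", placed after the first
"f", since the object is built when control first passes its
declaration and is reused afterwards.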

Hildo

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-31  5:47                                             ` Hildo.Biersma
@ 2001-05-31  6:55                                               ` Michael Matz
  0 siblings, 0 replies; 54+ messages in thread
From: Michael Matz @ 2001-05-31  6:55 UTC (permalink / raw)
  To: Hildo.Biersma; +Cc: gcc

Hi,

On Thu, 31 May 2001 Hildo.Biersma@morganstanley.com wrote:

> >>>>> "Michael" == Michael Matz <matz@kde.org> writes:
> Michael> Without further means o would be a program-exit memleak, so
> Michael> we have a template (for type-safety) which collects all such
> Michael> objects in a list, which then is cleaned at program exit.
>
> Why do you store static pointers to objects,

Interesting isn't it?  As with everything, to which there exists a simple
solution, which isn't used, there is a reason, whose goodness is
proportional to the convoluteness of the solution taken instead ;-)

> instead of static objects?

You mean like this:
void f() {
  static Object o;
  ...
}
?  This would work if we weren't talking about libraries.  Libraries and
dlopen'ed DSOs screw everything up.  E.g. on Linux (I also saw this on
Solaris) something like the above segfaults in ld.so when it's
dlopen()'ed.  Under special circumstances only.  Sometimes you need more
libraries, sometimes loaded in special patterns, to see this behaviour.
What's interesting is that file-local static objects (instead of
function-local ones) usually work as expected, but they are constructed at
load time, not at function invocation time.  That's why KDE has a "static
objects are evil" approach ;-)

Besides: there are platforms, where static ctors/dtors are not supported
at all (g++ on HP-UX I believe), or for dlopen'ed DSOs.

Ciao,
Michael.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [I don't think it's off-topic at all] Linking speed for C++
  2001-05-31  4:45                                             ` Fergus Henderson
@ 2001-05-31  7:14                                               ` Michael Matz
  0 siblings, 0 replies; 54+ messages in thread
From: Michael Matz @ 2001-05-31  7:14 UTC (permalink / raw)
  To: Fergus Henderson; +Cc: gcc

Hi,

On Thu, 31 May 2001, Fergus Henderson wrote:
> I haven't seen the code for KStaticDeleter, but I'll bet that it
> refers to a global or static member variable which contains the list of
> objects to be deleted.

One could think so, yes.  But besides the fact that -fPIC for the program
didn't help, this is also not the case: it just holds some normal class
members, the adding to (and removing from) the global list is done through
static member functions of another class, and the private var which is
changed there (which indeed is static) isn't accessed from outside at all.
(As I said, objdump revealed no R_386_COPY relocs besides RTTI and vtables).

This is KStaticDeleter without comments:
class KStaticDeleterBase {
public:
    virtual void destructObject() = 0;
};
template<class type> class KStaticDeleter : public KStaticDeleterBase {
public:
    KStaticDeleter() { deleteit = 0; }
    type *setObject( type *obj, bool isArray = false) {
        deleteit = obj;
        array = isArray;
        if (obj)
            KGlobal::registerStaticDeleter(this);
        else
            KGlobal::unregisterStaticDeleter(this);
        return obj;
    }
    virtual void destructObject() {
        if (array)
           delete [] deleteit;
        else
           delete deleteit;
        deleteit = 0;
    }
    virtual ~KStaticDeleter() {
        KGlobal::unregisterStaticDeleter(this);
        destructObject();
    }
private:
    type *deleteit;
    bool array;
};

This should be sane.  You can see it here, if interested:
http://webcvs.kde.org/cgi-bin/cvsweb.cgi/kdelibs/kdecore/kglobal.cpp?rev=1.59&content-type=text/x-cvsweb-markup

> For static members of template classes, things may get more complicated,
> especially if the template class is instantiated with the same type from
> two different shared object files.

This is something which I also thought about: templates with the same
params, but instantiated in different DSOs.  Hmm, -Bsymbolic shouldn't make
a difference, as long as no static data is involved in the template.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 54+ messages in thread

end of thread, other threads:[~2001-05-31  7:14 UTC | newest]

Thread overview: 54+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2001-05-09  6:59 [Slightly OT] Linking speed biswapesh.chattopadhyay
2001-05-09  7:28 ` Paolo Carlini
2001-05-09  7:36 ` Andreas Jaeger
2001-05-09  7:47   ` Paolo Carlini
2001-05-09  9:20   ` [I don't think it's off-topic at all] Linking speed for C++ Joe Buck
2001-05-09 11:09     ` Richard Henderson
2001-05-09 11:39       ` Joe Buck
2001-05-09 11:45         ` Richard Henderson
2001-05-09 12:34           ` Waldo Bastian
2001-05-09 12:48             ` Richard Henderson
2001-05-09 14:06               ` Joe Buck
2001-05-09 13:54           ` Joe Buck
2001-05-09 14:21             ` Geoff Keating
2001-05-21 16:53               ` Marc Espie
2001-05-21 17:23                 ` Richard Henderson
2001-05-22  2:41                   ` Fergus Henderson
2001-05-22 12:00                     ` Joe Buck
2001-05-22 13:12                       ` Richard Henderson
2001-05-22 13:34                         ` Joe Buck
2001-05-29 13:58                           ` Michael Matz
2001-05-29 14:41                             ` Richard Henderson
2001-05-29 14:46                               ` Geoff Keating
2001-05-29 15:02                                 ` Richard Henderson
2001-05-29 15:24                                   ` Geoff Keating
2001-05-29 15:36                                     ` Richard Henderson
2001-05-29 15:44                                       ` Geoff Keating
2001-05-29 16:33                                       ` Michael Meissner
2001-05-29 14:53                               ` Joe Buck
2001-05-29 15:09                                 ` Richard Henderson
2001-05-29 15:21                               ` Michael Matz
2001-05-29 16:14                                 ` Richard Henderson
2001-05-29 16:58                                   ` Michael Matz
2001-05-29 17:15                                 ` Daniel Berlin
2001-05-29 20:55                                   ` H . J . Lu
2001-05-29 21:57                                     ` Daniel Berlin
2001-05-29 22:20                                       ` H . J . Lu
2001-05-30  9:27                                         ` Joe Buck
2001-05-31  4:02                                           ` Michael Matz
2001-05-31  4:45                                             ` Fergus Henderson
2001-05-31  7:14                                               ` Michael Matz
2001-05-31  5:47                                             ` Hildo.Biersma
2001-05-31  6:55                                               ` Michael Matz
2001-05-09 14:26             ` Jakub Jelinek
2001-05-09 14:38             ` Jeff Sturm
2001-05-09 15:04               ` Geoff Keating
2001-05-09 17:05                 ` Joe Buck
2001-05-09 15:49             ` Richard Henderson
2001-05-09 17:01               ` Joe Buck
2001-05-10 10:14                 ` Richard Henderson
2001-05-10 10:45                   ` Tom Tromey
2001-05-10 10:49                     ` Richard Henderson
2001-05-10 11:17                   ` Joe Buck
2001-05-10 16:23                     ` Richard Henderson
2001-05-10 16:42                       ` Joe Buck
