Using the PLT for vtables (or not)

* Using the PLT for vtables (or not)
@ 2003-12-05  2:09 Brian Ryner
  2003-12-05  2:52 ` Ian Lance Taylor
  2003-12-05  9:41 ` Jakub Jelinek
  0 siblings, 2 replies; 10+ messages in thread
From: Brian Ryner @ 2003-12-05  2:09 UTC (permalink / raw)
  To: gcc

Hi all,

On ELF, the vtable contains pointers to PLT entries.  I was wondering if 
anyone could comment on the reasons for constructing the vtable this 
way.  In particular, it seems like you could make virtual method calls 
more efficient (by avoiding a load and jump) if you put the vtable into 
writable memory, and initialized it with the real address of every 
method the first time an instance of a class is created.

Any thoughts?
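
For illustration only, the proposal above can be sketched in plain C++ as
a writable table of function pointers that is filled in the first time an
object is constructed.  This is a hand-rolled analogy and not what GCC
emits for a vtable; the names ShapeTable, Triangle and table_ready are
hypothetical.

```cpp
// Hypothetical "method" that the table will point to.
static int triangle_sides() { return 3; }

// Writable per-class dispatch table, empty when the program is loaded.
struct ShapeTable {
    int (*sides)();
    bool initialized;
};
static ShapeTable g_triangle_table = {nullptr, false};

class Triangle {
public:
    Triangle() {
        // Fill the table once, when the first instance is created.
        if (!g_triangle_table.initialized) {
            g_triangle_table.sides = &triangle_sides;
            g_triangle_table.initialized = true;
        }
    }
    // Dispatch through the table, as a virtual call would.
    int sides() const { return g_triangle_table.sides(); }
};

// Helper so a caller can observe whether the table has been filled.
bool table_ready() { return g_triangle_table.initialized; }
```

Before any Triangle exists, table_ready() returns false; after the first
construction it returns true and sides() dispatches through the table.
Whether the pointer stored in the table is a real function address or a
PLT stub is still up to the linker when the function lives in another
DSO.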

-- 
-Brian Ryner
bryner@brianryner.com

* Re: Using the PLT for vtables (or not)
  2003-12-05  2:09 Using the PLT for vtables (or not) Brian Ryner
@ 2003-12-05  2:52 ` Ian Lance Taylor
  2003-12-05  7:05   ` Brian Ryner
  2003-12-05  9:41 ` Jakub Jelinek
  1 sibling, 1 reply; 10+ messages in thread
From: Ian Lance Taylor @ 2003-12-05  2:52 UTC (permalink / raw)
  To: Brian Ryner; +Cc: gcc

Brian Ryner <bryner@brianryner.com> writes:

> On ELF, the vtable contains pointers to PLT entries.  I was wondering
> if anyone could comment on the reasons for constructing the vtable
> this way.  In particular, it seems like you could make virtual method
> calls more efficient (by avoiding a load and jump) if you put the
> vtable into writable memory, and initialized it with the real address
> of every method the first time an instance of a class is created.

ELF function calls to externally visible symbols are going to go
through the PLT unless you take special preventative measures.  What
you suggest isn't going to help: when you take the address of the
function to initialize the vtable, unless you take some special
action, you're going to get the address of the PLT.  So if it is OK to
take that special action, then you might as well take it when you
initialize the vtable at program start-up time.

The real question is whether it is OK to skip the PLT for vtable
entries.  Normally it is desirable to use the PLT because it permits
the main program to override calls made from within a shared library.
A vtable may be an exception to this, but that is not immediately
obvious to me.

Specifically, I could rewrite your paragraph to say something along
the lines of ``On ELF, function calls are made to the PLT.  This costs
a load or a jump for each function call.  If we know that the function
is defined in the same object, wouldn't it be better to call the
function directly?''

Ian

* Re: Using the PLT for vtables (or not)
  2003-12-05  2:52 ` Ian Lance Taylor
@ 2003-12-05  7:05   ` Brian Ryner
  2003-12-05  7:15     ` Ian Lance Taylor
  0 siblings, 1 reply; 10+ messages in thread
From: Brian Ryner @ 2003-12-05  7:05 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc

Ian Lance Taylor wrote:
> action, you're going to get the address of the PLT.  So if it is OK to
> take that special action, then you might as well take it when you
> initialize the vtable at program start-up time.
> 

What sort of vtable initialization happens at startup time?  The vtables 
currently reside in a read-only section of memory, don't they?

> The real question is whether it is OK to skip the PLT for vtable
> entries.  Normally it is desirable to use the PLT because it permits
> the main program to override calls made from within a shared library.
> A vtable may be an exception to this, but that is not immediately
> obvious to me.
> 

Well, consider that if I were to build up a struct or array of function 
pointers from a class's constructor, I would be getting the real 
function address, not a PLT stub.  This seems like the same thing 
conceptually as a vtable, and yet it would not have the PLT overhead.

At the least, I'd like to see an attribute or compiler flag which could 
specify this behavior.

-- 
-Brian Ryner
bryner@brianryner.com

* Re: Using the PLT for vtables (or not)
  2003-12-05  7:05   ` Brian Ryner
@ 2003-12-05  7:15     ` Ian Lance Taylor
  2003-12-05  8:52       ` Brian Ryner
  0 siblings, 1 reply; 10+ messages in thread
From: Ian Lance Taylor @ 2003-12-05  7:15 UTC (permalink / raw)
  To: Brian Ryner; +Cc: gcc

Brian Ryner <bryner@brianryner.com> writes:

> Ian Lance Taylor wrote:
> > action, you're going to get the address of the PLT.  So if it is OK to
> > take that special action, then you might as well take it when you
> > initialize the vtable at program start-up time.
> >
> 
> What sort of vtable initialization happens at startup time?  The
> vtables currently reside in a read-only section of memory, don't they?

Yes, that's what I meant, although I expressed it in a confusing
manner.  The initialization consists of loading the executable file
into memory.

> > The real question is whether it is OK to skip the PLT for vtable
> > entries.  Normally it is desirable to use the PLT because it permits
> > the main program to override calls made from within a shared library.
> > A vtable may be an exception to this, but that is not immediately
> > obvious to me.
> >
> 
> Well, consider that if I were to build up a struct or array of
> function pointers from a class's constructor, I would be getting the
> real function address, not a PLT stub.  This seems like the same thing
> conceptually as a vtable, and yet it would not have the PLT overhead.

Hmmm, I think I made an incorrect assumption here.  What target are
you talking about?  The behaviour I see for a vtable on
i686-pc-linux-gnu is the same as building an array of function
pointers by hand.  What actually happens depends upon whether you use
-fpic when compiling and whether you link the vtable into a shared
library.

What target are you using?  Can you give a complete example of what
you are talking about?

Ian

* Re: Using the PLT for vtables (or not)
  2003-12-05  7:15     ` Ian Lance Taylor
@ 2003-12-05  8:52       ` Brian Ryner
  2003-12-05 10:14         ` Jakub Jelinek
  0 siblings, 1 reply; 10+ messages in thread
From: Brian Ryner @ 2003-12-05  8:52 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc

[-- Attachment #1: Type: text/plain, Size: 1375 bytes --]

I am on i686-pc-linux-gnu.  I'm seeing some odd results in my testcase 
which have me a little confused; maybe someone can explain what's going 
on.  I have this class:

class A
{
   virtual void Foo();
};

and the definition of A::Foo() in a DSO.

If you unpack the attached testcase and build it, you'll get executables 
called test1 and test2.  In test1, the vtable slot for A::Foo points 
directly to A::Foo, not to a PLT stub.  In test2, it points to a PLT 
stub.  The only difference is that in test2.cpp, I have an (unused) 
function which calls A::Foo() directly, rather than through the vtable. 
  This is with gcc 3.3.2, binutils version is 2.13.90.0.18-9 (rpm).

Ian Lance Taylor wrote:
>>Well, consider that if I were to build up a struct or array of
>>function pointers from a class's constructor, I would be getting the
>>real function address, not a PLT stub.  This seems like the same thing
>>conceptually as a vtable, and yet it would not have the PLT overhead.
> 
> 
> Hmmm, I think I made an incorrect assumption here.  What target are
> you talking about?  The behaviour I see for a vtable on
> i686-pc-linux-gnu is the same as building an array of function
> pointers by hand.  What actually happens depends upon whether you use
> -fpic when compiling and whether you link the vtable into a shared
> library.
> 

-- 
-Brian Ryner
bryner@brianryner.com

[-- Attachment #2: virtual-test.tar.gz --]
[-- Type: application/gzip, Size: 530 bytes --]

* Re: Using the PLT for vtables (or not)
  2003-12-05  2:09 Using the PLT for vtables (or not) Brian Ryner
  2003-12-05  2:52 ` Ian Lance Taylor
@ 2003-12-05  9:41 ` Jakub Jelinek
  2003-12-05 10:42   ` Brian Ryner
  1 sibling, 1 reply; 10+ messages in thread
From: Jakub Jelinek @ 2003-12-05  9:41 UTC (permalink / raw)
  To: Brian Ryner; +Cc: gcc

On Thu, Dec 04, 2003 at 05:50:28PM -0800, Brian Ryner wrote:
> Hi all,
> 
> On ELF, the vtable contains pointers to PLT entries.  I was wondering if 
> anyone could comment on the reasons for constructing the vtable this 
> way.  In particular, it seems like you could make virtual method calls 
> more efficient (by avoiding a load and jump) if you put the vtable into 
> writable memory, and initialized it with the real address of every 
> method the first time an instance of a class is created.
> 
> Any thoughts?

Actually this is not completely true.
Pointers in vtables normally resolve to:
a) the actual method in the binary if it has been linked into the binary
b) to the PLT slot in the binary if it was not linked into the binary
   but is referenced from the binary
c) otherwise to the actual method in the first shared library in given
   symbol search scope which defines that method
For a) and c) you already get the direct call, for b) it is indirect
through PLT hop.

Until my very recent binutils patch (which has been for IA-32 only so far),
b) applied if a method's address was taken (e.g. stored into some vtable
in the binary, or taken in some other way) or if the method was called
from the binary.  With my patch, calls from the binary no longer count, so
b) happens only if the method's address is in a vtable or stored
somewhere, not if the method is merely called from the binary.
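
As a rough model of the rules above (a sketch of the description in this
message, not of the linker's actual code), the three cases and the effect
of the patch can be written as a small decision function.  The names
VtableTarget, MethodRefs and resolve_slot are hypothetical.

```cpp
// The three outcomes a), b) and c) listed above.
enum class VtableTarget { MethodInBinary, PltSlotInBinary, MethodInFirstDso };

// What the binary does with a given method.
struct MethodRefs {
    bool defined_in_binary;        // method was linked into the binary
    bool address_taken_in_binary;  // e.g. stored in a vtable in the binary
    bool called_from_binary;       // plain direct call from the binary
};

// linker_has_patch models the change described above: with the patch,
// a plain call from the binary no longer counts as a reference.
VtableTarget resolve_slot(const MethodRefs &m, bool linker_has_patch) {
    if (m.defined_in_binary)
        return VtableTarget::MethodInBinary;       // case a)
    bool referenced = m.address_taken_in_binary ||
                      (!linker_has_patch && m.called_from_binary);
    return referenced ? VtableTarget::PltSlotInBinary    // case b)
                      : VtableTarget::MethodInFirstDso;  // case c)
}
```

Under this model, a method that is only called directly from the binary
lands in case b) without the patch and in case c) with it.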

The addresses in vtables really don't need to point to PLT slots, as
a conforming program really doesn't care about the pointer values in
vtables - this is invisible to the program.

In many cases prelink even optimizes this out, which both eliminates
thousands of unneeded prelink conflicts and speeds up the prelinked
application slightly at runtime, not just at startup time[*].

If you can show there are many cases where b) still happens with current CVS
binutils (i.e. some vtable in the binary contains the address of a method
defined only in shared libraries, so vtables in shared libraries still
unnecessarily resolve to the PLT slot in the binary), maybe a new relocation
would be handy.  As a quick hack, the linker could be changed so that it
does not treat R_386_32 relocations located in .gnu.linkonce.d._ZTV*
sections as taking an address, but handles them like R_386_PC32 relocs.

Or were you talking about vtables in the binary going through the PLT
if a method they reference is not linked into the binary?
In that case I think it is more questionable whether changing the current
state is desirable.  Avoiding the PLT slot would mean huge growth of the
.rel*.dyn section (the R_*J{,U}MP_SLOT relocation would often have to stay
if the method is also called directly, and you would newly need one R_*_32
per pointer in the vtable), slower startup unless prelinked, and a vtable
that is unnecessarily writable, so per-process (i.e. non-shareable)
memory consumption would go up.

[*] See the C++ optimization chapter of my prelink paper draft at
    ftp://people.redhat.com/jakub/prelink/prelink.pdf.

	Jakub

* Re: Using the PLT for vtables (or not)
  2003-12-05  8:52       ` Brian Ryner
@ 2003-12-05 10:14         ` Jakub Jelinek
  0 siblings, 0 replies; 10+ messages in thread
From: Jakub Jelinek @ 2003-12-05 10:14 UTC (permalink / raw)
  To: Brian Ryner; +Cc: Ian Lance Taylor, gcc

On Fri, Dec 05, 2003 at 12:03:58AM -0800, Brian Ryner wrote:
> I am on i686-pc-linux-gnu.  I'm seeing some odd results in my testcase 
> which have me a little confused, maybe someone can explain what's going 
> on.  I have this class:
> 
> class A
> {
>   virtual void Foo();
> };
> 
> 
> and the definition of A::Foo() in a DSO.
> 
> If you unpack the attached testcase and build it, you'll get executables 
> called test1 and test2.  In test1, the vtable slot for A::Foo points 
> directly to A::Foo, not to a PLT stub.  In test2, it points to a PLT 
> stub.  The only difference is that in test2.cpp, I have an (unused) 
> function which calls A::Foo() directly, rather than through the vtable. 
>  This is with gcc 3.3.2, binutils version is 2.13.90.0.18-9 (rpm).

In the testcase you posted, the binary has a COPY relocation for the
virtual table, so what matters is how the vtable in the shared library is
resolved.  For your testcase CVS binutils will already do what you're
looking for.

	Jakub

* Re: Using the PLT for vtables (or not)
  2003-12-05  9:41 ` Jakub Jelinek
@ 2003-12-05 10:42   ` Brian Ryner
  2003-12-05 11:44     ` Jakub Jelinek
  0 siblings, 1 reply; 10+ messages in thread
From: Brian Ryner @ 2003-12-05 10:42 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc

Jakub Jelinek wrote:
> Actually this is not completely true.
> Pointers in vtables normally resolve to:
> a) the actual method in the binary if it has been linked into the binary
> b) to the PLT slot in the binary if it was not linked into the binary
>    but is referenced from the binary

Just trying to clarify a bit -- by "referenced from the binary", do you 
mean:

class A { virtual void Foo(); };
A a;
a.Foo();

or do you mean calling through the vtable:

A *a;
a->Foo();


> Until my very recent binutils patch (which has been for IA-32 only so far),
> b) means if a method address is taken (e.g. its address stored into some
> vtable in the binary or its address taken in some other way) or if it
> has been called from the binary.  With my patch calling from the binary
> doesn't count, so b) happens only if method address is in a vtable or
> stored somewhere, but not if just called from the binary.

Can you give an example of a case where the method address is stored 
into a vtable in the binary?  Would this happen if the binary subclasses 
the class in the DSO with the virtual method (and does not override all 
of the virtual methods)?  Are there other examples?

> Or were you talking about vtables in the binary going through PLT
> if a method they are referencing is not linked into the binary?

Is this the same sort of scenario I mentioned above with a subclass in 
the binary?

One more question -- does any of this change if the "binary" is a DSO 
which links against the original DSO?

Thanks for your detailed reply.

-- 
-Brian Ryner
bryner@brianryner.com

* Re: Using the PLT for vtables (or not)
  2003-12-05 10:42   ` Brian Ryner
@ 2003-12-05 11:44     ` Jakub Jelinek
  0 siblings, 0 replies; 10+ messages in thread
From: Jakub Jelinek @ 2003-12-05 11:44 UTC (permalink / raw)
  To: Brian Ryner; +Cc: gcc

On Fri, Dec 05, 2003 at 02:14:36AM -0800, Brian Ryner wrote:
> Jakub Jelinek wrote:
> >Actually this is not completely true.
> >Pointers in vtables normally resolve to:
> >a) the actual method in the binary if it has been linked into the binary
> >b) to the PLT slot in the binary if it was not linked into the binary
> >   but is referenced from the binary
> 
> Just trying to clarify a bit -- by "referenced from the binary", do you 
> mean:
> 
> class A { virtual void Foo(); };
> A a;
> a.Foo();

This one is a call to _ZN1A3FooEv (on IA-32 R_386_PC32 reloc).
With older binutils, this acts as any other reference in the binary
and thus, although _ZN1A3FooEv is not defined in the binary, the
SHN_UNDEF _ZN1A3FooEv symbol has non-zero st_value pointing to the
PLT slot.  With CVS binutils this doesn't happen.

> or do you mean calling through the vtable:
> 
> A *a;
> a->Foo();

This doesn't count as any reference from the binary to _ZN1A3FooEv.
The compiler doesn't output the A's virtual table (_ZTV1A) into test1.o
at all.  It happens to be in the binary as COPY reloc because of the
_ZTV1A pointer being used, but e.g. in the shared library you'd just
have a _ZTV1A relocation in GOT.

By reference from the binary other than call I mean something like:

#include "foo.h"

void func(A *a)
{
  a->Foo();
}

class B : public A {};

int main(int argc, char **argv)
{
  A a;
  B b;
  func(&a);
  func(&b);
  return 0;
}

where although A's virtual table is not output into test3.o, B's vtable
(_ZTV1B) is, and it contains _ZN1A3FooEv.
In this case, the _ZTV1A virtual table (in the shared library and, after the
copy relocation is resolved, also in the binary) will unnecessarily point to
the _ZN1A3FooEv PLT slot.  This could be fixed by adding a new relocation
(R_386_32V or whatever) or by hacking up the linker to treat
.gnu.linkonce.[dr]._ZTV* relocations specially (probably not a good idea).

The other thing I talked about is whether _ZTV1B's pointer to _ZN1A3FooEv
should resolve to the PLT slot in the binary or whether a new dynamic
relocation should be added.  In the exact case above, adding a dynamic
relocation for _ZTV1B's _ZN1A3FooEv would mean the PLT slot can go away,
and the R_386_JMP_SLOT reloc as well, but as soon as you add func2, as in
your test2 case, the PLT slot would need to stay.
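
Since foo.h is not shown in the thread, here is a single-file sketch of
the same class layout, assuming foo.h declares A with a public virtual
Foo.  For the sake of observing the dispatch, Foo returns an int here
(an assumption; in the testcase it returns void), and A::Foo is defined
in the same file rather than in a DSO, so no PLT is involved.

```cpp
// Hypothetical stand-in for foo.h.
class A {
public:
    virtual ~A() {}
    virtual int Foo();
};

// In the thread this definition lives in the shared library.
int A::Foo() { return 1; }

// B does not override Foo, so the Foo slot in B's vtable refers to
// A::Foo.  That is why _ZTV1B contains a reference to _ZN1A3FooEv.
class B : public A {};

// A call through the vtable; it does not name A::Foo directly.
int func(A *a) { return a->Foo(); }
```

Calling func on an A and on a B both end up in A::Foo, which is the
reference from B's vtable discussed above.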

> One more question -- does any of this change if the "binary" is a DSO 
> which links against the original DSO?

This doesn't change with -fpic alone.  But if the binary is a DSO
(whether it is a PIE, or you have an echo '' | g++ -xc - ... binary and
link the testX.cpp into a new shared library), then there is no such thing
as jumping through the PLT unnecessarily.

	Jakub

* Re: Using the PLT for vtables (or not)
@ 2003-12-05  7:11 Michael Elizabeth Chastain
  0 siblings, 0 replies; 10+ messages in thread
From: Michael Elizabeth Chastain @ 2003-12-05  7:11 UTC (permalink / raw)
  To: bryner; +Cc: gcc

Brian Ryner wrote:
> Well, consider that if I were to build up a struct or array of
> function pointers from a class's constructor, I would be getting
> the real function address, not a PLT stub.

Actually, you would get the address of the PLT stub.  Try it.

  #include <stdio.h>
  int main ()
  {
    printf ("%p\n", (void *) &printf);
    return 0;
  }

This prints a PLT stub address when I run it on native i686-pc-linux-gnu
with a normal shared glibc.

As far as gcc is concerned, it just puts out "address of printf".
vtables are just the same thing.  (Try compiling with and without -fpic
and see if there is any difference in the generated assembly code
for your vtables).

It's the linker's job to decide whether "address of printf" resolves to
a PLT stub or to an actual symbol.  Up until about a week ago, it always
resolved to a stub.  Now the linker has an optimization where it can
resolve "address of ..." directly to the symbol and bypass the PLT in
some cases, which is the optimization that you are asking for.

The trouble with this optimization is that it confuses gdb.

Also, this optimization is not suitable in all cases because if someone
takes "&foo" as a data value (as opposed to just "call foo"), and then
compares that data value against "&foo" from somewhere else, they need
to get the same value all the time.  The only way to do that is to make
"&foo" be the address of the PLT stub all the time.
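
A small sketch of the requirement described above: every place that
takes "&foo" has to see the same value.  Here both sites are in one
file, so the check holds trivially; in the real case the two sites
would live in different modules (say, the executable and a shared
library).  The function names are hypothetical.

```cpp
int foo() { return 42; }

typedef int (*Fn)();

// Two separate places that take the address of foo as a data value.
Fn address_from_site_a() { return &foo; }
Fn address_from_site_b() { return &foo; }

// The language requires these to compare equal, wherever they were taken.
bool same_address() {
    return address_from_site_a() == address_from_site_b();
}
```

If one module resolved "&foo" to the function itself and another to a
PLT stub, same_address() would return false.  That is the situation the
linker has to avoid.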

I don't know if references from a vtable are one of those cases that can
bypass the PLT.  And even if they can, it might be a bad idea.  It would
be hell to load a shared library and resolve thousands of vtable references
at shlib load time -- that might slow down program initialization.
Better to spread it out with a 1-instruction hit on each method call.

(I bet this has been discussed in the past on the binutils list but
I am just learning about binutils).

Michael C
