New vtable ABI (was Re: Strange behaviour in C++...)

public inbox for gcc@gcc.gnu.org

* New vtable ABI (was Re: Strange behaviour in C++...)
@ 1999-08-26 15:59 Jason Merrill
  1999-08-26 16:38 ` Jason Merrill
                   ` (3 more replies)
  0 siblings, 4 replies; 24+ messages in thread
From: Jason Merrill @ 1999-08-26 15:59 UTC (permalink / raw)
  To: Joe Buck; +Cc: David Edelsohn, law, mrs, oliva, cj, gcc-patches, gcc

I should point out here that the proposed mechanism is not the classic ARM
vtable.  In fact, it doesn't have any more space overhead for single
inheritance code than thunks, and may have less for multiple inheritance
code.  It works like this:

Rather than per-function offsets, we have per-target type offsets.  These
offsets (if any) are stored at a negative index from the vptr.  When a
derived class D overrides a virtual function F from a base class B, if no
previously allocated offset slot can be reused, we add one to the
beginning of the vtable(s) of the closest base(s) which are non-virtually
derived from B.  In the case of non-virtual inheritance, that would be D's
vtable; in simple virtual inheritance, it would be B's.  The vtables are
written out in one large block, laid out like an object of the class, so if
B is a non-virtual base of D, we can find the D vtable from the B vptr.

D::f then receives a B*, loads the offset from the vtable, and makes the
adjustment to get a D*.  The plan is to also have a non-adjusting vtable
entry in D's vtable, so we don't have to do two adjustments to call D::f
with a D*; the implementation of this is up to the compiler.  I expect that
for g++, we will do the adjustment in a thunk which just falls into the
main function.
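
[Editor's sketch] The mechanism described above can be modelled by hand in
C++. This is only an illustration under stated assumptions, not g++'s
actual layout: the names Slot, D_vtable_block, D_f_from_B and make_d are
hypothetical, each slot carries both a function pointer and an offset for
simplicity (a real vtable would use one word per slot), and the "fall
through" from the adjusting entry into the main function is modelled as an
ordinary call.

```cpp
#include <cassert>
#include <cstddef>

using Fn = int (*)(void* self);

// One vtable slot. Hypothetical: real slots would be a single word each.
struct Slot {
    Fn fn;                  // function entry point, or nullptr
    std::ptrdiff_t offset;  // per-target-type offset, or 0
};

struct B { const Slot* vptr; int b_data; };        // base sub-object
struct D { const Slot* vptr; int d_data; B b; };   // B sits at a non-zero offset in D

// Main body of D::f: expects a pointer to the complete D object.
int D_f(void* self) { return static_cast<D*>(self)->d_data; }

// Adjusting entry for D::f: receives a B*, finds D's vtable from the B vptr
// (the vtables are emitted as one block), loads the offset stored at a
// negative index from D's vtable, and converts the B* into a D*.
int D_f_from_B(void* self) {
    B* b = static_cast<B*>(self);
    const Slot* d_vtable = b->vptr - 1;           // fixed distance inside the block
    std::ptrdiff_t adjust = d_vtable[-1].offset;  // offset slot at index -1
    return D_f(reinterpret_cast<char*>(b) + adjust);
}

// All vtables for D, written out as one block.
const Slot D_vtable_block[] = {
    { nullptr, -static_cast<std::ptrdiff_t>(offsetof(D, b)) },  // offset slot: B-in-D to D
    { D_f, 0 },         // D's vtable, index 0: non-adjusting entry
    { D_f_from_B, 0 },  // B-in-D's vtable, index 0: adjusting entry
};

// Build a D whose vptrs point into the block above.
D make_d(int d_data) {
    D d{};
    d.vptr = &D_vtable_block[1];
    d.d_data = d_data;
    d.b.vptr = &D_vtable_block[2];
    return d;
}

int call_f_via_D(D* d) { return d->vptr[0].fn(d); }  // no adjustment needed
int call_f_via_B(B* b) { return b->vptr[0].fn(b); }  // adjustment made at the target
```

Calling call_f_via_B(&d.b) and call_f_via_D(&d) on the same object reaches
the same main body; only the first goes through the offset load.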

The performance problems with classic thunks occur when the thunk is
not close enough to the function it jumps to for a pc-relative branch.
This cannot be avoided in certain cases of virtual inheritance, where a
derived class must whip up a thunk for a new adjustment to a method it
doesn't override.

In this case, we will only ever have one thunk per function, so we don't
even have to jump.

Jason

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: New vtable ABI (was Re: Strange behaviour in C++...)
  1999-08-26 15:59 New vtable ABI (was Re: Strange behaviour in C++...) Jason Merrill
@ 1999-08-26 16:38 ` Jason Merrill
  1999-08-31 23:20   ` Jason Merrill
  1999-08-26 17:05 ` Joe Buck
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 24+ messages in thread
From: Jason Merrill @ 1999-08-26 16:38 UTC (permalink / raw)
  To: Joe Buck; +Cc: David Edelsohn, law, mrs, oliva, cj, gcc-patches, gcc

>>>>> Jason Merrill <jason@cygnus.com> writes:

 > In this case, we will only ever have one thunk per function

Except in the case of covariant returns, I should say, where we will have
one per return adjustment.  But we know all necessary adjustments at the
point of definition of the function, so they can all be within pc-relative
branch range.
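
[Editor's sketch] For readers unfamiliar with why a covariant return needs
its own adjustment, here is a small standard C++ example. The class names
(Shape, Tag, Circle) and the helper clone_through_base are hypothetical.
Because Shape is not the first base of Circle, a call made through a Shape*
has to hand back a pointer to the Shape sub-object of the new Circle, while
a call made through a Circle* hands back the Circle itself.

```cpp
#include <cassert>

struct Shape {
    virtual ~Shape() {}
    virtual Shape* clone() const = 0;
};

struct Tag {
    virtual ~Tag() {}
    int id = 0;
};

// Shape is the second base, so it typically lives at a non-zero offset.
struct Circle : Tag, Shape {
    int radius = 1;
    // Covariant return type: Circle* instead of Shape*.
    Circle* clone() const override { return new Circle(*this); }
};

// Calls clone() knowing only the base type; the result is a Shape*,
// so the implementation must adjust the Circle* on the way out.
Shape* clone_through_base(const Shape& s) { return s.clone(); }
```

All the adjustments needed here are known where Circle::clone is defined,
which is the point made above about keeping them within branch range.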

Jason

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: New vtable ABI (was Re: Strange behaviour in C++...)
  1999-08-26 15:59 New vtable ABI (was Re: Strange behaviour in C++...) Jason Merrill
  1999-08-26 16:38 ` Jason Merrill
@ 1999-08-26 17:05 ` Joe Buck
  1999-08-26 17:13   ` Jason Merrill
  1999-08-31 23:20   ` Joe Buck
  1999-08-31 23:20 ` Jason Merrill
  1999-09-07 18:42 ` Per Bothner
  3 siblings, 2 replies; 24+ messages in thread
From: Joe Buck @ 1999-08-26 17:05 UTC (permalink / raw)
  To: Jason Merrill; +Cc: jbuck, dje, law, mrs, oliva, cj, gcc-patches, gcc

Jason writes:

> I should point out here that the proposed mechanism is not the classic ARM
> vtable.  In fact, it doesn't have any more space overhead for single
> inheritance code than thunks, and may have less for multiple inheritance
> code.  It works like this:

[ details deleted ]

Thanks, Jason, you've removed my objections.  My problems were based on
the classic scheme, where you have to store offsets for each vtable slot
(space overhead) and add the offset during the call (time overhead).

I don't quite grasp all of the details on first reading; diagrams would
help, but it sounds elegant.

(But then it seems that we'll at least temporarily have *three* vtable
schemes, sigh).

> Rather than per-function offsets, we have per-target type offsets.  These
> offsets (if any) are stored at a negative index from the vptr.

Hmm ... RTTI is currently there, but RTTI takes a fixed # of slots
so it's no problem.

Joe

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: New vtable ABI (was Re: Strange behaviour in C++...)
  1999-08-26 17:05 ` Joe Buck
@ 1999-08-26 17:13   ` Jason Merrill
  1999-08-26 17:26     ` Mumit Khan
  1999-08-31 23:20     ` Jason Merrill
  1999-08-31 23:20   ` Joe Buck
  1 sibling, 2 replies; 24+ messages in thread
From: Jason Merrill @ 1999-08-26 17:13 UTC (permalink / raw)
  To: Joe Buck; +Cc: dje, law, mrs, oliva, cj, gcc-patches, gcc

>>>>> Joe Buck <jbuck@synopsys.com> writes:

 >> Rather than per-function offsets, we have per-target type offsets.  These
 >> offsets (if any) are stored at a negative index from the vptr.

 > Hmm ... RTTI is currently there, but RTTI takes a fixed # of slots
 > so it's no problem.

Actually, RTTI is currently at indices 0 and 1.  It will probably make
sense to move them to negative offsets in the new model, since COM wants
the functions to start at offset 0.

Jason

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: New vtable ABI (was Re: Strange behaviour in C++...)
  1999-08-26 17:13   ` Jason Merrill
@ 1999-08-26 17:26     ` Mumit Khan
  1999-08-31 23:20       ` Mumit Khan
  1999-08-31 23:20     ` Jason Merrill
  1 sibling, 1 reply; 24+ messages in thread
From: Mumit Khan @ 1999-08-26 17:26 UTC (permalink / raw)
  To: Jason Merrill; +Cc: Joe Buck, dje, law, mrs, oliva, cj, gcc-patches, gcc

Jason Merrill <jason@cygnus.com> writes:
> 
> Actually, RTTI is currently at indices 0 and 1.  It will probably make
> sense to move them to negative offsets in the new model, since COM wants
> the functions to start at offset 0.

Makes very good sense. It really would stink to have to disable RTTI when 
using COM and its variants.

Regards,
Mumit

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: New vtable ABI (was Re: Strange behaviour in C++...)
  1999-08-26 15:59 New vtable ABI (was Re: Strange behaviour in C++...) Jason Merrill
                   ` (2 preceding siblings ...)
  1999-08-31 23:20 ` Jason Merrill
@ 1999-09-07 18:42 ` Per Bothner
  1999-09-07 23:14   ` Martin v. Loewis
  1999-09-30 18:02   ` Per Bothner
  3 siblings, 2 replies; 24+ messages in thread
From: Per Bothner @ 1999-09-07 18:42 UTC (permalink / raw)
  To: gcc

Jason Merrill <jason@cygnus.com> writes:

> Rather than per-function offsets, we have per-target type offsets.
> [very terse description elided]

I confess to not being able to follow more than the very basic idea;
I guess I'm rather rusty on g++ vtable management.  However, it does
not seem like this will do much for Java, since the big deal is
how to handle the offsets, while for Java objects the offsets are
always zero.  Still, it is essential for Gcj that C++ and Java
have compatible ABIs, and it would be nice that CNI (access of
Java objects from C++) work for interface inheritance as well.

So my plea when designing a new ABI:  Keep in mind the needs for
Java.  Specifically, we need an ABI that handles fast "interface"
calls.  I.e. we need constant-time handling of virtual inheritance
of pure abstract base classes with no instance fields, at least
in the case that the base classes have the "Java property" specified.
Earlier, I posted my idea for how to do this. Something like that
(or better) needs to be able to co-exist with the new C++ ABI.
(It does not follow that Java interface support should be *part*
of the C++ ABI spec;  however, it should be part of the Gcc ABI.)
-- 
	--Per Bothner
bothner@pacbell.net  per@bothner.com   http://www.bothner.com/~per/

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: New vtable ABI (was Re: Strange behaviour in C++...)
  1999-09-07 18:42 ` Per Bothner
@ 1999-09-07 23:14   ` Martin v. Loewis
  1999-09-07 23:53     ` Per Bothner
  1999-09-30 18:02     ` Martin v. Loewis
  1999-09-30 18:02   ` Per Bothner
  1 sibling, 2 replies; 24+ messages in thread
From: Martin v. Loewis @ 1999-09-07 23:14 UTC (permalink / raw)
  To: per; +Cc: gcc


> I confess to not being able to follow more than the very basic idea;
> I guess I'm rather rusty on g++ vtable management.  However, it does
> not seem like this will do much for Java, since the big deal is
> how to handle the offsets, while for Java objects the offsets are
> always zero.

Is that mandatory? The naïve way of implementing interfaces is to take
the usual C++ approach: interfaces are virtual classes with pure
virtual methods. So

interface FooBar{
  public void x();
}
class Foo extends Bar implements FooBar{public void x(){...}}

becomes

class FooBar{
public:
  virtual void x()=0;
};

class Foo: public Bar, public virtual FooBar{
public:
  void x(){...}
};

Does that meet all requirements? If so, you will have adjustments when
you have a FooBar pointer and invoke x. The reason is that Foo has two
embedded vtbl pointers: the one of Foo, and the one of FooBar. A
FooBar pointer is represented as a pointer to the location of the
FooBar vtbl pointer inside the Foo object.
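
[Editor's sketch] The relationship between a FooBar pointer and the
enclosing Foo object can be observed from standard C++. This is a minimal
example, assuming a hypothetical Bar with one field and an x() that returns
int (rather than void) so the call is observable; complete_object is a
hypothetical helper. Whether the two addresses actually differ depends on
the ABI, so the code only relies on what the language guarantees.

```cpp
#include <cassert>

class FooBar {                  // plays the role of the Java interface
public:
    virtual ~FooBar() {}
    virtual int x() = 0;
};

class Bar {                     // hypothetical superclass with one field
public:
    virtual ~Bar() {}
    int field = 0;
};

class Foo : public Bar, public virtual FooBar {
public:
    int x() override { return 42; }
};

// dynamic_cast to void* yields the address of the most derived object,
// whatever sub-object the interface pointer happens to point at.
const void* complete_object(const FooBar* p) {
    return dynamic_cast<const void*>(p);
}
```

Given Foo foo; FooBar* p = &foo;, the call p->x() reaches Foo::x, and
complete_object(p) equals the address of foo even when p itself points
somewhere inside it.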

In the current ABI proposal (by Jason), the FooBar vtbl is represented
as

-1: RTTI (at negative offset for COM compatibility)
0:  x

Inside Foo, Foo::x has two entry points:

x_for_FooBar: this = this + this->_vtbl[1] ;fallthrough
x__3Foo:      ...

So, the FooBar-in-Foo vtbl has a different layout:

-1: RTTI
0:  x_for_FooBar
1:  Adjust FooBar to Foo (say, -16)

An interface call would then be a normal virtual call: Using the
pointer you have, retrieve the vtbl (at offset 0), retrieve the method
pointer, and call it. The adjustment to the full object is made at the
target.
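
[Editor's sketch] The FooBar-in-Foo vtbl layout listed above (RTTI at -1,
x at 0, adjustment at 1) can be imitated with plain structs. This is an
illustration only: Slot, FooBarPart, foobar_in_foo_storage and call_x are
hypothetical names, each slot carries all three kinds of payload for
simplicity, Foo's own vtbl pointer is left out, and the adjustment is
computed with offsetof instead of the "say, -16" above.

```cpp
#include <cassert>
#include <cstddef>

struct FooBarPart;                          // the FooBar sub-object inside a Foo
using Method = int (*)(FooBarPart* self);

struct Slot {                               // hypothetical: one payload is used per slot
    const char* rtti;
    Method method;
    std::ptrdiff_t adjust;
};

struct FooBarPart { const Slot* _vtbl; };
struct Foo { int value; FooBarPart foobar; };   // Foo's own vtbl pointer omitted

// Main body, named after the x__3Foo entry point above.
int x__3Foo(Foo* self) { return self->value; }

// Adjusting entry: this = this + this->_vtbl[1], then continue in the main body.
int x_for_FooBar(FooBarPart* self) {
    std::ptrdiff_t adjust = self->_vtbl[1].adjust;
    Foo* full = reinterpret_cast<Foo*>(reinterpret_cast<char*>(self) + adjust);
    return x__3Foo(full);
}

const Slot foobar_in_foo_storage[] = {
    { "Foo", nullptr, 0 },                  // index -1: RTTI
    { nullptr, x_for_FooBar, 0 },           // index  0: x
    { nullptr, nullptr, -static_cast<std::ptrdiff_t>(offsetof(Foo, foobar)) },  // index 1
};

// The vtbl pointer stored in objects points at index 0, not at the RTTI slot,
// so the function entries start at offset 0.
const Slot* const foobar_in_foo_vtbl = &foobar_in_foo_storage[1];

// An interface call: fetch the vtbl, fetch the method, call it.
int call_x(FooBarPart* p) { return p->_vtbl[0].method(p); }
```

The caller never adjusts the pointer; call_x passes the FooBar pointer as
it is and x_for_FooBar recovers the full object.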

Would that work for Java?

Martin

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: New vtable ABI (was Re: Strange behaviour in C++...)
  1999-09-07 23:14   ` Martin v. Loewis
@ 1999-09-07 23:53     ` Per Bothner
  1999-09-08  0:49       ` Martin v. Loewis
  1999-09-30 18:02       ` Per Bothner
  1999-09-30 18:02     ` Martin v. Loewis
  1 sibling, 2 replies; 24+ messages in thread
From: Per Bothner @ 1999-09-07 23:53 UTC (permalink / raw)
  To: gcc


"Martin v. Loewis" <martin@mira.isdn.cs.tu-berlin.de> writes:

>> while for Java objects the offsets are always zero.
> 
> Is that mandatory?

Yes, keeping the offsets zero is required by the Java Virtual Machine
specification.  (And the Java Virtual machine specification is part
of the Java Language Specification, because a Java program can
dynamically create new classes.)

Of course we could always invoke an "as if rule" - if conforming
programs can't tell the difference, then we are free to implement
things differently.

But certainly there is a wide-spread *presumption* in Java that
neither up-casts nor down-casts change a pointer.

Furthermore, it is not possible to implement a "simple" Java
interpreter if this assumption is violated.  That is because there is
no instruction in the JVM to convert a pointer whose type is a
derived class to a pointer to one of its base classes or interfaces;
this operation is assumed to be a no-op.  The only option would be
a "complex" interpreter, for example a Just-In-Time compiler (JIT),
or something else that analyzes the whole method.  (Such analysis
is done anyway by a bytecode verifier, for security purposes,
but traditionally only for "untrusted" code.)

> The naïve way of implementing interfaces is to take
> the usual C++ approach: interfaces are virtual classes with pure
> virtual methods.

Yes, that is what I have been saying.  My point is that the traditional
way we implement "virtual classes with pure virtual methods and no
fields" in C++ is not the only way to do it.  For Java, it is almost
certainly the *wrong* way to do it.  For C++ too it might be
worth considering alternatives for this quite important special case.
-- 
	--Per Bothner
bothner@pacbell.net  per@bothner.com   http://www.bothner.com/~per/

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: New vtable ABI (was Re: Strange behaviour in C++...)
  1999-09-07 23:53     ` Per Bothner
@ 1999-09-08  0:49       ` Martin v. Loewis
  1999-09-08  6:33         ` Per Bothner
  1999-09-30 18:02         ` Martin v. Loewis
  1999-09-30 18:02       ` Per Bothner
  1 sibling, 2 replies; 24+ messages in thread
From: Martin v. Loewis @ 1999-09-08  0:49 UTC (permalink / raw)
  To: per; +Cc: gcc

> Yes, keeping the offsets zero is required by the Java Virtual Machine
> specification.  (And the Java Virtual machine specification is part
> of the Java Language Specification, because a Java program can
> dynamically create new classes.)

Hmm. Could you point me to the exact quote in the JVM spec?

> But certainly there is a wide-spread *presumption* in Java that
> neither up-casts nor down-casts change a pointer.

Well, for Objects, that property is preserved. I'm talking about casts
to and from interfaces, here.

> Furthermore, it is not possible to implement a "simple" Java
> interpreter if this assumption is violated.  That is because there is
> no instruction in the JVM to convert a pointer whose type is a
> derived class to a pointer to one of its base classes or interfaces;
> this operation is assumed to be a no-op.  The only option would be
> a "complex" interpreter, for example a Just-In-Time compiler (JIT),
> or something else that analyzes the whole method.

I assume the 'simple' Java interpreter would iterate over a list of
implemented interfaces when making a call-to-interface. Well, there is
nothing wrong with keeping such a list as part of the embedded class
description. This is just not as efficient as it could be.

When passing an object to CNI code expecting an interface, you'd have
to cast. However, I think you could preserve the desired
single-pointer approach for the byte code interpreter, and still use
the virtual-base approach in native code.

> Yes, that is what I have been saying.  My point is that the traditional
> way we implement "virtual classes with pure virtual methods and no
> fields" in C++ is not the only way to do it.  For Java, it is almost
> certainly the *wrong* way to do it.  For C++ too it might be
> worth considering alternatives for this quite important special case.

For C++, the primary goals would be speed-efficiency and compact
representation. I don't think any of these alternatives would be more
efficient than vtable calls.

Regards,
Martin

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: New vtable ABI (was Re: Strange behaviour in C++...)
  1999-09-08  0:49       ` Martin v. Loewis
@ 1999-09-08  6:33         ` Per Bothner
  1999-09-09  1:39           ` Martin v. Loewis
  1999-09-30 18:02           ` Per Bothner
  1999-09-30 18:02         ` Martin v. Loewis
  1 sibling, 2 replies; 24+ messages in thread
From: Per Bothner @ 1999-09-08  6:33 UTC (permalink / raw)
  To: gcc

"Martin v. Loewis" <martin@mira.isdn.cs.tu-berlin.de> writes:

> > Yes, keeping the offsets zero is required by the Java Virtual Machine
> > specification.
> 
> Hmm. Could you point me to the exact quote in the JVM spec?

All over.  For example the checkcast specification.  And it is implicit
in the description of the verifier, and the fact that there is no
instruction to "widen" (up-cast?) a pointer.

> > But certainly there is a wide-spread *presumption* in Java that
> > neither up-casts nor down-casts change a pointer.
> 
> Well, for Objects, that property is preserved. I'm talking about casts
> to and from interfaces, here.

What do you think I'm talking about?

> > Furthermore, it is not possible to implement a "simple" Java
> > interpreter if this assumption is violated.  That is because there is
> > no instruction in the JVM to convert a pointer whose type is a
> > derived class to a pointer to one of its base classes or interfaces;
> > this operation is assumed to be a no-op.  The only option would be
> > a "complex" interpreter, for example a Just-In-Time compiler (JIT),
> > or something else that analyzes the whole method.
> 
> I assume the 'simple' Java interpreter would iterate over a list of
> implemented interfaces when making a call-to-interface.

In this context, all I mean by "simple" interpreter is one that can
execute a bytecode operation without first having analyzed the entire
method.  Fast method calls can still be compatible with that.  In Gcj
we use a slow search, but the traditional implementation uses a cache.

> When passing an object to CNI code expecting an interface, you'd have
> to cast. However, I think you could preserve the desired
> single-pointer approach for the byte code interpreter, and still use
> the virtual-base approach in native code.

No, I don't think so.  A pointer is a pointer.  If we have an array of
interface pointers, we can't represent that differently depending on
whether we pass it to a compiled method or an interpreted method.

> For C++, the primary goals would be speed-efficiency and compact
> representation. I don't think any of these alternatives would be more
> efficient than vtable calls.

My proposed design is *more* space-efficient than the traditional
C++ implementation, because it does not use extra per-object vtable
pointers.  My guess is that it would be a couple of instructions slower,
but I'm not sure.  It would still be constant-time, and in-line.

I put my design up on http://www.bothner.com/~per/jv-inherit.txt
-- 
	--Per Bothner
bothner@pacbell.net  per@bothner.com   http://www.bothner.com/~per/

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: New vtable ABI (was Re: Strange behaviour in C++...)
  1999-09-08  6:33         ` Per Bothner
@ 1999-09-09  1:39           ` Martin v. Loewis
  1999-09-09  9:28             ` Per Bothner
  1999-09-30 18:02             ` Martin v. Loewis
  1999-09-30 18:02           ` Per Bothner
  1 sibling, 2 replies; 24+ messages in thread
From: Martin v. Loewis @ 1999-09-09  1:39 UTC (permalink / raw)
  To: per; +Cc: gcc

> > > Yes, keeping the offsets zero is required by the Java Virtual Machine
> > > specification.
> All over.  For example the checkcast specification.  And it is implicit
> in the description of the verifier, and the fact that there is no
> instruction to "widen" (up-cast?) a pointer.

I see. Looking at

http://java.sun.com/docs/books/vmspec/2nd-edition/html/VMSpecTOC.doc.htm

I find in 3.7 (Representation of Objects)

# The Java virtual machine does not mandate any particular internal
# structure for objects

OTOH, the specification of checkcast indeed says

# If objectref is null or can be cast to the resolved class, array, or
# interface type, the operand stack is unchanged

where the 'no change' part indeed appears to restrict the internal
structure for objects.

> No, I don't think so.  A pointer is a pointer.  If we have an array of
> interface pointers, we can't represent that differently depending on
> whether we pass it to a compiled method or an interpreted method.

That would be a problem, indeed.

> My proposed design is *more* space-efficient than the traditional
> C++ implementation, because it does not use extra per-object vtable
> pointers.  My guess is that it would be a couple of instructions slower,
> but I'm not sure.  It would still be constant-time, and in-line.
> 
> I put my design up on http://www.bothner.com/~per/jv-inherit.txt

Interesting. Is this used in jc1/libgcj? I couldn't find it there.

I think cc1plus could certainly follow this ABI for extern "Java"
classes. It won't work as a general C++ ABI, for a number of reasons:

- It requires (potentially dynamic) allocation of numbers to
  interfaces, and in turn potentially dynamic resizing of arrays.
  C++ users (especially those designing embedded applications) usually
  don't like the idea of the C++ runtime system allocating memory when
  they didn't explicitly request that.
- It requires class information structures with more details than
  mandated by the C++ standard. For the standard C++ ABI, only features
  justified by the C++ standard are acceptable (in the case of RTTI,
  this is dynamic_cast, typeid, and EH)
- It appears to assume a single-rooted class hierarchy. In C++, an
  interface can be implemented by classes that don't share a base class.
  If a pointer-to-interface really means pointer-to-complete-object
  in the ABI, I don't think you can support multi-rooted hierarchies
  at the same time. You wouldn't know where the vtbl is, or the class
  description (i.e. you couldn't implement OBJECT_CLASS)
- It distinguishes between classes and interfaces, when there is no
  such distinction in C++. As a result, it does not support all features
  required in C++. E.g. in Java, you cannot cast from one interface
  to another, unless they are in an inheritance hierarchy (you need
  two casts to implement a cross-cast). In C++, you can dynamic_cast to
  a sibling class. Of course, this could be implemented with an additional
  RTTI mechanism.
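
[Editor's sketch] The C++ side of the last point, a dynamic_cast from one
base to a sibling base, looks like this in standard C++. The class names
(Readable, Writable, File, ReadOnly) and the helper as_writable are
hypothetical.

```cpp
#include <cassert>

struct Readable {
    virtual ~Readable() {}
    virtual int read() = 0;
};

struct Writable {
    virtual ~Writable() {}
    virtual void write(int v) = 0;
};

// File implements both; Readable and Writable are unrelated siblings.
struct File : Readable, Writable {
    int last = 0;
    int read() override { return last; }
    void write(int v) override { last = v; }
};

// ReadOnly implements only Readable.
struct ReadOnly : Readable {
    int read() override { return 1; }
};

// Cross-cast: succeeds only if the complete object also has a Writable base.
Writable* as_writable(Readable* r) { return dynamic_cast<Writable*>(r); }
```

The cast is resolved at run time from the type information of the complete
object: it yields a usable pointer for a File and a null pointer for a
ReadOnly.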

So, in short: I think it's nice for Java, but it doesn't port to C++.

Regards,
Martin

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: New vtable ABI (was Re: Strange behaviour in C++...)
  1999-09-09  1:39           ` Martin v. Loewis
@ 1999-09-09  9:28             ` Per Bothner
  1999-09-30 18:02               ` Per Bothner
  1999-09-30 18:02             ` Martin v. Loewis
  1 sibling, 1 reply; 24+ messages in thread
From: Per Bothner @ 1999-09-09  9:28 UTC (permalink / raw)
  To: gcc

"Martin v. Loewis" <martin@mira.isdn.cs.tu-berlin.de> writes:

> > I put my design up on http://www.bothner.com/~per/jv-inherit.txt
> Interesting. Is this used in jc1/libgcj? I couldn't find it there.

No.  I'm hoping it (or something better) will get implemented.
Currently, Gcj does a linear search on *each call*, which is
why something better has high priority.
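
[Editor's sketch] To make the cost difference concrete, here is a generic
comparison between a per-call linear search over the interfaces a class
implements and a lookup in a table indexed by an interface number. It is
not the scheme from the write-up linked above, whose details are not
reproduced in this thread, and it is not Gcj's code; ClassInfo,
InterfaceEntry, find_linear, find_indexed and make_class are all
hypothetical, and the interface numbers are simply assumed to have been
assigned beforehand.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Method = int (*)(void* self);

struct InterfaceEntry {
    int interface_id;        // hypothetical number assigned to the interface
    const Method* methods;   // this class's method table for that interface
};

struct ClassInfo {
    std::vector<InterfaceEntry> implemented;  // searched entry by entry
    std::vector<const Method*> by_id;         // indexed by interface number
};

// Linear search: cost grows with the number of implemented interfaces.
const Method* find_linear(const ClassInfo& cls, int id) {
    for (const InterfaceEntry& e : cls.implemented)
        if (e.interface_id == id) return e.methods;
    return nullptr;
}

// Indexed lookup: one bounds check and one load, independent of that number.
const Method* find_indexed(const ClassInfo& cls, int id) {
    if (id < 0 || static_cast<std::size_t>(id) >= cls.by_id.size()) return nullptr;
    return cls.by_id[static_cast<std::size_t>(id)];
}

// Build both representations from the same list of entries.
ClassInfo make_class(const std::vector<InterfaceEntry>& entries, int interface_count) {
    ClassInfo cls;
    cls.implemented = entries;
    cls.by_id.assign(static_cast<std::size_t>(interface_count), nullptr);
    for (const InterfaceEntry& e : entries) {
        assert(e.interface_id >= 0 && e.interface_id < interface_count);
        cls.by_id[static_cast<std::size_t>(e.interface_id)] = e.methods;
    }
    return cls;
}

// A one-method interface implementation used for demonstration.
int answer(void*) { return 42; }
const Method runnable_methods[] = { answer };
```

The indexed table trades space (one entry per interface number, used or
not) for a lookup whose cost does not depend on how many interfaces the
class implements.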

> I think cc1plus could certainly follow this ABI for extern "Java"
> classes. It won't work as a general C++ ABI, for a number of reasons:
> 
> - It requires (potentially dynamic) allocation of numbers to
>   interfaces, and in turn potentially dynamic resizing of arrays.
>   C++ users (especially those designing embedded applications) usually
>   don't like the idea of the C++ runtime system allocating memory when
>   they didn't explicitly request that.

No, the design requires *link-time* allocation of numbers.  Thus your
objection about embedded applications is not relevant, since such
applications use static linking.

> - It requires class information structures with more details than
>   mandated by the C++ standard. For the standard C++ ABI, only features
>   justified by the C++ standard are acceptable (in the case of RTTI,
>   this is dynamic_cast, typeid, and EH)

I don't follow this.  Yes, Java does require more class
information data structures than C++, but that is all orthogonal to
my design.  My design proposes alternative data structures for
virtual function calls and exactly the things you mention.

> - It appears to assume a single-rooted class hierarchy. In C++, an
>   interface can be implemented by classes that don't share a base class.
>   If a pointer-to-interface really means pointer-to-complete-object
>   in the ABI, I don't think you can support multi-rooted hierarchies
>   at the same time. You wouldn't know where the vtbl is, or the class
>   description (i.e. you couldn't implement OBJECT_CLASS)

Well, the write-up is written in the context of Java.  But is there
any fundamental reason the basic idea can't be generalized to more of
C++? Note the idea is to use special "extra-indirect vtables", but *only*
for abstract classes with no fields.  Specifically, I don't think
pointer-to-interface needs to mean pointer-to-complete-object.  Instead,
how about pointer-to-interface meaning pointer-to-*concrete*-object,
where a "concrete" object corresponds to a class that has fields or
non-abstract virtual methods.

> - It distinguishes between classes and interfaces, when there is no
>   such distinction in C++.

But we can make a distinction between "classes that have no instance fields
or non-abstract virtual methods" and "other classes" in the implementation.

>   As a result, it does not support all features
>   required in C++. E.g. in Java, you cannot cast from one interface
>   to another, unless they are in an inheritance hierarchy (you need
>   two casts to implement a cross-cast). In C++, you can dynamic_cast to
>   a sibling class.

I think that is an implementation detail.

> So, in short: I think it's nice for Java, but it doesn't port to C++.

I am not convinced yet.  Of course I'm not the one that needs to be
convinced either way, as I don't see myself being able to do the
implementation.
-- 
	--Per Bothner
bothner@pacbell.net  per@bothner.com   http://www.bothner.com/~per/

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: New vtable ABI (was Re: Strange behaviour in C++...)
  1999-09-09  9:28             ` Per Bothner
@ 1999-09-30 18:02               ` Per Bothner
  0 siblings, 0 replies; 24+ messages in thread
From: Per Bothner @ 1999-09-30 18:02 UTC (permalink / raw)
  To: gcc

"Martin v. Loewis" <martin@mira.isdn.cs.tu-berlin.de> writes:

> > I put my design up on http://www.bothner.com/~per/jv-inherit.txt
> Interesting. Is this used in jc1/linbgcj? I couldn't find it there.

No.  I'm hoping it (or something better) will get implemented.
Currently, Gcj does a linear search on *each call*, which is
why something better has high priority.

> I think cc1plus could certainly follow this ABI for extern "Java"
> classes. It won't work as a general C++ ABI, for a number of reasons:
> 
> - It requires (potentially dynamic) allocation of numbers to
>   interfaces, and in turn potentially dynamic resizing of arrays.
>   C++ users (especially those designing embedded applications) usually
>   don't like the idea of the C++ runtime system allocating memory when
>   they didn't explicitly require that.

No, the design requires *link-time* allocation of numbers.  Thus your
objection about embedded applications is not relevant, since such
applications use static linking.

> - It requires class information structures with more details than
>   mandated by the C++ standard. For the standard C++ ABI, only features
>   justified by the C++ standard are acceptable (in the case of RTTI,
>   this is dynamic_cast, typeid, and EH)

I don't follow this.  Yes, Java does require more class
information data structures than C++, but that is all orthogonal to
my design.  My design proposes alternative data structures for
virtual function calls and exactly the things you mention.

> - It appears to assume a single-rooted class hierarchy. In C++, an
>   interface can be implemented by classes that don't share a base class.
>   If a pointer-to-interface really means pointer-to-complete-object
>   in the ABI, I don't think you can support multi-rooted hierarchies
>   at the same time. You wouldn't know where the vtbl is, or the class
>   description (i.e. you couldn't implement OBJECT_CLASS)

Well, the write-up is written in the context of Java.  But is there
any fundamental reason the basic idea can't be generalized to more of
C++? Note the idea is to use special "extra-indirect vtables", but *only*
for abstract classes with no fields.  Specifically, I don't think
pointer-to-interface needs to mean pointer-to-complete-object.  Instead,
how about pointer-to-interface meaning pointer-to-*concrete*-object,
where a "concrete" object corresponds to a class that has fields or
non-abstract virtual methods.

> - It distinguishes between classes and interfaces, when there is no
>   such distinction in C++.

But we can make a distinction between "classes that have no instance fields
or non-abstract virtual methods" and "other classes" in the implementation.

>   As a result, it does not support all features
>   required in C++. E.g. in Java, you cannot cast from one interface
>   to another, unless they are in an inheritance hierarchy (you need
>   two casts to implement a cross-cast). In C++, you can dynamic_cast to
>   a sibling class.

I think that is an implementation detail.

> So, in short: I think it's nice for Java, but it doesn't port to C++.

I am not convinced yet.  Of course I'm not the one that needs to be
convinced either way, as I don't see myself being able to do the
implementation.
-- 
	--Per Bothner
bothner@pacbell.net  per@bothner.com   http://www.bothner.com/~per/

* Re: New vtable ABI (was Re: Strange behaviour in C++...)
  1999-09-07 18:42 ` Per Bothner
  1999-09-07 23:14   ` Martin v. Loewis
@ 1999-09-30 18:02   ` Per Bothner
  1 sibling, 0 replies; 24+ messages in thread
From: Per Bothner @ 1999-09-30 18:02 UTC (permalink / raw)
  To: gcc

Jason Merrill <jason@cygnus.com> writes:

> Rather than per-function offsets, we have per-target type offsets.
> [very terse description elided]

I confess to not being able to follow more than the very basic idea;
I guess I'm rather rusty on g++ vtable management.  However, it does
not seem like this will do much for Java, since the big deal is
how to handle the offsets, while for Java objects the offsets are
always zero.  Still, it is essential for Gcj that C++ and Java
have compatible ABIs, and it would be nice that CNI (access of
Java objects from C++) work for interface inheritance as well.

So my plea when designing a new ABI:  Keep in mind the needs for
Java.  Specifically, we need an ABI that handles fast "interface"
calls.  I.e. we need constant-time handling of virtual inheritance
of pure abstract base classes with no instance fields, at least
in the case that the base classes have the "Java property" specified.
Earlier, I posted my idea for how to do this. Something like that
(or better) needs to be able to co-exist with the new C++ ABI.
(It does not follow that Java interface support should be *part*
of the C++ ABI spec;  however, it should be part of the Gcc ABI.)
-- 
	--Per Bothner
bothner@pacbell.net  per@bothner.com   http://www.bothner.com/~per/

* Re: New vtable ABI (was Re: Strange behaviour in C++...)
  1999-09-08  6:33         ` Per Bothner
  1999-09-09  1:39           ` Martin v. Loewis
@ 1999-09-30 18:02           ` Per Bothner
  1 sibling, 0 replies; 24+ messages in thread
From: Per Bothner @ 1999-09-30 18:02 UTC (permalink / raw)
  To: gcc

"Martin v. Loewis" <martin@mira.isdn.cs.tu-berlin.de> writes:

> > Yes, keeping the offsets zero is required by the Java Virtual Machine
> > specification.
> 
> Hmm. Could you point me to the exact quote in the JVM spec?

All over.  For example the checkcast specification.  And it is implicit
in the description of the verifier, and the fact that there is no
instruction to "widen" (up-cast?) a pointer.

> > But certainly there is a wide-spread *presumption* in Java that
> > neither up-casts nor down-casts change a pointer.
> 
> Well, for Objects, that property is preserved. I'm talking about casts
> to and from interfaces, here.

What do you think I'm talking about?

> > Furthermore, it is not possible to implement a "simple" Java
> > interpreter if this assumption is violated.  That is because there is
> > no instruction in the JVM to convert a pointer whose type is a
> > derived class to a pointer to one of its base classes or interfaces;
> > this operation is assumed to be a no-op.  The only option would be
> > a "complex" interpreter, for example a Just-In-Time compiler (JIT),
> > or something else that analyzes the whole method.
> 
> I assume the 'simple' Java interpreter would iterate over a list of
> implemented interfaces when making a call-to-interface.

In this context, all I mean by "simple" interpreter is one that can
execute a bytecode operation without first having analyzed the entire
method.  Fast method calls can still be compatible with that.  In Gcj
we use a slow search, but the traditional implementation uses a cache.

> When passing an object to CNI code expecting an interface, you'd have
> to cast. However, I think you could preserve the desired
> single-pointer approach for the byte code interpreter, and still use
> the virtual-base approach in native code.

No, I don't think so.  A pointer is a pointer.  If we have an array of
interface pointers, we can't represent that differently depending on
whether we pass it to a compiled method or an interpreted method.

> For C++, the primary goals would be speed-efficiency and compact
> representation. I don't think any of these alternatives would be more
> efficient than vtable calls.

My proposed design is *more* space-efficient than the traditional
C++ implementation, because it does not use extra per-object vtable
pointers.  My guess is it would be a couple of instructions slower,
but I'm not sure.  It would still be constant-time, and in-line.

I put my design up on http://www.bothner.com/~per/jv-inherit.txt
-- 
	--Per Bothner
bothner@pacbell.net  per@bothner.com   http://www.bothner.com/~per/

* Re: New vtable ABI (was Re: Strange behaviour in C++...)
  1999-09-08  0:49       ` Martin v. Loewis
  1999-09-08  6:33         ` Per Bothner
@ 1999-09-30 18:02         ` Martin v. Loewis
  1 sibling, 0 replies; 24+ messages in thread
From: Martin v. Loewis @ 1999-09-30 18:02 UTC (permalink / raw)
  To: per; +Cc: gcc

> Yes, keeping the offsets zero is required by the Java Virtual Machine
> specification.  (And the Java Virtual machine specification is part
> of the Java Language Specification, because a Java program can
> dynamically create new classes.)

Hmm. Could you point me to the exact quote in the JVM spec?

> But certainly there is a wide-spread *presumption* in Java that
> neither up-casts nor down-casts change a pointer.

Well, for Objects, that property is preserved. I'm talking about casts
to and from interfaces, here.

> Furthermore, it is not possible to implement a "simple" Java
> interpreter if this assumption is violated.  That is because there is
> no instruction in the JVM to convert a pointer whose type is a
> derived class to a pointer to one of its base classes or interfaces;
> this operation is assumed to be a no-op.  The only option would be
> a "complex" interpreter, for example a Just-In-Time compiler (JIT),
> or something else that analyzes the whole method.

I assume the 'simple' Java interpreter would iterate over a list of
implemented interfaces when making a call-to-interface. Well, there is
nothing wrong with keeping such a list as part of the embedded class
description. This is just not as efficient as it could be.

When passing an object to CNI code expecting an interface, you'd have
to cast. However, I think you could preserve the desired
single-pointer approach for the byte code interpreter, and still use
the virtual-base approach in native code.

> Yes, that's what I have been saying.  My point is that the traditional
> way we implement "virtual classes with pure virtual methods and no
> fields" in C++ is not the only way to do it.  For Java, it is almost
> certainly the *wrong* way to do it.  For C++ too it might be
> worth considering alternatives for this quite important special case.

For C++, the primary goals would be speed-efficiency and compact
representation. I don't think any of these alternatives would be more
efficient than vtable calls.

Regards,
Martin

* Re: New vtable ABI (was Re: Strange behaviour in C++...)
  1999-09-07 23:53     ` Per Bothner
  1999-09-08  0:49       ` Martin v. Loewis
@ 1999-09-30 18:02       ` Per Bothner
  1 sibling, 0 replies; 24+ messages in thread
From: Per Bothner @ 1999-09-30 18:02 UTC (permalink / raw)
  To: gcc

"Martin v. Loewis" <martin@mira.isdn.cs.tu-berlin.de> writes:

>> while for Java objects the offsets are always zero.
> 
> Is that mandatory?

Yes, keeping the offsets zero is required by the Java Virtual Machine
specification.  (And the Java Virtual machine specification is part
of the Java Language Specification, because a Java program can
dynamically create new classes.)

Of course we could always invoke an "as if rule" - if conforming
programs can't tell the difference, then we are free to implement
things differently.

But certainly there is a wide-spread *presumption* in Java that
neither up-casts nor down-casts change a pointer.

Furthermore, it is not possible to implement a "simple" Java
interpreter if this assumption is violated.  That is because there is
no instruction in the JVM to convert a pointer whose type is a
derived class to a pointer to one of its base classes or interfaces;
this operation is assumed to be a no-op.  The only option would be
a "complex" interpreter, for example a Just-In-Time compiler (JIT),
or something else that analyzes the whole method.  (Such analysis
is done anyway by a bytecode verifier, for security purposes,
but traditionally only for "untrusted" code.)

> The naïve way of implementing interfaces is to take
> the usual C++ approach: interfaces are virtual classes with pure
> virtual methods.

Yes, that's what I have been saying.  My point is that the traditional
way we implement "virtual classes with pure virtual methods and no
fields" in C++ is not the only way to do it.  For Java, it is almost
certainly the *wrong* way to do it.  For C++ too it might be
worth considering alternatives for this quite important special case.
-- 
	--Per Bothner
bothner@pacbell.net  per@bothner.com   http://www.bothner.com/~per/

* Re: New vtable ABI (was Re: Strange behaviour in C++...)
  1999-09-07 23:14   ` Martin v. Loewis
  1999-09-07 23:53     ` Per Bothner
@ 1999-09-30 18:02     ` Martin v. Loewis
  1 sibling, 0 replies; 24+ messages in thread
From: Martin v. Loewis @ 1999-09-30 18:02 UTC (permalink / raw)
  To: per; +Cc: gcc

> I confess to not being able to follow more than the very basic idea;
> I guess I'm rather rusty on g++ vtable management.  However, it does
> not seem like this will do much for Java, since the big deal is
> how to handle the offsets, while for Java objects the offsets are
> always zero.

Is that mandatory? The naïve way of implementing interfaces is to take
the usual C++ approach: interfaces are virtual classes with pure
virtual methods. So

interface FooBar{
  public void x();
}
class Foo extends Bar implements FooBar{public void x(){...}}

becomes

class FooBar{
public:
  virtual void x()=0;
};

class Foo: public Bar, public virtual FooBar{
public:
  void x(){...}
};

Does that meet all requirements? If so, you will have adjustments when
you have a FooBar pointer and invoke x. The reason is that Foo has two
embedded vtbl pointers: the one of Foo, and the one of FooBar. A
FooBar pointer is represented as a pointer to the location of the
FooBar vtbl pointer inside the Foo object.

In the current ABI proposal (by Jason), the FooBar vtbl is represented
as

-1: RTTI (at negative offset for COM compatibility)
0:  x

Inside Foo, Foo::x has two entry points:

x_for_FooBar: this = this + this->_vtbl[1] ;fallthrough
x__3Foo:      ...

So, the FooBar-in-Foo vtbl has a different layout:

-1: RTTI
0:  x_for_FooBar
1:  Adjust FooBar to Foo (say, -16)

An interface call would then be a normal virtual call: Using the
pointer you have, retrieve the vtbl (at offset 0), retrieve the method
pointer, and call it. The adjustment to the full object is made at the
target.

Would that work for Java?

Martin

* Re: New vtable ABI (was Re: Strange behaviour in C++...)
  1999-09-09  1:39           ` Martin v. Loewis
  1999-09-09  9:28             ` Per Bothner
@ 1999-09-30 18:02             ` Martin v. Loewis
  1 sibling, 0 replies; 24+ messages in thread
From: Martin v. Loewis @ 1999-09-30 18:02 UTC (permalink / raw)
  To: per; +Cc: gcc

> > > Yes, keeping the offsets zero is required by the Java Virtual Machine
> > > specification.
> All over.  For example the checkcast specification.  And it is implicit
> in the description of the verifier, and the fact that there is no
> instruction to "widen" (up-cast?) a pointer.

I see. Looking at

http://java.sun.com/docs/books/vmspec/2nd-edition/html/VMSpecTOC.doc.htm

I find in 3.7 (Representation of Objects)

# The Java virtual machine does not mandate any particular internal
# structure for objects

OTOH, the specification of checkcast indeed says

# If objectref is null or can be cast to the resolved class, array, or
# interface type, the operand stack is unchanged

where the 'no change' part indeed appears to restrict the internal
structure for objects.

> No, I don't think so.  A pointer is a pointer.  If we have an array of
> interface pointers, we can't represent that differently depending on
> whether we pass it to a compiled method or an interpreted method.

That would be a problem, indeed.

> My proposed design is *more* space-efficient than the traditional
> C++ implementation, because it does not use extra per-object vtable
> pointers.  My guess is it would be a couple of instructions slower,
> but I'm not sure.  It would still be constant-time, and in-line.
> 
> I put my design up on http://www.bothner.com/~per/jv-inherit.txt

Interesting. Is this used in jc1/libgcj? I couldn't find it there.

I think cc1plus could certainly follow this ABI for extern "Java"
classes. It won't work as a general C++ ABI, for a number of reasons:

- It requires (potentially dynamic) allocation of numbers to
  interfaces, and in turn potentially dynamic resizing of arrays.
  C++ users (especially those designing embedded applications) usually
  don't like the idea of the C++ runtime system allocating memory when
  they didn't explicitly require that.
- It requires class information structures with more details than
  mandated by the C++ standard. For the standard C++ ABI, only features
  justified by the C++ standard are acceptable (in the case of RTTI,
  this is dynamic_cast, typeid, and EH)
- It appears to assume a single-rooted class hierarchy. In C++, an
  interface can be implemented by classes that don't share a base class.
  If a pointer-to-interface really means pointer-to-complete-object
  in the ABI, I don't think you can support multi-rooted hierarchies
  at the same time. You wouldn't know where the vtbl is, or the class
  description (i.e. you couldn't implement OBJECT_CLASS)
- It distinguishes between classes and interfaces, when there is no
  such distinction in C++. As a result, it does not support all features
  required in C++. E.g. in Java, you cannot cast from one interface
  to another, unless they are in an inheritance hierarchy (you need
  two casts to implement a cross-cast). In C++, you can dynamic_cast to
  a sibling class. Of course, this could be implemented with an additional
  RTTI mechanism.

So, in short: I think it's nice for Java, but it doesn't port to C++.

Regards,
Martin

end of thread, other threads:[~1999-09-30 18:02 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
1999-08-26 15:59 New vtable ABI (was Re: Strange behaviour in C++...) Jason Merrill
1999-08-26 16:38 ` Jason Merrill
1999-08-31 23:20   ` Jason Merrill
1999-08-26 17:05 ` Joe Buck
1999-08-26 17:13   ` Jason Merrill
1999-08-26 17:26     ` Mumit Khan
1999-08-31 23:20       ` Mumit Khan
1999-08-31 23:20     ` Jason Merrill
1999-08-31 23:20   ` Joe Buck
1999-08-31 23:20 ` Jason Merrill
1999-09-07 18:42 ` Per Bothner
1999-09-07 23:14   ` Martin v. Loewis
1999-09-07 23:53     ` Per Bothner
1999-09-08  0:49       ` Martin v. Loewis
1999-09-08  6:33         ` Per Bothner
1999-09-09  1:39           ` Martin v. Loewis
1999-09-09  9:28             ` Per Bothner
1999-09-30 18:02               ` Per Bothner
1999-09-30 18:02             ` Martin v. Loewis
1999-09-30 18:02           ` Per Bothner
1999-09-30 18:02         ` Martin v. Loewis
1999-09-30 18:02       ` Per Bothner
1999-09-30 18:02     ` Martin v. Loewis
1999-09-30 18:02   ` Per Bothner
