public inbox for gcc@gcc.gnu.org
* More space efficient virtual function calls
@ 2002-07-07  8:45 Wink Saville
  2002-07-07 15:45 ` mike stump
From: Wink Saville @ 2002-07-07  8:45 UTC
  To: gcc

Hello,

I was wondering whether there is a switch in gcc that would cause it to
generate smaller code when invoking virtual functions. I'm using a GCC 3.1 ARM
cross compiler with -O3 and -fvtable-thunks=3, and it does the following when
calling a virtual function with no parameters:

    LDR    R1,[R4]        /* Fetch the vtable pointer */
    MOV    R0,R4          /* R0 = this */
    MOV    LR,PC          /* Set up the link register */
    LDR    PC,[R1,#4]     /* Call virtual function 1, the second entry */

This is good code, but it takes 20 bytes. What I was hoping for when I
enabled -fvtable-thunks=3 was that the compiler would use intermediate
"thunks" to invoke functions, such as:

    LDR    R0,R4    /* R0 = this */
    BL    __VTFunc1     /* Branch and Link to virtual thunk function 1  */

...

__VTFunc0:
    MOV    R5,[R0]        /* Get vtable pointer */
    LDR    PC,[R5,#0]    /* Jump to the function */
__VTFunc1:
    MOV    R5,[R0]        /* Get vtable pointer */
    LDR    PC,[R5,#4]    /* Jump to the function */
__VTFunc2:
    MOV    R5,[R0]        /* Get vtable pointer */
    LDR    PC,[R5,#8]    /* Jump to the function */


This only costs 8 bytes per call instead of 20 (plus the size of the
__VTFuncX code). Of course, this has at least two consequences: 1) a register,
in this example R5, would have to be reserved for use by the thunking code;
2) there is a performance hit because of the two jumps. For space-constrained
applications the trade-off would be well worth it, and the programmer should
be able to use a pragma or compiler switch to select which type of code to
emit.

Is this already possible with an existing switch/pragma?
Is this practical?
Would most of the modifications need to be made to the front end, the back
end, or both?
Could someone point me to the places I would need to change to try this out?


Thanks,

Wink Saville




* Re: More space efficient virtual function calls
  2002-07-07  8:45 More space efficient virtual function calls Wink Saville
@ 2002-07-07 15:45 ` mike stump
  2002-07-07 21:35   ` Wink Saville
From: mike stump @ 2002-07-07 15:45 UTC
  To: gcc, wink

> From: "Wink Saville" <wink@saville.com>
> To: <gcc@gcc.gnu.org>
> Date: Sat, 6 Jul 2002 17:29:11 -0700

> I was wondering whether there is a switch in gcc that would cause it to
> generate smaller code when invoking virtual functions.

Most likely not, other than -Os.

> Is this already possible with an existing switch/pragma?

No.

> Is this practical?

Yes.

> Would most of the modifications need to be made to the front end, the back
> end, or both?

It could be done in the port file.

> Could someone point me to the places I would need to change to try this out?

Sure.

Just create a peephole to find the code you want to replace:

>     LDR    R1,[R4]        /* Fetch the vtable pointer */
>     MOV    R0,R4        /* R0 = this */
>     MOV LR, PC          /* Set of the link register */
>     LDR PC,[R1, #4]  /* Call virtual function 1, the second entry */

and replace it with what you want:

>     LDR    R0,R4    /* R0 = this */
>     BL    __VTFunc1     /* Branch and Link to virtual thunk function 1  */

[ or whatever the right version is ], and do this when -Os is given.
You should then be able to `see' the compiler generate this code.
Then you will discover there is no definition for __VTFunc1, and you
will discover you want to add it to the .asm or .s file for the port
so it will wind up in libgcc.a.  Or, you can put it in a linkonce
section if the port supports it, and always emit them in each unit,
or, after you discover what longcall is, you might want to do it as
private in each unit.  I might recommend the last alternative.


* Re: More space efficient virtual function calls
  2002-07-07 15:45 ` mike stump
@ 2002-07-07 21:35   ` Wink Saville
From: Wink Saville @ 2002-07-07 21:35 UTC
  To: mike stump, gcc

Mike,

Thanks, I'll try it.

Wink


* Re: More space efficient virtual function calls
  2002-07-07 10:03 Matthew Wilcox
@ 2002-07-07 10:11 ` Wink Saville
From: Wink Saville @ 2002-07-07 10:11 UTC
  To: Matthew Wilcox; +Cc: gcc

Whoops, Matthew is correct on both of his points.

I also looked at the ATPCS document, and it looks like register 12 can be
used; its other name is IP (Intra-Procedure-call scratch register). And, as
Matthew pointed out, the savings are 8 bytes per call (16 bytes down to 8).

So the new suggestion is:

    MOV    R0,R4    /* R0 = this */
    BL    __VTFunc1     /* Branch and Link to virtual thunk function 1  */

...

__VTFunc0:
    LDR    R12,[R0]       /* Get vtable pointer */
    LDR    PC,[R12,#0]    /* Jump to the function 0 */
__VTFunc1:
    LDR    R12,[R0]       /* Get vtable pointer */
    LDR    PC,[R12,#4]    /* Jump to the function 1 */
__VTFunc2:
    LDR    R12,[R0]       /* Get vtable pointer */
    LDR    PC,[R12,#8]    /* Jump to the function 2 */
...

As I said before, there is a performance hit, so this would have to be
optional. I looked at the compiler switches, and a first suggestion would be
to enable this feature when the -Os (save space) switch is used.

So:

Is this already possible with an existing switch/pragma?
Are there comparable savings for other architectures?
Is this practical?
Should the __VTFuncX functions be emitted by the compiler or required to be
in the runtime?
If required to be in the runtime, what would be the best approach for
handling large vtables?

I'm sure there are other questions/issues and probably bugs:) and any
suggestions are welcome.

Thanks,

Wink



* Re: More space efficient virtual function calls
@ 2002-07-07 10:03 Matthew Wilcox
  2002-07-07 10:11 ` Wink Saville
From: Matthew Wilcox @ 2002-07-07 10:03 UTC
  To: Wink Saville; +Cc: gcc

On Sat, 6 Jul 2002 17:29:11 -0700, Wink Saville wrote:
> I was wondering whether there is a switch in gcc that would cause it to
> generate smaller code when invoking virtual functions. I'm using a GCC 3.1 ARM
> cross compiler with -O3 and -fvtable-thunks=3, and it does the following when
> calling a virtual function with no parameters:
> 
>     LDR    R1,[R4]        /* Fetch the vtable pointer */
>     MOV    R0,R4        /* R0 = this */
>     MOV LR, PC          /* Set of the link register */
>     LDR PC,[R1, #4]  /* Call virtual function 1, the second entry */
> 
> This is good code, but it takes 20 bytes. What I was hoping for when I
> enabled -fvtable-thunks=3 was that the compiler would use intermediate
> "thunks" to invoke functions, such as:
> 
>     LDR    R0,R4    /* R0 = this */
>     BL    __VTFunc1     /* Branch and Link to virtual thunk function 1  */
> 
> ...
> 
> __VTFunc0:
>     MOV    R5,[R0]        /* Get vtable pointer */
>     LDR    PC,[R5,#0]    /* Jump to the function */
> __VTFunc1:
>     MOV    R5,[R0]        /* Get vtable pointer */
>     LDR    PC,[R5,#4]    /* Jump to the function */
> __VTFunc2:
>     MOV    R5,[R0]        /* Get vtable pointer */
>     LDR    PC,[R5,#8]    /* Jump to the function */

The current code takes 16 bytes, not 20.  The proposed code above is buggy;
I think what you meant was:

	MOV	r0, r4			/* r0 = this */
	BL	__VTFunc1

...

__VTFunc0:
	LDR	r5, [r0]		/* fetch vtable pointer */
	LDR	pc, [r5, #0]		/* jump to the function */
__VTFunc1:
	LDR	r5, [r0]
	LDR	pc, [r5, #4]
__VTFunc2:
	LDR	r5, [r0]
	LDR	pc, [r5, #8]

if you use a call-clobbered register instead of r5 (i don't remember
the apcs right now), you don't even need to reserve r5.
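
As a rough C++ model of what each __VTFuncN thunk does (a sketch only, not
GCC output): the Object layout, the Method type, and the names get_value,
get_double, kVtable and vt_func are all made up for illustration. It
assumes the vtable pointer is the first word of the object, as in the
assembly above.

```cpp
#include <cassert>
#include <cstddef>

// Hand-rolled object layout: the first word of every object points to a
// table of function pointers, mirroring the [r0] load in the thunks.
struct Object;
using Method = int (*)(Object* self);

struct Object {
    const Method* vtable;  // with 4-byte pointers, slot N sits at offset 4*N
    int value;
};

// Two illustrative "virtual functions" (hypothetical).
int get_value(Object* self) { return self->value; }
int get_double(Object* self) { return 2 * self->value; }

// The table the object's first word points at: slot 0, then slot 1.
const Method kVtable[] = {get_value, get_double};

// Model of __VTFuncN: fetch the vtable pointer from *self, then jump to
// slot N, passing `self` through unchanged.
template <std::size_t Slot>
int vt_func(Object* self) {
    const Method* vt = self->vtable;  // LDR r5, [r0]
    return vt[Slot](self);            // LDR pc, [r5, #4*Slot]
}
```

A call site then only has to pass `this` and name the thunk, for example
vt_func<1>(&obj), which corresponds to MOV r0, r4 followed by BL __VTFunc1.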

i have no idea how to change gcc to do what you want though ;-)

-- 
Revolutions do not require corporate support.

