public inbox for gcc-help@gcc.gnu.org
* Inlining functions vs. inlining member functions as templates
@ 2005-05-13 15:10 Peter Doerfler
  2005-05-13 19:41 ` Jeffrey Holle
  0 siblings, 1 reply; 4+ messages in thread
From: Peter Doerfler @ 2005-05-13 15:10 UTC (permalink / raw)
  To: gcc-help

[-- Attachment #1: Type: text/plain, Size: 1562 bytes --]

Hi.

I'm working on a lib that uses policy templates quite extensively, in the vein 
of the STL's template <class Compare> void std::list<T>::sort(Compare comp).

Compare can be a function or a struct with operator(). While the functionality 
is equivalent, the compiler inlines the struct's operator() but not the 
function. Changing to the struct implementation decreased runtime by more than 
40% in an image filtering function.

Below you can find a tiny example. The commented-out code gives the alternate 
implementation, which is inlined completely. See attachments for assembler 
code.

Can somebody explain why lessFunc is not inlined?

/usr/i686-pc-linux-gnu/gcc-bin/4.0.1-beta20050507/gcc --version
gcc (GCC) 4.0.1-beta20050507 (Gentoo 4.0.1_beta20050507)
flags: -mtune=pentium4 -O3 -fomit-frame-pointer -S

gcc3.3.3 (SuSE9.1), gcc3.3.5 (Gentoo), and gcc3.4.3 (Gentoo 3.4.3.20050110-r2) 
perform much worse in both cases. Good point for gcc4.
icc8.0 produces roughly the same result as gcc4.0.1. Also very good.
I think the gcc4.x of Gentoo is very close to or the same as the official 
snapshot.

Thanks for hints,
Peter


--------------------------------------------------------------------------------------------

// struct less {
//   inline bool operator()(const int a, const int b) {
//     return a<b;
//   }
// };

inline bool lessFunc(const int a, const int b) {
  return a<b;
}

template <class Comp>
bool foo(const int a, const int b, Comp comp) {
  return comp(a,b);
}

int main(int argc, char** argv) {
  return foo(argc,1,lessFunc);
//   return foo(argc,3,less());
}



[-- Attachment #3: testTemplateInlineFunc.s --]
[-- Type: text/plain, Size: 697 bytes --]

	.file	"testTemplateInline.cpp"
	.section	.gnu.linkonce.t._Z8lessFuncii,"ax",@progbits
	.align 2
	.weak	_Z8lessFuncii
	.type	_Z8lessFuncii, @function
_Z8lessFuncii:
.LFB2:
	movl	8(%esp), %eax
	cmpl	%eax, 4(%esp)
	setl	%al
	movzbl	%al, %eax
	ret
.LFE2:
	.size	_Z8lessFuncii, .-_Z8lessFuncii
	.text
	.align 2
.globl main
	.type	main, @function
main:
.LFB4:
	pushl	%ebp
.LCFI0:
	movl	%esp, %ebp
.LCFI1:
	subl	$8, %esp
.LCFI2:
	andl	$-16, %esp
	subl	$16, %esp
	movl	$1, 4(%esp)
	movl	8(%ebp), %eax
	movl	%eax, (%esp)
	call	_Z8lessFuncii
	movzbl	%al, %eax
	leave
	ret
.LFE4:
	.size	main, .-main
	.ident	"GCC: (GNU) 4.0.1-beta20050507 (Gentoo 4.0.1_beta20050507)"
	.section	.note.GNU-stack,"",@progbits

[-- Attachment #4: testTemplateInlineStruct.s --]
[-- Type: text/plain, Size: 388 bytes --]

	.file	"testTemplateInline.cpp"
	.text
	.align 2
.globl main
	.type	main, @function
main:
.LFB4:
	pushl	%ebp
.LCFI0:
	movl	%esp, %ebp
.LCFI1:
	subl	$8, %esp
.LCFI2:
	andl	$-16, %esp
	subl	$16, %esp
	cmpl	$2, 8(%ebp)
	setle	%al
	andl	$1, %eax
	leave
	ret
.LFE4:
	.size	main, .-main
	.ident	"GCC: (GNU) 4.0.1-beta20050507 (Gentoo 4.0.1_beta20050507)"
	.section	.note.GNU-stack,"",@progbits


* Re: Inlining functions vs. inlining member functions as templates
  2005-05-13 15:10 Inlining functions vs. inlining member functions as templates Peter Doerfler
@ 2005-05-13 19:41 ` Jeffrey Holle
       [not found]   ` <200505171331.48271.doerfler@techinfo.rwth-aachen.de>
  0 siblings, 1 reply; 4+ messages in thread
From: Jeffrey Holle @ 2005-05-13 19:41 UTC (permalink / raw)
  To: gcc-help

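That's one of several advantages that functors have over function pointers.
Basically, because the function is passed as a pointer, the call goes through
that pointer and the compiler generally doesn't inline it.
This is avoided by employing a functor, whose operator() is known from its type.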
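
To make the difference concrete, here is a minimal sketch. The names Less, deducedAsFunctionPointer and checkDeduction are made up for illustration and are not from the original test file, and the sketch uses <type_traits>, so it assumes a C++11 or later compiler rather than the gcc versions discussed in this thread. It shows what Comp is deduced as in each case: a plain function pointer type for lessFunc (shared by every function with that signature), and a distinct class type for the struct, whose operator() is then the only possible call target inside foo.

```cpp
#include <cassert>
#include <type_traits>

// Functor version: the call target is fixed by the type Less itself.
struct Less {
  bool operator()(int a, int b) const { return a < b; }
};

// Free-function version: passing it to a template decays it to a pointer.
inline bool lessFunc(int a, int b) { return a < b; }

template <class Comp>
bool foo(int a, int b, Comp comp) {
  return comp(a, b);
}

// Hypothetical helper: true when Comp was deduced as bool (*)(int, int).
template <class Comp>
bool deducedAsFunctionPointer(Comp) {
  return std::is_same<Comp, bool (*)(int, int)>::value;
}

// Small usage check; both versions compute the same result.
inline void checkDeduction() {
  assert(deducedAsFunctionPointer(lessFunc));        // Comp = bool (*)(int, int)
  assert(!deducedAsFunctionPointer(Less()));         // Comp = Less
  assert(foo(1, 2, lessFunc) == foo(1, 2, Less()));  // same answer either way
}
```

With the pointer type, foo is instantiated once for every function of that signature, so the callee is only known from the run-time argument. With Less, the instantiation itself pins down the callee.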



* Re: Inlining functions vs. inlining member functions as templates
       [not found]   ` <200505171331.48271.doerfler@techinfo.rwth-aachen.de>
@ 2005-05-17 11:53     ` Eljay Love-Jensen
  2005-05-17 13:02       ` Peter Doerfler
  0 siblings, 1 reply; 4+ messages in thread
From: Eljay Love-Jensen @ 2005-05-17 11:53 UTC (permalink / raw)
  To: Peter Doerfler, gcc-help, jeffholle

Hi Peter,

>Why doesn't the compiler manage to do the same thing for the function version, that it apparently does between -O1 and -O3 for the struct version?

It's a pretty tall order.  You are saying "Ignore what I explicitly programmed the function to do, and inline what I specified as a function pointer parameter because it will be more efficient".

Note:  passing a function through a function pointer parameter is *NOT* the same as calling an inline-tagged function directly, where the compiler can simply inline it.
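
One way to see the distinction: if the function is passed as a non-type template argument instead of as a run-time pointer parameter, the callee becomes part of the instantiation itself. A sketch of that variant follows. fooStatic and checkStatic are hypothetical names, not something from Peter's code, and whether a given compiler then actually inlines the call is still up to its optimizer.

```cpp
#include <cassert>

// Same comparison function as in the original example.
inline bool lessFunc(const int a, const int b) {
  return a < b;
}

// Original form: comp is a run-time value of type bool (*)(int, int).
template <class Comp>
bool foo(const int a, const int b, Comp comp) {
  return comp(a, b);
}

// Hypothetical variant: the function is a compile-time template argument,
// so each instantiation names exactly one callee.
template <bool (*Comp)(int, int)>
bool fooStatic(const int a, const int b) {
  return Comp(a, b);
}

// Small usage check: both forms return the same result.
inline void checkStatic() {
  assert(fooStatic<lessFunc>(0, 1) == foo(0, 1, lessFunc));
  assert(fooStatic<lessFunc>(1, 1) == foo(1, 1, lessFunc));
}
```

The call site changes from foo(argc, 1, lessFunc) to fooStatic<lessFunc>(argc, 1), which is the price of moving the choice of function to compile time.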

>Or is it simply not allowed to do that kind of optimization for functions (i.e. function pointers)?

I believe there are two schools of thought.

School A:  C and (by extension) C++ are really nice macro assemblers that have wonderful optimization techniques.  But I don't expect the compiler to do anything other than what I tell it to do.

School B:  as long as the program's output (including side effects) is the same after some seriously impressive holistic optimizations, I expect aggressive optimizing.  (Even if that makes the million-function program turn into something that behaves as if it were all coded inside one gigantic main() routine -- if that's more efficient.)

I find that School A is the kind of optimization that I expect out of a C/C++ optimizing compiler.  Since I come from an assembly language (6502, 680x0) background, I think of C as a really neat macro assembler.  (But I think of C++ as an object oriented language more so than an excellent macro assembler; go figure.)

And that School B is the kind of optimization that I expect out of a Java compiler *AND* a continuously optimizing/re-optimizing JIT JVM.

You can get some of the School B benefits in School A, if you have a feedback mechanism in your development process to feed profiling information back into the compilation process.  (Or if your C/C++ code (or tokenized binary thereof) was interpreted live instead of compiled.)  But even then, I really wouldn't expect* the compiler to ignore what you told it to do because of some impressive holistic optimization analysis technique.

* Not that it's impossible; just not what I'd expect.  I'd be pleasantly surprised if it did the kind of optimization that even changes function signatures to remove (optimize away) parameters (such as the function pointer parameter specified in your example)!  Wow!

Sincerely,
--Eljay


* Re: Inlining functions vs. inlining member functions as templates
  2005-05-17 11:53     ` Eljay Love-Jensen
@ 2005-05-17 13:02       ` Peter Doerfler
  0 siblings, 0 replies; 4+ messages in thread
From: Peter Doerfler @ 2005-05-17 13:02 UTC (permalink / raw)
  To: gcc-help, eljay

Hi Eljay and others, just a few concluding remarks.

On Tuesday 17 May 2005 13:55, Eljay Love-Jensen wrote:
> Hi Peter,
>
> >Why doesn't the compiler manage to do the same thing for the function
> > version, that it apparently does between -O1 and -O3 for the struct
> > version?
>
> It's a pretty tall order.  You are saying "Ignore what I explicitly
> programmed the function to do, and inline what I specified as a function
> pointer parameter because it will be more efficient".
>

Well, first I wasn't really thinking so much about what I was _actually_ 
telling the function (or the compiler) to do. For some reason I had a feeling 
that the function version should be faster because no class is involved. 

With a bit of thinking and experimenting I understood the function pointer 
part, but wasn't sure that that was all. The last optimization step didn't 
seem that difficult when the pointer was already gone. That's why I asked 
here.

> .....

>  Since I come from an assembly language (6502,
> 680x0) background, I think of C as a really neat macro assembler.  (But I
> think of C++ as an object oriented language more so than an excellent macro
> assembler; go figure.)
>

I am totally on the OO side. My motivation for the function was (as I said 
above) that somehow I thought it should be faster in a very critical place (4 
nested loops). Oh, and the implementation is completely hidden from users of 
the library I'm working on, of course.

> And that School B is the kind of optimization that I expect out of a Java
> compiler *AND* a continuously optimizing/re-optimizing JIT JVM.
>
> You can get some of the School B benefits in School A, if you have a
> feedback mechanism in your development process to feed profiling
> information back into the compilation process.  (Or if your C/C++ code (or
> tokenized binary thereof) was interpreted live instead of compiled.)  But
> even then, I really wouldn't expect* the compiler to ignore what you told
> it to do because of some impressive holistic optimization analysis technique.
>
> * Not that it's impossible; just not what I'd expect.  I'd be pleasantly
> surprised if it did the kind of optimization that even changes function
> signatures to remove (optimize away) parameters (such as the function
> pointer parameter specified in your example)!  Wow!
>

I _am_ very impressed with the optimizations gcc4.0.x does. The purpose of 
this question was not to criticize what the compiler does (or does not). I 
just wanted to be sure never to fall into the same trap again. So I wanted to 
fully understand what was happening.

From what I learned I draw the conclusion that the optimization I thought of 
could be done but that maybe it should not be done. And it should definitely 
not be expected (more than is usual for optimizations).

> Sincerely,
> --Eljay

Thanks a lot for the detailed answer to my question. I'll be staying away from 
the function pointers now _and_ know why. 
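
For code that already has free comparison functions, one possible way to keep them while avoiding the pointer is a thin wrapper struct. This is only a sketch: LessFuncCaller and checkWrapper are names invented here, and it assumes the wrapped function's definition is visible at the point of the call.

```cpp
#include <cassert>

// Existing free function, as in the test case above.
inline bool lessFunc(const int a, const int b) {
  return a < b;
}

// Hypothetical adapter: calls lessFunc by name, so no pointer is involved.
struct LessFuncCaller {
  bool operator()(const int a, const int b) const {
    return lessFunc(a, b);
  }
};

template <class Comp>
bool foo(const int a, const int b, Comp comp) {
  return comp(a, b);
}

// Small usage check: the wrapper gives the same answers as lessFunc.
inline void checkWrapper() {
  assert(foo(1, 2, LessFuncCaller()));
  assert(!foo(2, 1, LessFuncCaller()));
}
```

Inside operator() the call to lessFunc is an ordinary direct call to an inline function, which is the case the compiler already handled well for the struct version.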

Best regards,
Peter


