* Inlining Improvements
From: Oskar Enoksson @ 1999-12-21 2:49 UTC
To: gcc

I read the announcement about the "inlining improvements" on the website.
It's great news! Is this code checked in already? If not, how soon could
it be available?

Thanks!

/* Oskar Enoksson, Linkoping, Sweden */

* Re: Inlining Improvements
From: Martin v. Loewis @ 1999-12-21 4:59 UTC
To: osken393
Cc: gcc

> I read the announcement about the "inlining improvements" on the website.
> It's great news! Is this code checked in already?

Yes, it is. Have a look at cp/ChangeLog, in particular

1999-12-05  Mark Mitchell  <mark@codesourcery.com>
1999-12-04  Mark Mitchell  <mark@codesourcery.com>
1999-11-25  Mark Mitchell  <mark@codesourcery.com>

and others.

Regards,
Martin

* Re: Inlining Improvements
From: Jamie Lokier @ 1999-12-21 8:04 UTC
To: Martin v. Loewis, n
Cc: osken393, gcc

Martin v. Loewis wrote:
> > I read the announcement about the "inlining improvements" on the website.
> > It's great news! Is this code checked in already?
>
> Yes, it is. Have a look at cp/ChangeLog, [...]

So we have a situation where the C++ compiler generates better code than
the C compiler from the same source?

Are there plans to add the tree inlining to C any time soon?

thanks,
-- Jamie

* Re: Inlining Improvements
From: Mark Mitchell @ 1999-12-21 8:55 UTC
To: jamie.lokier
Cc: martin, n, osken393, gcc

>>>>> "Jamie" == Jamie Lokier <jamie.lokier@cern.ch> writes:

    Jamie> So we have a situation where the C++ compiler generates
    Jamie> better code than the C compiler from the same source?

    Jamie> Are there plans to add the tree inlining to C any time
    Jamie> soon?

We (CodeSourcery) don't have any such plans, although we're actively
encouraging customers to do that work. I believe that Cygnus is working
on moving some of the function-at-a-time work, which is a necessary
prerequisite for the new inliner, into language-independent code.

I'm actually quite surprised that the tree-based inlining has made as
much of a difference (in the quality of the generated code) as it has in
some cases. Some MIPS benchmarks one of our customers had now run
twice as quickly -- somehow, the new inliner is making it easier for
the back-end to do its job, at least in some situations.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

* Re: Inlining Improvements
From: Jamie Lokier @ 1999-12-21 9:06 UTC
To: Mark Mitchell
Cc: martin, n, osken393, gcc

Mark Mitchell wrote:
> I'm actually quite surprised that the tree-based inlining has made as
> much a difference (in the quality of the generated code) as it has in
> some cases. Some MIPS benchmarks one of our customers had now run
> twice as quickly -- somehow, the new inliner is making it easier for
> the back-end to do its job, at least in some situations.

I'm not surprised. I long ago complained that the "inline function is
as fast as a macro" claim was totally bogus. With the tree-based inlining
changes, hopefully the claim will finally reflect reality.

Just recently I noticed that in an inline (C) function,
__builtin_constant_p was returning 1 just fine. But in a nested inline
function, it was not. Perhaps the large expression was too much for the
constant folder after RTL inlining, and it gave up.

Presumably if __builtin_constant_p is not reflecting constantness even
in some simple cases due to RTL inlining, early code generation
decisions based on "is this a constant" are also assuming "no".

Perhaps this gives some clue as to the kind of transformation the back
end should have been doing all along to do good RTL-based inlining? I
would not be surprised if such a transformation would be effective on
other kinds of code too.

I look forward to seeing if tree inlining gives better
__builtin_constant_p results.

-- Jamie

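A minimal sketch of the kind of test described in the message above, assuming GCC or a compatible compiler that provides __builtin_constant_p. The helper names (is_const, is_const_nested and the three *_case functions) are hypothetical and are not taken from the original post. Whether the one-level and two-level cases report 1 depends on the compiler version and optimization level, so no particular result is claimed for them; only the literal case is certain to yield 1.

```c
/* Sketch only: all function names here are hypothetical. */

/* Asks the compiler whether x is known to be constant at this point. */
static inline int is_const(int x)
{
    return __builtin_constant_p(x);
}

/* Adds one more level of inlining between the literal and the query. */
static inline int is_const_nested(int x)
{
    return is_const(x + 1);
}

/* A literal passed straight to the builtin: reported as constant. */
int literal_case(void)
{
    return __builtin_constant_p(42);
}

/* One inline level: the answer depends on compiler and -O level. */
int direct_case(void)
{
    return is_const(42);
}

/* Two inline levels: the answer again depends on compiler and -O level. */
int nested_case(void)
{
    return is_const_nested(42);
}
```

Comparing direct_case() and nested_case() across compiler versions and optimization levels is one way to see whether constantness survives inlining.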
* Re: Inlining Improvements
From: Jeffrey A Law @ 1999-12-21 9:43 UTC
To: Jamie Lokier
Cc: Martin v. Loewis, n, osken393, gcc

In message <19991221170444.B10482@pcep-jamie.cern.ch> you write:
  > So we have a situation where the C++ compiler generates better code than
  > the C compiler from the same source?
  >
  > Are there plans to add the tree inlining to C any time soon?

Cygnus is currently working on implementing functions as trees. The plan
is to submit it for review as soon as it's working.

jeff

* Re: Inlining Improvements
From: Martin v. Loewis @ 1999-12-21 9:46 UTC
To: jamie.lokier
Cc: n, osken393, gcc

> So we have a situation where the C++ compiler generates better code than
> the C compiler from the same source?

It might be possible to create examples. On the average, I doubt that.
If it is plain C code that also compiles as C++ code, inlining most
likely happens at the same places.

I believe that the main advantage is in terms of memory consumption in
the compiler itself.

> Are there plans to add the tree inlining to C any time soon?

I can't answer that question.

Regards,
Martin

* Re: Inlining Improvements
From: Jamie Lokier @ 1999-12-21 16:00 UTC
To: Martin v. Loewis
Cc: osken393, gcc

Martin v. Loewis wrote:
> > So we have a situation where the C++ compiler generates better code than
> > the C compiler from the same source?
>
> It might be possible to create examples. On the average, I doubt that.
> If it is plain C code that also compiles as C++ code, inlining most
> likely happens at the same places.

The point is that tree inlining seems to generate better code than RTL
inlining which the C compiler currently does.

-- Jamie

* Re: Inlining Improvements
From: Joe Buck @ 1999-12-21 16:08 UTC
To: Jamie Lokier
Cc: martin, osken393, gcc

> Martin v. Loewis wrote:
> > > So we have a situation where the C++ compiler generates better code than
> > > the C compiler from the same source?
> >
> > It might be possible to create examples. On the average, I doubt that.
> > If it is plain C code that also compiles as C++ code, inlining most
> > likely happens at the same places.
>
> The point is that tree inlining seems to generate better code than RTL
> inlining which the C compiler currently does.

The RTL inlining happens too late, after some objects have already been
assigned to memory. Thus passing an automatic struct or C++ class to an
inline function often results in dead stores when the RTL inliner is used.

* Re: Inlining Improvements
From: Martin v. Loewis @ 1999-12-22 0:35 UTC
To: jbuck
Cc: jamie.lokier, gcc

> The RTL inlining happens too late, after some objects have already been
> assigned to memory. Thus passing an automatic struct or C++ class to an
> inline function often results in dead stores when the RTL inliner is used.

Given this hint, I would guess that the code

struct A{
  int i;
  int j;
};

inline
int foo(struct A a)
{
  return a.i+a.j;
}

int bar()
{
  struct A a = {1,2};
  return foo(a);
}

should compile better now, right? Compiled with g++ -V2.95.2 -O2
-fomit-frame-pointer, I get

bar__Fv:
.LFB1:
	subl $28,%esp
.LCFI0:
	movl $0,8(%esp)
	movl $1,8(%esp)
	movl 8(%esp),%eax
	movl $0,12(%esp)
	movl $2,12(%esp)
	movl 12(%esp),%edx
	addl %edx,%eax
	addl $28,%esp
.LCFI1:
	ret
.LFE1:

I can clearly see the dead stores you are talking about. Now let's try
2.96 19991221:

bar__Fv:
.LFB1:
	subl $28, %esp
.LCFI0:
	movl $1, %eax
	movl $2, %edx
	movl %eax, 8(%esp)
	movl $3, %eax
	movl %edx, 12(%esp)
	addl $28, %esp
	ret
.LFE1:

Yes, it does eliminate some of the dead stores. Now compile it as
plain C (with either 2.95, or the new back-end):

bar:
	movl $3,%eax
	ret

So C is still much better than C++. I understand that 2.96 still
stores the final state of "a", because it believes the address of a
was taken, but I'm surprised it can't emit

	movl $1, 8(%esp)
	movl $2, 12(%esp)
	movl $3, %eax
	ret

since the values of %eax and %edx are not used after the store,
anymore. Also, the stack manipulation seems unnecessary. I was blaming
it on exception handling, but -fno-exceptions does not improve the
code.

Regards,
Martin

* Re: Inlining Improvements
From: Kevin Atkinson @ 1999-12-31 1:56 UTC
To: Martin v. Loewis
Cc: jbuck, jamie.lokier, gcc

"Martin v. Loewis" wrote:
>
> > The RTL inlining happens too late, after some objects have already been
> > assigned to memory. Thus passing an automatic struct or C++ class to an
> > inline function often results in dead stores when the RTL inliner is used.
>
> Given this hint, I would guess that the code
>
> struct A{
>   int i;
>   int j;
> };
>
> inline
> int foo(struct A a)
> {
>   return a.i+a.j;
> }
>
> int bar()
> {
>   struct A a = {1,2};
>   return foo(a);
> }
>
> should compile better now, right? Compiled with g++ -V2.95.2 -O2
> -fomit-frame-pointer, I get
>
> bar__Fv:
> .LFB1:
>         subl $28,%esp
> .LCFI0:
>         movl $0,8(%esp)
>         movl $1,8(%esp)
>         movl 8(%esp),%eax
>         movl $0,12(%esp)
>         movl $2,12(%esp)
>         movl 12(%esp),%edx
>         addl %edx,%eax
>         addl $28,%esp
> .LCFI1:
>         ret
> .LFE1:
>
> I can clearly see the dead stores you are talking about. Now let's try
> 2.96 19991221:
>
> bar__Fv:
> .LFB1:
>         subl $28, %esp
> .LCFI0:
>         movl $1, %eax
>         movl $2, %edx
>         movl %eax, 8(%esp)
>         movl $3, %eax
>         movl %edx, 12(%esp)
>         addl $28, %esp
>         ret
> .LFE1:
>
> Yes, it does eliminate some of the dead stores. Now compile it as
> plain C (with either 2.95, or the new back-end):
>
> bar:
>         movl $3,%eax
>         ret
>
> So C is still much better than C++. I understand that 2.96 still
> stores the final state of "a", because it believes the address of a
> was taken, but I'm surprised it can't emit
>
>         movl $1, 8(%esp)
>         movl $2, 12(%esp)
>         movl $3, %eax
>         ret
>
> since the values of %eax and %edx are not used after the store,
> anymore. Also, the stack manipulation seems unnecessary. I was blaming
> it on exception handling, but -fno-exceptions does not improve the
> code.

So when compiled as plain C, gcc does a better job at inlining than
C++? Or did you use a macro there and just not tell us?

--
Kevin Atkinson
kevinatk@home.com
http://metalab.unc.edu/kevina/

* Re: Inlining Improvements
From: Martin v. Loewis @ 1999-12-22 0:04 UTC
To: jamie.lokier
Cc: gcc

> The point is that tree inlining seems to generate better code than RTL
> inlining which the C compiler currently does.

Examples?

Martin

* Re: Inlining Improvements
From: Marcin Dalecki @ 1999-12-22 0:15 UTC
To: Martin v. Loewis
Cc: jamie.lokier, gcc

"Martin v. Loewis" wrote:
>
> > The point is that tree inlining seems to generate better code than RTL
> > inlining which the C compiler currently does.
>
> Examples?
>

No problem: look at some recent linux-2.3.xxx kernel source:

root:/usr/src/linux/fs# less buffer.c

Look for the macros:

#define _hashfn(dev,block)
#define hash(dev,block)

and their usage. Later down they are used with some intermediate value
which gets optimized out for the macro version, but which doesn't go
away for the inline versions thereof without reordering the code that
uses them.

--Marcin

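A small, hypothetical illustration of the macro-versus-inline comparison made in the message above. The bodies of _hashfn and hash are not shown in that message, so the hash below is invented purely for this sketch and is not the kernel's definition; TABLE_BITS, TABLE_MASK, HASH_MACRO, hash_inline and the two lookup functions are all assumed names. Both forms compute the same value; the open question in the thread is only whether the compiler removes the intermediate temporary in the inline form as readily as it does for the macro.

```c
/* Sketch only: this hash is invented for illustration. */
#define TABLE_BITS 10u
#define TABLE_MASK ((1u << TABLE_BITS) - 1u)

/* Macro form: expanded textually into the caller's expression. */
#define HASH_MACRO(dev, block) \
    ((((unsigned)(dev)) ^ ((unsigned)(block))) & TABLE_MASK)

/* Inline form: same arithmetic, but with a named intermediate value
 * that the compiler must eliminate after inlining. */
static inline unsigned hash_inline(unsigned dev, unsigned block)
{
    unsigned mixed = dev ^ block;   /* the intermediate value */
    return mixed & TABLE_MASK;
}

/* Caller using the macro. */
unsigned lookup_with_macro(unsigned dev, unsigned block)
{
    return HASH_MACRO(dev, block);
}

/* Caller using the inline function. */
unsigned lookup_with_inline(unsigned dev, unsigned block)
{
    return hash_inline(dev, block);
}
```

Compiling both callers with -S and comparing the assembly is one way to check whether a given compiler treats the two forms identically.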
* Re: Inlining Improvements
From: Martin v. Loewis @ 1999-12-22 1:56 UTC
To: dalecki
Cc: jamie.lokier, gcc

> > > The point is that tree inlining seems to generate better code than RTL
> > > inlining which the C compiler currently does.
> >
> > Examples?
> >
>
> No problem: looks at some recent linux-2.3.xxx kernel source:

Pardon? How is this example relevant to tree inlining? Tree inlining
is currently done only by the C++ compiler, and the kernel is not
compiled by the C++ compiler.

I have no doubt macros generate better code than inline functions, in
any version of gcc. But that was not my question.

Regards,
Martin

* Re: Inlining Improvements
From: Jamie Lokier @ 1999-12-22 6:57 UTC
To: Martin v. Loewis
Cc: gcc

Martin v. Loewis wrote:
> > The point is that tree inlining seems to generate better code than RTL
> > inlining which the C compiler currently does.
>
> Examples?

Mark Mitchell said so; I believe him. I haven't used the tree inlining
compiler yet.

There are many fine examples of trivial optimisation not being done with
inline functions that are done with macros. I assume most of them will
occur with tree inlining too (why not?). But I will have to wait and
see.

-- Jamie

* Re: Inlining Improvements
From: Mark Mitchell @ 1999-12-22 7:58 UTC
To: jamie.lokier
Cc: martin, gcc

    Martin v. Loewis wrote:
    > > The point is that tree inlining seems to generate better code than RTL
    > > inlining which the C compiler currently does.
    >
    > Examples?

    Mark Mitchell said so; I believe him. I haven't used the tree
    inlining compiler yet.

The LANL Pooma II library runs faster with the changes on some of its
benchmarks. There is *extreme* inlining going on there, and the final
loops are very small. So eliminating a single instruction, say one
dead store, could make a 30% difference.

    There are many fine examples of trivial optimisation not being
    done with inline functions that are done with macros. I assume
    most of them will occur with tree inlining too (why not?). But I
    will have to wait and see.

I concur. I don't expect typical code to see major wins, yet.

One of the things now easy to do (in theory) is scatter-gather of
loads and stores. That will expose small structures (with two
members, say, like a `complex' class) to the back-end optimizers
(which deal almost exclusively with REGs).

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

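A sketch of what exposing a two-member structure to register-based optimizers could mean in practice, written in C rather than as the C++ `complex' class mentioned above. The type and function names (struct cplx, cplx_add, sum_struct, sum_scalar) are assumptions made for this illustration, and the second function is hand-written to show the form a compiler might reach after splitting the aggregate into scalars; it is not output from any particular GCC version.

```c
/* Sketch only: names are hypothetical. */
struct cplx {
    double re;
    double im;
};

/* Small by-value helper of the kind that gets inlined heavily. */
static inline struct cplx cplx_add(struct cplx a, struct cplx b)
{
    struct cplx r = { a.re + b.re, a.im + b.im };
    return r;
}

/* Aggregate form: every step passes and returns a whole struct, so the
 * compiler must see through the aggregate to keep it out of memory. */
double sum_struct(const double *re, const double *im, int n)
{
    struct cplx acc = { 0.0, 0.0 };
    int i;
    for (i = 0; i < n; i++) {
        struct cplx x = { re[i], im[i] };
        acc = cplx_add(acc, x);
    }
    return acc.re + acc.im;
}

/* Hand-scalarized form: each member is its own scalar variable, which
 * is what register-oriented passes handle directly. */
double sum_scalar(const double *re, const double *im, int n)
{
    double acc_re = 0.0;
    double acc_im = 0.0;
    int i;
    for (i = 0; i < n; i++) {
        acc_re += re[i];
        acc_im += im[i];
    }
    return acc_re + acc_im;
}
```

If the aggregate is fully split after inlining, the two functions should compile to essentially the same loop; any remaining stack traffic in sum_struct is the kind of dead store discussed earlier in the thread.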