public inbox for gcc@gcc.gnu.org
* Re: 3.3 compile time regression (22400%)
       [not found] <Pine.LNX.4.44.0303181300580.22005-100000@bellatrix.tat.physik.uni-tuebingen.de>
@ 2003-03-18 17:04 ` Steven Bosscher
  2003-03-18 17:31   ` Richard Guenther
  2003-03-22 15:13 ` Steven Bosscher
  2003-03-22 17:03 ` Steven Bosscher
  2 siblings, 1 reply; 41+ messages in thread
From: Steven Bosscher @ 2003-03-18 17:04 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc

On Tue 18-03-2003, at 13:13, Richard Guenther wrote:
> Hi!
> 
> I'm experiencing a huge compile time regression with today's 3.3 compared to
> 3.2 when compiling the POOMA library. -ftime-report shows that the
> culprit is expand:
> 
> gcc-3.2:
> 
>  expand           :   0.83 ( 8%) usr   0.04 ( 8%) sys   0.94 ( 9%) wall
> 
> gcc-3.3:
> 
>  expand           :2151.80 (96%) usr  11.35 (44%) sys 2181.19 (95%) wall
> 
> This is with -O2 -fomit-frame-pointer -funroll-loops -ftime-report

Do you see this slowdown on the mainline too?

Greetz
Steven
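
For anyone comparing -ftime-report rows like the two quoted above, here is a minimal Python sketch that parses one row and computes the slowdown per column. It assumes only the row layout visible in this thread; the regular expression and the function name are illustrative and not part of GCC. Note that the report fuses columns when a number is wide (e.g. "sys2181.19"), so the pattern does not require whitespace before the number.

```python
import re

# One "<seconds> (<pct>%) <label>" group. There is no mandatory whitespace
# before the number, so fused columns such as "sys2181.19" still parse.
FIELD = re.compile(r"(\d+\.\d+)\s*\(\s*(\d+)%\)\s*(usr|sys|wall)")

def parse_time_report_line(line):
    """Return (pass_name, {label: seconds}) for one -ftime-report row."""
    name, _, rest = line.partition(":")
    times = {label: float(sec) for sec, _pct, label in FIELD.findall(rest)}
    return name.strip(), times

# The two "expand" rows quoted in the message above.
old = parse_time_report_line(
    " expand           :   0.83 ( 8%) usr   0.04 ( 8%) sys   0.94 ( 9%) wall")
new = parse_time_report_line(
    " expand           :2151.80 (96%) usr  11.35 (44%) sys2181.19 (95%) wall")

for label in ("usr", "sys", "wall"):
    ratio = new[1][label] / old[1][label]
    print(f"{label}: {ratio:.0f}x slower")
```

The ratio is computed per column because usr, sys and wall time regress by quite different factors in the numbers reported here.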

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 3.3 compile time regression (22400%)
  2003-03-18 17:04 ` 3.3 compile time regression (22400%) Steven Bosscher
@ 2003-03-18 17:31   ` Richard Guenther
  0 siblings, 0 replies; 41+ messages in thread
From: Richard Guenther @ 2003-03-18 17:31 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: gcc

On 18 Mar 2003, Steven Bosscher wrote:

> On Tue 18-03-2003, at 13:13, Richard Guenther wrote:
> > Hi!
> >
> > I'm experiencing a huge compile time regression with today's 3.3 compared to
> > 3.2 when compiling the POOMA library. -ftime-report shows that the
> > culprit is expand:
> >
> > gcc-3.2:
> >
> >  expand           :   0.83 ( 8%) usr   0.04 ( 8%) sys   0.94 ( 9%) wall
> >
> > gcc-3.3:
> >
> >  expand           :2151.80 (96%) usr  11.35 (44%) sys 2181.19 (95%) wall
> >
> > This is with -O2 -fomit-frame-pointer -funroll-loops -ftime-report
>
> Do you see this slowdown on the mainline too?

Yes, with mainline as of 20030313 I get

min-inline-insns             time
  100                         8.16
  150                        10.81
  200                        12.51
  250                       291.96

-fno-default-inline          12.03

Richard.

--
Richard Guenther <richard.guenther@uni-tuebingen.de>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/
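
The table in this message suggests a threshold effect rather than gradual growth. A small Python sketch makes the step-to-step ratios explicit; it uses only the four values reported above (the unit of the "time" column is not stated in the message, so none is assumed here), and the function name is illustrative.

```python
# (min-inline-insns value, compile time as reported in the message above)
timings = [(100, 8.16), (150, 10.81), (200, 12.51), (250, 291.96)]

def step_ratios(rows):
    """Ratio of each timing to the previous one, keyed by the new parameter value."""
    return {n2: t2 / t1 for (_n1, t1), (n2, t2) in zip(rows, rows[1:])}

ratios = step_ratios(timings)
for n, r in ratios.items():
    print(f"min-inline-insns={n}: {r:.1f}x the previous step")
```

The first two steps grow by well under 2x each, while the step from 200 to 250 is more than 20x, which is why the inliner limits are the natural suspect.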


* Re: 3.3 compile time regression (22400%)
       [not found] <Pine.LNX.4.44.0303181300580.22005-100000@bellatrix.tat.physik.uni-tuebingen.de>
  2003-03-18 17:04 ` 3.3 compile time regression (22400%) Steven Bosscher
@ 2003-03-22 15:13 ` Steven Bosscher
  2003-03-22 17:03 ` Steven Bosscher
  2 siblings, 0 replies; 41+ messages in thread
From: Steven Bosscher @ 2003-03-22 15:13 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc

On Tue 18-03-2003, at 13:13, Richard Guenther wrote:
> Hi!
> 
> I'm experiencing a huge compile time regression with today's 3.3 compared to
> 3.2 when compiling the POOMA library. -ftime-report shows that the
> culprit is expand:
> 
> gcc-3.2:
> 
>  expand           :   0.83 ( 8%) usr   0.04 ( 8%) sys   0.94 ( 9%) wall
> 
> gcc-3.3:
> 
>  expand           :2151.80 (96%) usr  11.35 (44%) sys 2181.19 (95%) wall

PR 2692 is another example of excessive "expand" time.  If you or
somebody else has a profiling compiler, it may be worth investigating that
PR as well.

Greetz
Steven



* Re: 3.3 compile time regression (22400%)
       [not found] <Pine.LNX.4.44.0303181300580.22005-100000@bellatrix.tat.physik.uni-tuebingen.de>
  2003-03-18 17:04 ` 3.3 compile time regression (22400%) Steven Bosscher
  2003-03-22 15:13 ` Steven Bosscher
@ 2003-03-22 17:03 ` Steven Bosscher
  2003-03-23  4:30   ` Janis Johnson
  2 siblings, 1 reply; 41+ messages in thread
From: Steven Bosscher @ 2003-03-22 17:03 UTC (permalink / raw)
  To: janis187, richard.guenther; +Cc: marc, gcc

Hi Janis, Richard,

This is a regression but somehow there is no PR for it, and no test
case.  Isn't being able to compile POOMA a release requirement?

Richard, can you open a PR for this issue and put that multi-megabyte
test case up for download somewhere?  If you don't have a public space
for your test case, then I can make some of my own available for you.

Janis, when the test case is there, do you have the cycles to track this
one down to the breaking patch somehow?

Greetz
Steven



On Tue 18-03-2003, at 13:13, Richard Guenther wrote:
> Hi!
> 
> I'm experiencing a huge compile time regression with today's 3.3 compared to
> 3.2 when compiling the POOMA library. -ftime-report shows that the
> culprit is expand:
> 
> gcc-3.2:
> 
>  expand           :   0.83 ( 8%) usr   0.04 ( 8%) sys   0.94 ( 9%) wall
> 
> gcc-3.3:
> 
>  expand           :2151.80 (96%) usr  11.35 (44%) sys 2181.19 (95%) wall
> 
> This is with -O2 -fomit-frame-pointer -funroll-loops -ftime-report
> 
> The picture does change if I specify -fno-default-inline, in this case
> compile times are comparable (expand time shrinks to 1.33s in this case).
> The complete build is then also faster with gcc 3.3 (3m17) than with
> gcc 3.2 (4m5).
> 
> I can provide a multi-megabyte .ii file as testcase, but maybe somebody
> already has an idea what is happening.
> 
> Richard.
> 
> --
> Richard Guenther <richard.guenther@uni-tuebingen.de>
> WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/



* Re: 3.3 compile time regression (22400%)
  2003-03-22 17:03 ` Steven Bosscher
@ 2003-03-23  4:30   ` Janis Johnson
  2003-03-23 22:14     ` Richard Guenther
  2003-03-25  2:27     ` Janis Johnson
  0 siblings, 2 replies; 41+ messages in thread
From: Janis Johnson @ 2003-03-23  4:30 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: janis187, richard.guenther, mark, gcc

On Sat, Mar 22, 2003 at 02:39:43PM +0100, Steven Bosscher wrote:
> Hi Janis, Richard,
> 
> This is a regression but somehow there is no PR for it, and no test
> case.  Isn't being able to compile POOMA a release requirement?
> 
> Richard, can you open a PR for this issue and put that multi-megabyte
> test case up for download somewhere?  If you don't have a public space
> for your test case, then I can make some of my own available for you.
> 
> Janis, when the test case is there, do you have the cycles to track this
> one down to the breaking patch somehow?

Yes, I can do that.

Richard Guenther, if you can tell me explicitly how to set things up
using an existing POOMA tarball then I can do that instead of getting
the giant preprocessed source file (this is a one-time offer).  I've
got the tarball that is referenced in
http://gcc.gnu.org/testing/testing-pooma.html.

Janis
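
Tracking a regression down to the breaking patch, as asked for above, is commonly done as a binary search over dated source snapshots. The following is a minimal Python sketch of that idea under stated assumptions: the dates and the `slow_after_made_up_date` predicate are made up for illustration, and a real check would build the compiler from the snapshot for that date and time the test case. It is not a description of the procedure actually used on this list.

```python
from datetime import date, timedelta

def find_first_bad(good, bad, is_bad):
    """Binary search for the earliest date on which is_bad(date) is true.

    Assumes is_bad(good) is False, is_bad(bad) is True, and that the
    behaviour flips exactly once between the two dates.
    """
    while (bad - good).days > 1:
        mid = good + timedelta(days=(bad - good).days // 2)
        if is_bad(mid):
            bad = mid      # regression already present: look earlier
        else:
            good = mid     # still fast: look later
    return bad

# Hypothetical stand-in for "build the snapshot and time the test case".
def slow_after_made_up_date(d):
    return d >= date(2003, 1, 15)

first_bad = find_first_bad(date(2002, 8, 1), date(2003, 3, 18),
                           slow_after_made_up_date)
print(first_bad)
```

Each probe halves the remaining date range, so a window of several months needs fewer than ten builds, which matters when a single compile of the test case can take a very long time.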


* Re: 3.3 compile time regression (22400%)
  2003-03-23  4:30   ` Janis Johnson
@ 2003-03-23 22:14     ` Richard Guenther
  2003-03-24  3:26       ` Richard Guenther
  2003-03-25  2:27     ` Janis Johnson
  1 sibling, 1 reply; 41+ messages in thread
From: Richard Guenther @ 2003-03-23 22:14 UTC (permalink / raw)
  To: Janis Johnson; +Cc: Steven Bosscher, richard.guenther, mark, gcc

On Sat, 22 Mar 2003, Janis Johnson wrote:

> On Sat, Mar 22, 2003 at 02:39:43PM +0100, Steven Bosscher wrote:
> > Hi Janis, Richard,
> >
> > This is a regression but somehow there is no PR for it, and no test
> > case.  Isn't being able to compile POOMA a release requirement?
> >
> > Richard, can you open a PR for this issue and put that multi-megabyte
> > test case up for download somewhere?  If you don't have a public space
> > for your test case, then I can make some of my own available for you.
> >
> > Janis, when the test case is there, do you have the cycles to track this
> > one down to the breaking patch somehow?
>
> Yes, I can do that.
>
> Richard Guenther, if you can tell me explicitly how to set things up
> using an existing POOMA tarball then I can do that instead of getting
> the giant preprocessed source file (this is a one-time offer).  I've
> got the tarball that is referenced in
> http://gcc.gnu.org/testing/testing-pooma.html.

I don't know if the tarball produces the same failure (I'm using current
CVS), but I can easily (tomorrow morning) construct two testcases that
show excessive expand time compared to 3.2 and put both online for
download (and open a PR for them).

Richard.


* Re: 3.3 compile time regression (22400%)
  2003-03-23 22:14     ` Richard Guenther
@ 2003-03-24  3:26       ` Richard Guenther
  2003-03-24  8:47         ` Steven Bosscher
  0 siblings, 1 reply; 41+ messages in thread
From: Richard Guenther @ 2003-03-24  3:26 UTC (permalink / raw)
  To: Richard Guenther
  Cc: Janis Johnson, Steven Bosscher, richard.guenther, mark, gcc

On Sun, 23 Mar 2003, Richard Guenther wrote:

> On Sat, 22 Mar 2003, Janis Johnson wrote:
>
> > On Sat, Mar 22, 2003 at 02:39:43PM +0100, Steven Bosscher wrote:
> > > Hi Janis, Richard,
> > >
> > > This is a regression but somehow there is no PR for it, and no test
> > > case.  Isn't being able to compile POOMA a release requirement?
> > >
> > > Richard, can you open a PR for this issue and put that multi-megabyte
> > > test case up for download somewhere?  If you don't have a public space
> > > for your test case, then I can make some of my own available for you.
> > >
> > > Janis, when the test case is there, do you have the cycles to track this
> > > one down to the breaking patch somehow?
> >
> > Yes, I can do that.
> >
> > Richard Guenther, if you can tell me explicitly how to set things up
> > using an existing POOMA tarball then I can do that instead of getting
> > the giant preprocessed source file (this is a one-time offer).  I've
> > got the tarball that is referenced in
> > http://gcc.gnu.org/testing/testing-pooma.html.
>
> I don't know if the tarball produces the same failure (I'm using current
> CVS), but I can easily (tomorrow morning) construct two testcases that
> show excessive expand time compared to 3.2 and put both online for
> download (and open a PR for them).

This is optimization/10196 now, and I have linked the POOMA testcase
from it
(http://www.tat.physik.uni-tuebingen.de/~rguenth/DynamicLayout2.cmpl.ii.gz).
This is the testcase with the most serious regression in compilation speed
from the POOMA library (all other files regress, too, but not that much).
It's rather large, sorry about that.

If you need more information, please tell me.

Richard.


* Re: 3.3 compile time regression (22400%)
  2003-03-24  3:26       ` Richard Guenther
@ 2003-03-24  8:47         ` Steven Bosscher
  2003-03-24 10:06           ` Richard Guenther
                             ` (2 more replies)
  0 siblings, 3 replies; 41+ messages in thread
From: Steven Bosscher @ 2003-03-24  8:47 UTC (permalink / raw)
  To: Richard Guenther; +Cc: rmark, gcc, Janis Johnson

On Sun 23-03-2003, at 23:21, Richard Guenther wrote:
> On Sun, 23 Mar 2003, Richard Guenther wrote:
> This is optimization/10196 now, and I have linked the POOMA testcase
> from it
> (http://www.tat.physik.uni-tuebingen.de/~rguenth/DynamicLayout2.cmpl.ii.gz).
> This is the testcase with the most serious regression in compilation speed
> from the POOMA library (all other files regress, too, but not that much).
> It's rather large, sorry about that.
> 
> If you need more information, please tell me.

Richard,

Mark Mitchell has fixed 7086, which is another PR about excessive time
spent in "expand".  In this PR the slowdown is not nearly as bad as
yours, so the issues could be unrelated, but maybe you try this once
more with an updated GCC 3.3 tree?

I tried to download your testcase, but the indicated URL
"http://www.tat.physik.uni-tuebingen.de/~rguenth/DynamicLayout2.ii.gz"
does not exist.

Greetz
Steven



* Re: 3.3 compile time regression (22400%)
  2003-03-24  8:47         ` Steven Bosscher
@ 2003-03-24 10:06           ` Richard Guenther
  2003-03-24 10:50           ` 3.3 Bootstrap failure (was Re: 3.3 compile time regression (22400%)) Richard Guenther
  2003-03-24 15:21           ` 3.3 compile time regression (22400%) Richard Guenther
  2 siblings, 0 replies; 41+ messages in thread
From: Richard Guenther @ 2003-03-24 10:06 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: rmark, gcc, Janis Johnson

On 24 Mar 2003, Steven Bosscher wrote:

> On Sun 23-03-2003, at 23:21, Richard Guenther wrote:
> > On Sun, 23 Mar 2003, Richard Guenther wrote:
> > This is optimization/10196 now, and I have linked the POOMA testcase
> > from it
> > (http://www.tat.physik.uni-tuebingen.de/~rguenth/DynamicLayout2.cmpl.ii.gz).
> > This is the testcase with the most serious regression in compilation speed
> > from the POOMA library (all other files regress, too, but not that much).
> > It's rather large, sorry about that.
> >
> > If you need more information, please tell me.
>
> Richard,
>
> Mark Mitchell has fixed 7086, which is another PR about excessive time
> spent in "expand".  In this PR the slowdown is not nearly as bad as
> yours, so the issues could be unrelated, but maybe you try this once
> more with an updated GCC 3.3 tree?
>
> I tried to download your testcase, but the indicated URL
> "http://www.tat.physik.uni-tuebingen.de/~rguenth/DynamicLayout2.ii.gz"
> does not exist.

Whoops - I should not file PRs after midnight ;) The correct URL is
http://www.tat.physik.uni-tuebingen.de/~rguenth/DynamicLayout2.cmpl.ii.gz

I'm currently bootstrapping new gcc 3.3 and will report back changes in
time report.

Richard.

--
Richard Guenther <richard dot guenther at uni-tuebingen dot de>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/


* 3.3 Bootstrap failure (was Re: 3.3 compile time regression (22400%))
  2003-03-24  8:47         ` Steven Bosscher
  2003-03-24 10:06           ` Richard Guenther
@ 2003-03-24 10:50           ` Richard Guenther
  2003-03-24 15:21           ` 3.3 compile time regression (22400%) Richard Guenther
  2 siblings, 0 replies; 41+ messages in thread
From: Richard Guenther @ 2003-03-24 10:50 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: mmitchel, gcc, Janis Johnson

On 24 Mar 2003, Steven Bosscher wrote:

> On Sun 23-03-2003, at 23:21, Richard Guenther wrote:
> > On Sun, 23 Mar 2003, Richard Guenther wrote:
> > This is optimization/10196 now, and I have linked the POOMA testcase
> > from it
> > (http://www.tat.physik.uni-tuebingen.de/~rguenth/DynamicLayout2.cmpl.ii.gz).
> > This is the testcase with the most serious regression in compilation speed
> > from the POOMA library (all other files regress, too, but not that much).
> > It's rather large, sorry about that.
> >
> > If you need more information, please tell me.
>
> Richard,
>
> Mark Mitchell has fixed 7086, which is another PR about excessive time
> spent in "expand".  In this PR the slowdown is not nearly as bad as
> yours, so the issues could be unrelated, but maybe you try this once
> more with an updated GCC 3.3 tree?

Current 3.3 HEAD does not bootstrap for me (SuSE 8.0):

/bin/sh ../libtool --tag CXX --tag disable-shared --mode=link
/tmp/gcc-obj/gcc/xgcc -shared-libgcc -B/tmp/gcc-obj/gcc/ -nostdinc++
-L/tmp/gcc-obj/i686-pc-linux-gnu/libstdc++-v3/src
-L/tmp/gcc-obj/i686-pc-linux-gnu/libstdc++-v3/src/.libs
-B-dir/i686-pc-linux-gnu/bin/ -B-dir/i686-pc-linux-gnu/lib/ -isystem
-dir/i686-pc-linux-gnu/include -Wl,-O1   -fno-implicit-templates
-prefer-pic -Wall -Wno-format -W -Wwrite-strings -Winline
-fdiagnostics-show-location=once  -ffunction-sections -fdata-sections
-o libsupc++.la -rpath -dir/lib/.  del_op.lo del_opnt.lo del_opv.lo
del_opvnt.lo eh_alloc.lo eh_aux_runtime.lo eh_catch.lo eh_exception.lo
eh_globals.lo eh_personality.lo eh_terminate.lo eh_throw.lo eh_type.lo
guard.lo new_handler.lo new_op.lo new_opnt.lo new_opv.lo new_opvnt.lo
pure.lo tinfo.lo tinfo2.lo vec.lo cxa_demangle.lo dyn-string.lo  -lm
libtool: link: only absolute run-paths are allowed
make[4]: *** [libsupc++.la] Error 1
make[4]: Leaving directory
`/tmp/gcc-obj/i686-pc-linux-gnu/libstdc++-v3/libsupc++'
make[3]: *** [all-recursive] Error 1
make[3]: Leaving directory `/tmp/gcc-obj/i686-pc-linux-gnu/libstdc++-v3'
make[2]: *** [all-recursive-am] Error 2
make[2]: Leaving directory `/tmp/gcc-obj/i686-pc-linux-gnu/libstdc++-v3'
make[1]: *** [all-target-libstdc++-v3] Error 2
make[1]: Leaving directory `/tmp/gcc-obj'
make: *** [bootstrap] Error 2
ln: invalid option -- r
Try `ln --help' for more information.

Richard.

--
Richard Guenther <richard dot guenther at uni-tuebingen dot de>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/


* Re: 3.3 compile time regression (22400%)
  2003-03-24  8:47         ` Steven Bosscher
  2003-03-24 10:06           ` Richard Guenther
  2003-03-24 10:50           ` 3.3 Bootstrap failure (was Re: 3.3 compile time regression (22400%)) Richard Guenther
@ 2003-03-24 15:21           ` Richard Guenther
  2 siblings, 0 replies; 41+ messages in thread
From: Richard Guenther @ 2003-03-24 15:21 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: gcc, Janis Johnson

On 24 Mar 2003, Steven Bosscher wrote:

> On Sun 23-03-2003, at 23:21, Richard Guenther wrote:
> > On Sun, 23 Mar 2003, Richard Guenther wrote:
> > This is optimization/10196 now, and I have linked the POOMA testcase
> > from it
> > (http://www.tat.physik.uni-tuebingen.de/~rguenth/DynamicLayout2.cmpl.ii.gz).
> > This is the testcase with the most serious regression in compilation speed
> > from the POOMA library (all other files regress, too, but not that much).
> > It's rather large, sorry about that.
> >
> > If you need more information, please tell me.
>
> Richard,
>
> Mark Mitchell has fixed 7086, which is another PR about excessive time
> spent in "expand".  In this PR the slowdown is not nearly as bad as
> yours, so the issues could be unrelated, but maybe you try this once
> more with an updated GCC 3.3 tree?

Managed to bootstrap current 3.3, the numbers are a lot better now, but
still about 10 times slower than 3.2 (see PR).

Richard.

--
Richard Guenther <richard dot guenther at uni-tuebingen dot de>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/


* Re: 3.3 compile time regression (22400%)
  2003-03-23  4:30   ` Janis Johnson
  2003-03-23 22:14     ` Richard Guenther
@ 2003-03-25  2:27     ` Janis Johnson
  2003-03-25 15:11       ` Richard Guenther
  1 sibling, 1 reply; 41+ messages in thread
From: Janis Johnson @ 2003-03-25  2:27 UTC (permalink / raw)
  To: Janis Johnson; +Cc: Steven Bosscher, richard.guenther, mark, gcc

On Sat, Mar 22, 2003 at 01:51:57PM -0800, Janis Johnson wrote:
> On Sat, Mar 22, 2003 at 02:39:43PM +0100, Steven Bosscher wrote:
> > 
> > Richard, can you open a PR for this issue and put that multi-megabyte
> > test case up for download somewhere?  If you don't have a public space
> > for your test case, then I can make some of my own available for you.
> > 
> > Janis, when the test case is there, do you have the cycles to track this
> > one down to the breaking patch somehow?
> 
> Yes, I can do that.

It turns out that I can't investigate when the performance changed for
this.  When the test case takes a long time to compile, it also runs
out of memory.  I don't have access to a bigger i686-linux system and
can't use the preprocessed source file on other systems.

Janis


* Re: 3.3 compile time regression (22400%)
  2003-03-25  2:27     ` Janis Johnson
@ 2003-03-25 15:11       ` Richard Guenther
  0 siblings, 0 replies; 41+ messages in thread
From: Richard Guenther @ 2003-03-25 15:11 UTC (permalink / raw)
  To: Janis Johnson; +Cc: Steven Bosscher, richard.guenther, mark, gcc

On Mon, 24 Mar 2003, Janis Johnson wrote:

> On Sat, Mar 22, 2003 at 01:51:57PM -0800, Janis Johnson wrote:
> > On Sat, Mar 22, 2003 at 02:39:43PM +0100, Steven Bosscher wrote:
> > >
> > > Richard, can you open a PR for this issue and put that multi-megabyte
> > > test case up for download somewhere?  If you don't have a public space
> > > for your test case, then I can make some of my own available for you.
> > >
> > > Janis, when the test case is there, do you have the cycles to track this
> > > one down to the breaking patch somehow?
> >
> > Yes, I can do that.
>
> It turns out that I can't investigate when the performance changed for
> this.  When the test case takes a long time to compile, it also runs
> out of memory.  I don't have access to a bigger i686-linux system and
> can't use the preprocessed source file on other systems.

Maybe the patch identified to cause c++/10047 is related - it defers
inline functions (see also c++/4803 which the patch fixes).

Richard.

--
Richard Guenther <richard dot guenther at uni-tuebingen dot de>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/


* Re: 3.3 compile time regression (22400%)
  2003-03-19  0:40   ` Steven Bosscher
  2003-03-19  1:20     ` John David Anglin
@ 2003-03-20  6:23     ` Albert Chin
  1 sibling, 0 replies; 41+ messages in thread
From: Albert Chin @ 2003-03-20  6:23 UTC (permalink / raw)
  To: gcc

On Wed, Mar 19, 2003 at 12:58:46AM +0100, Steven Bosscher wrote:
> On Wed 19-03-2003, at 00:42, Michael Matz wrote:
> > Hi,
> > 
> > On Tue, 18 Mar 2003, John David Anglin wrote:
> > 
> > > I was finally successful in building Dialogs.C from LyX 1.3.0 on
> > > hppa2.0-hp-hpux11.11 with -O2 and -fno-default-inline:
> > >
> > >  life analysis         :17359.67 (92%) usr   9.10 (16%) sys 17522.54 (90%) wall
> > 
> > Yikes!  Such behaviour would normally only be expected from fully
> > connected CFGs if at all, or CFGs with big tree-width, but normal C++ code
> > even if using exceptions doesn't result in a CFG _that_ ugly.  Do you care
> > enough to reduce the file a bit (so you have a chance to play with it
> > without becoming dust), create a profileable cc1plus and post the
> > hotspots?
> 
> Is there a PR for this issue now?  This issue obviously is quite bad for
> a 3.3 release, but I couldn't find a PR for it.

I just filed PR10160 about this. This is against Solaris 9/SPARC. Time
isn't consumed by "life analysis" this time but by:
 scheduling            :7510.98 (87%) usr   0.74 ( 4%) sys 13791.00 (86%) wall

This is with GNU C++ version 3.3 20030319 and "-O2
-fno-default-inline". I run out of memory when -fno-default-inline is
left out.

I'm building a profiling cc1plus on HP-UX 11.00 now to run tests.

-- 
albert chin (china@thewrittenword.com)


* RE: 3.3 compile time regression (22400%)
@ 2003-03-19 23:19 S. Bosscher
  0 siblings, 0 replies; 41+ messages in thread
From: S. Bosscher @ 2003-03-19 23:19 UTC (permalink / raw)
  To: 'John David Anglin ', 'matz@suse.de '
  Cc: 'gcc@gcc.gnu.org '

John David Anglin wrote:
> One thing that I noticed in working on PR10062 is that the code
> generated for computed gotos seems horribly inefficient,  

Isn't that what the patch in PR 2001 is for?

Greetz
Steven


* Re: 3.3 compile time regression (22400%)
  2003-03-19  0:29 ` Michael Matz
  2003-03-19  0:40   ` John David Anglin
  2003-03-19  0:40   ` Steven Bosscher
@ 2003-03-19 20:12   ` John David Anglin
  2 siblings, 0 replies; 41+ messages in thread
From: John David Anglin @ 2003-03-19 20:12 UTC (permalink / raw)
  To: Michael Matz; +Cc: gcc

> On Tue, 18 Mar 2003, John David Anglin wrote:
> 
> > I was finally successful in building Dialogs.C from LyX 1.3.0 on
> > hppa2.0-hp-hpux11.11 with -O2 and -fno-default-inline:
> >
> >  life analysis         :17359.67 (92%) usr   9.10 (16%) sys 17522.54 (90%) wall
> 
> Yikes!  Such behaviour would normally only be expected from fully
> connected CFGs if at all, or CFGs with big tree-width, but normal C++ code
> even if using exceptions doesn't result in a CFG _that_ ugly.  Do you care
> enough to reduce the file a bit (so you have a chance to play with it
> without becoming dust), create a profileable cc1plus and post the
> hotspots?

One thing that I noticed in working on PR10062 is that the code
generated for computed gotos seems horribly inefficient, at least
in some circumstances.  Where we needed the long branch fix, there
was a linear sequence of ~1500 comparisons of the same register
with a constant int.  The value of the constant int increased
by one at each step.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)
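
To illustrate why a linear sequence of roughly 1500 compare-with-constant tests is costly at run time, here is a small Python model that counts comparisons for a linear chain versus an indexed table. It is only a sketch of the general trade-off: the function names are hypothetical, and it makes no claim about what code GCC generates or should generate for computed gotos.

```python
def linear_dispatch(value, n_targets):
    """Mimic a chain of 'if (reg == k)' tests with k increasing by one.

    Returns (matched_target, comparisons_performed).
    """
    comparisons = 0
    for k in range(n_targets):
        comparisons += 1
        if value == k:
            return k, comparisons
    return None, comparisons

def table_dispatch(value, table):
    """Index straight into a table of targets: one range check, one lookup."""
    if 0 <= value < len(table):
        return table[value], 1
    return None, 1

N = 1500  # roughly the chain length reported in the message above
chain_result = linear_dispatch(N - 1, N)
table_result = table_dispatch(N - 1, list(range(N)))
print(chain_result, table_result)
```

In the worst case the chain performs one comparison per target, while the table needs a constant amount of work regardless of how many targets there are.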


* Re: 3.3 compile time regression (22400%)
  2003-03-19  0:40   ` Steven Bosscher
@ 2003-03-19  1:20     ` John David Anglin
  2003-03-20  6:23     ` Albert Chin
  1 sibling, 0 replies; 41+ messages in thread
From: John David Anglin @ 2003-03-19  1:20 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: matz, gcc

> > Yikes!  Such behaviour would normally only be expected from fully
> > connected CFGs if at all, or CFGs with big tree-width, but normal C++ code
> > even if using exceptions doesn't result in a CFG _that_ ugly.  Do you care
> > enough to reduce the file a bit (so you have a chance to play with it
> > without becoming dust), create a profileable cc1plus and post the
> > hotspots?
> 
> Is there a PR for this issue now?  This issue obviously is quite bad for
> a 3.3 release, but I couldn't find a PR for it.

Not that I am aware of.  I am doing final testing of a patch to fix
the backend problem that was the subject of PR10062 but the fix doesn't
address the above problem.  We go from about a 90s compilation with
-fno-inline, to 5.4 hours with -fno-default-inline, to virtual memory
exhausted with no inline option specified.  The amount of virtual memory
required for the latter probably exceeds the maximum data size possible
on the PA for the 32-bit runtime.  For my last try, the maximum data
size was set to 768MB.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)
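
The "5.4 hours" figure can be cross-checked against the -ftime-report row quoted elsewhere in this thread, where life analysis takes 17522.54 s of wall time and is listed as 90% of the total. A short Python sketch of that arithmetic follows; it assumes the percentage is rounded to the nearest whole percent, and the function name is illustrative.

```python
def total_from_share(seconds, percent):
    """Estimate a total from one part and its whole-number percentage share.

    Returns (low, mid, high), allowing for the share having been rounded
    by up to half a percentage point in either direction.
    """
    low = seconds / ((percent + 0.5) / 100)
    mid = seconds / (percent / 100)
    high = seconds / ((percent - 0.5) / 100)
    return low, mid, high

# "life analysis" wall time and share from the -ftime-report row in this thread.
low, mid, high = total_from_share(17522.54, 90)
print(f"total wall time: {mid / 3600:.2f} h "
      f"(between {low / 3600:.2f} and {high / 3600:.2f} h)")
```

The estimate lands at about 5.4 hours, consistent with the total quoted in the message above.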


* Re: 3.3 compile time regression (22400%)
  2003-03-19  0:29 ` Michael Matz
  2003-03-19  0:40   ` John David Anglin
@ 2003-03-19  0:40   ` Steven Bosscher
  2003-03-19  1:20     ` John David Anglin
  2003-03-20  6:23     ` Albert Chin
  2003-03-19 20:12   ` John David Anglin
  2 siblings, 2 replies; 41+ messages in thread
From: Steven Bosscher @ 2003-03-19  0:40 UTC (permalink / raw)
  To: Michael Matz; +Cc: John David Anglin, gcc

On Wed 19-03-2003, at 00:42, Michael Matz wrote:
> Hi,
> 
> On Tue, 18 Mar 2003, John David Anglin wrote:
> 
> > I was finally successful in building Dialogs.C from LyX 1.3.0 on
> > hppa2.0-hp-hpux11.11 with -O2 and -fno-default-inline:
> >
> >  life analysis         :17359.67 (92%) usr   9.10 (16%) sys 17522.54 (90%) wall
> 
> Yikes!  Such behaviour would normally only be expected from fully
> connected CFGs if at all, or CFGs with big tree-width, but normal C++ code
> even if using exceptions doesn't result in a CFG _that_ ugly.  Do you care
> enough to reduce the file a bit (so you have a chance to play with it
> without becoming dust), create a profileable cc1plus and post the
> hotspots?

Is there a PR for this issue now?  This issue obviously is quite bad for
a 3.3 release, but I couldn't find a PR for it.

Greetz
Steven


* Re: 3.3 compile time regression (22400%)
  2003-03-19  0:29 ` Michael Matz
@ 2003-03-19  0:40   ` John David Anglin
  2003-03-19  0:40   ` Steven Bosscher
  2003-03-19 20:12   ` John David Anglin
  2 siblings, 0 replies; 41+ messages in thread
From: John David Anglin @ 2003-03-19  0:40 UTC (permalink / raw)
  To: Michael Matz; +Cc: gcc, law

> Yikes!  Such behaviour would normally only be expected from fully
> connected CFGs if at all, or CFGs with big tree-width, but normal C++ code
> even if using exceptions doesn't result in a CFG _that_ ugly.  Do you care
> enough to reduce the file a bit (so you have a chance to play with it
> without becoming dust), create a profileable cc1plus and post the
> hotspots?

I'm off on vacation next week, so I don't have much more time for this.
There are a couple more issues that I want to look at before I go, so
any such analysis on my part will have to wait.  The file is attached
to PR10062 and according to Albert the problem is present on all ports
to one degree or another.  For some reason, the problem seems worst
on the PA.

> Which GCC version btw.  Also 3.3?

Yes.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)


* Re: 3.3 compile time regression (22400%)
  2003-03-18 22:57 John David Anglin
  2003-03-18 23:54 ` Albert Chin
@ 2003-03-19  0:29 ` Michael Matz
  2003-03-19  0:40   ` John David Anglin
                     ` (2 more replies)
  1 sibling, 3 replies; 41+ messages in thread
From: Michael Matz @ 2003-03-19  0:29 UTC (permalink / raw)
  To: John David Anglin; +Cc: gcc

Hi,

On Tue, 18 Mar 2003, John David Anglin wrote:

> I was finally successful in building Dialogs.C from LyX 1.3.0 on
> hppa2.0-hp-hpux11.11 with -O2 and -fno-default-inline:
>
>  life analysis         :17359.67 (92%) usr   9.10 (16%) sys 17522.54 (90%) wall

Yikes!  Such behaviour would normally only be expected from fully
connected CFGs if at all, or CFGs with big tree-width, but normal C++ code
even if using exceptions doesn't result in a CFG _that_ ugly.  Do you care
enough to reduce the file a bit (so you have a chance to play with it
without becoming dust), create a profileable cc1plus and post the
hotspots?

Which GCC version btw.  Also 3.3?


Ciao,
Michael.


* Re: 3.3 compile time regression (22400%)
  2003-03-18 22:57 John David Anglin
@ 2003-03-18 23:54 ` Albert Chin
  2003-03-19  0:29 ` Michael Matz
  1 sibling, 0 replies; 41+ messages in thread
From: Albert Chin @ 2003-03-18 23:54 UTC (permalink / raw)
  To: gcc

On Tue, Mar 18, 2003 at 04:36:43PM -0500, John David Anglin wrote:
> > I'm experiencing the same problems on Solaris 9/SPARC, HP-UX 11.00 and
> > 11i, and Redhat Linux 7.1 building LyX 1.3.0
> > (src/frontends/qt2/Dialogs.C). On Solaris and HP-UX, I run out of
> > memory. On RHL 7.1, build time takes 25 minutes.
> > 
> > On Solaris, -fno-default-inline brings the build time to 1 hour. It
> > brings the build time to 4 minutes on RHL 7.1.
> 
> I was finally successful in building Dialogs.C from LyX 1.3.0 on
> hppa2.0-hp-hpux11.11 with -O2 and -fno-default-inline:
> 
> Execution times (seconds)
>  ...
>  life analysis         :17359.67 (92%) usr   9.10 (16%) sys 17522.54 (90%) wall
>  ...
>  expand                :  30.61 ( 0%) usr  18.70 (32%) sys  70.85 ( 0%) wall
> 
> As can be seen, almost all the time is taken up in life analysis.

What is "life analysis"? Others have seen similar problems during
expand.

-- 
albert chin (china@thewrittenword.com)
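
As general background to the question above: "life analysis" in a compiler usually means liveness analysis, i.e. working out which values may still be read later at each point of the control flow graph (CFG). Below is a minimal textbook-style sketch in Python of the iterative form of that computation. It is not GCC's implementation; the block names, variable names and the `liveness` function are made up for illustration.

```python
def liveness(succ, use, defs):
    """Iterate live_in/live_out sets to a fixed point over a CFG.

    succ: block -> list of successor blocks
    use:  block -> variables read before any write in the block
    defs: block -> variables written in the block
    """
    live_in = {b: set() for b in succ}
    live_out = {b: set() for b in succ}
    passes = 0
    changed = True
    while changed:
        changed = False
        passes += 1
        for b in succ:
            # Live on exit: anything live on entry to some successor.
            out = set()
            for s in succ[b]:
                out |= live_in[s]
            # Live on entry: read here, or live on exit and not overwritten.
            new_in = use[b] | (out - defs[b])
            if out != live_out[b] or new_in != live_in[b]:
                live_out[b], live_in[b] = out, new_in
                changed = True
    return live_in, live_out, passes

# Toy CFG: entry -> loop -> exit, with loop also branching back to itself.
succ = {"entry": ["loop"], "loop": ["loop", "exit"], "exit": []}
use = {"entry": set(), "loop": {"i", "n"}, "exit": {"s"}}
defs = {"entry": {"i", "n", "s"}, "loop": {"i", "s"}, "exit": set()}

live_in, live_out, passes = liveness(succ, use, defs)
print(sorted(live_in["loop"]), sorted(live_out["loop"]), passes)
```

Every pass walks all edges of the graph and the loop repeats until nothing changes, so a CFG with very many edges or large live sets makes each pass expensive, and more passes may be needed before the sets settle.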


* Re: 3.3 compile time regression (22400%)
@ 2003-03-18 22:57 John David Anglin
  2003-03-18 23:54 ` Albert Chin
  2003-03-19  0:29 ` Michael Matz
  0 siblings, 2 replies; 41+ messages in thread
From: John David Anglin @ 2003-03-18 22:57 UTC (permalink / raw)
  To: gcc; +Cc: gcc

> I'm experiencing the same problems on Solaris 9/SPARC, HP-UX 11.00 and
> 11i, and Redhat Linux 7.1 building LyX 1.3.0
> (src/frontends/qt2/Dialogs.C). On Solaris and HP-UX, I run out of
> memory. On RHL 7.1, build time takes 25 minutes.
> 
> On Solaris, -fno-default-inline brings the build time to 1 hour. It
> brings the build time to 4 minutes on RHL 7.1.

I was finally successful in building Dialogs.C from LyX 1.3.0 on
hppa2.0-hp-hpux11.11 with -O2 and -fno-default-inline:

Execution times (seconds)
 garbage collection    :   6.68 ( 0%) usr   0.93 ( 2%) sys 144.26 ( 1%) wall
 cfg construction      :   4.85 ( 0%) usr   0.09 ( 0%) sys   4.93 ( 0%) wall
 cfg cleanup           :   3.08 ( 0%) usr   0.06 ( 0%) sys   3.64 ( 0%) wall
 trivially dead code   :   3.95 ( 0%) usr   0.05 ( 0%) sys   3.91 ( 0%) wall
 life analysis         :17359.67 (92%) usr   9.10 (16%) sys17522.54 (90%) wall
 life info update      :  98.91 ( 1%) usr   0.02 ( 0%) sys  99.37 ( 1%) wall
 preprocessing         :   0.66 ( 0%) usr   0.79 ( 1%) sys   1.69 ( 0%) wall
 lexical analysis      :   0.97 ( 0%) usr   1.60 ( 3%) sys   2.58 ( 0%) wall
 parser                :  29.15 ( 0%) usr   6.72 (12%) sys  53.32 ( 0%) wall
 name lookup           :  15.01 ( 0%) usr  15.53 (27%) sys  41.34 ( 0%) wall
 expand                :  30.61 ( 0%) usr  18.70 (32%) sys  70.85 ( 0%) wall
 varconst              :   1.36 ( 0%) usr   0.33 ( 1%) sys  33.18 ( 0%) wall
 integration           :   0.87 ( 0%) usr   0.05 ( 0%) sys   4.24 ( 0%) wall
 jump                  :  88.02 ( 0%) usr   0.12 ( 0%) sys  88.96 ( 0%) wall
 CSE                   :   7.30 ( 0%) usr   0.18 ( 0%) sys   9.22 ( 0%) wall
 global CSE            :   0.33 ( 0%) usr   0.02 ( 0%) sys   0.59 ( 0%) wall
 loop analysis         :   0.65 ( 0%) usr   0.01 ( 0%) sys   0.74 ( 0%) wall
 CSE 2                 :   2.87 ( 0%) usr   0.12 ( 0%) sys   3.22 ( 0%) wall
 branch prediction     :   7.32 ( 0%) usr   0.12 ( 0%) sys   7.66 ( 0%) wall
 flow analysis         :   1.02 ( 0%) usr   0.12 ( 0%) sys   2.24 ( 0%) wall
 combiner              :   2.63 ( 0%) usr   0.10 ( 0%) sys   9.21 ( 0%) wall
 if-conversion         :   0.72 ( 0%) usr   0.05 ( 0%) sys   2.01 ( 0%) wall
 regmove               :   1.49 ( 0%) usr   0.02 ( 0%) sys   1.48 ( 0%) wall
 scheduling            :1094.16 ( 6%) usr   0.65 ( 1%) sys1111.42 ( 6%) wall
 local alloc           :  14.83 ( 0%) usr   0.16 ( 0%) sys  16.77 ( 0%) wall
 global alloc          :  47.06 ( 0%) usr   0.45 ( 1%) sys  69.51 ( 0%) wall
 reload CSE regs       :  30.11 ( 0%) usr   0.40 ( 1%) sys  48.88 ( 0%) wall
 flow 2                :   1.54 ( 0%) usr   0.11 ( 0%) sys   2.20 ( 0%) wall
 if-conversion 2       :   0.56 ( 0%) usr   0.01 ( 0%) sys   0.63 ( 0%) wall
 peephole 2            :   0.81 ( 0%) usr   0.00 ( 0%) sys   0.90 ( 0%) wall
 rename registers      :   3.36 ( 0%) usr   0.02 ( 0%) sys   3.48 ( 0%) wall
 scheduling 2          :   7.44 ( 0%) usr   0.14 ( 0%) sys  15.78 ( 0%) wall
 machine dep reorg     :   0.29 ( 0%) usr   0.00 ( 0%) sys   0.31 ( 0%) wall
 delay branch sched    :  11.82 ( 0%) usr   0.07 ( 0%) sys  20.61 ( 0%) wall
 reorder blocks        :   0.56 ( 0%) usr   0.02 ( 0%) sys   0.81 ( 0%) wall
 shorten branches      :   1.42 ( 0%) usr   0.15 ( 0%) sys   2.03 ( 0%) wall
 final                 :   3.25 ( 0%) usr   0.28 ( 0%) sys  28.63 ( 0%) wall
 symout                :   0.01 ( 0%) usr   0.03 ( 0%) sys   0.02 ( 0%) wall
 rest of compilation   :  55.63 ( 0%) usr   0.51 ( 1%) sys  92.10 ( 0%) wall
 TOTAL                 :18941.01            57.83          19525.37

As can be seen, almost all the time is taken up in life analysis.
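
For anyone who wants to rank the passes mechanically, here is a minimal
Python sketch (not part of GCC; the names parse_time_report and
hottest_pass are made up for illustration). It pulls the usr column out of
-ftime-report lines and assumes the line layout shown in this message,
including wide numbers that run into the preceding "sys" label.

```python
import re

# Matches e.g. " life analysis         :17359.67 (92%) usr ..."
# The TOTAL line has no "(NN%) usr" part and is therefore skipped.
_PASS_LINE = re.compile(
    r"^\s*(?P<name>.+?)\s*:\s*(?P<usr>\d+\.\d+)\s*\(\s*\d+%\)\s*usr"
)


def parse_time_report(text):
    """Return {pass name: usr seconds} for every per-pass line in text."""
    times = {}
    for line in text.splitlines():
        match = _PASS_LINE.match(line)
        if match:
            times[match.group("name")] = float(match.group("usr"))
    return times


def hottest_pass(text):
    """Return (pass name, usr seconds) for the most expensive pass."""
    times = parse_time_report(text)
    return max(times.items(), key=lambda item: item[1])


# Three per-pass lines and the TOTAL line copied from the report above.
SAMPLE = """\
 life analysis         :17359.67 (92%) usr   9.10 (16%) sys17522.54 (90%) wall
 expand                :  30.61 ( 0%) usr  18.70 (32%) sys  70.85 ( 0%) wall
 scheduling            :1094.16 ( 6%) usr   0.65 ( 1%) sys1111.42 ( 6%) wall
 TOTAL                 :18941.01            57.83          19525.37
"""

if __name__ == "__main__":
    print(hottest_pass(SAMPLE))
```

On the sample lines this reports life analysis as the most expensive pass.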

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

* Re: 3.3 compile time regression (22400%)
  2003-03-18 12:28 Richard Guenther
  2003-03-18 12:33 ` Karel Gardas
  2003-03-18 13:21 ` Michael Matz
@ 2003-03-18 16:35 ` Albert Chin
  2 siblings, 0 replies; 41+ messages in thread
From: Albert Chin @ 2003-03-18 16:35 UTC (permalink / raw)
  To: gcc

On Tue, Mar 18, 2003 at 01:13:52PM +0100, Richard Guenther wrote:
> Hi!
> 
> I'm experiencing huge compile time regression with todays 3.3 compared to
> 3.2 during compiling of the POOMA library. -ftime-report shows that the
> culprit is expand:
> 
> gcc-3.2:
> 
>  expand           :   0.83 ( 8%) usr   0.04 ( 8%) sys   0.94 ( 9%) wall
> 
> gcc-3.3:
> 
>  expand           :2151.80 (96%) usr  11.35 (44%) sys2181.19 (95%) wall

I'm experiencing the same problems on Solaris 9/SPARC, HP-UX 11.00 and
11i, and Redhat Linux 7.1 building LyX 1.3.0
(src/frontends/qt2/Dialogs.C). On Solaris and HP-UX, I run out of
memory. On RHL 7.1, build time takes 25 minutes.

On Solaris, -fno-default-inline brings the build time to 1 hour. It
brings the build time to 4 minutes on RHL 7.1.

-- 
albert chin (china@thewrittenword.com)

* Re: 3.3 compile time regression (22400%)
  2003-03-18 15:50     ` Karel Gardas
@ 2003-03-18 16:31       ` Richard Guenther
  0 siblings, 0 replies; 41+ messages in thread
From: Richard Guenther @ 2003-03-18 16:31 UTC (permalink / raw)
  To: Karel Gardas; +Cc: Michael Matz, gcc

On Tue, 18 Mar 2003, Karel Gardas wrote:

> On Tue, 18 Mar 2003, Richard Guenther wrote:
>
> > max-inline-insns-single      compile time
> >    100                          7.59
> >    150                          9.10
> >    200                         10.82
> >    250                         81.66
> >    300                       2237.98
> >
> > And 300 is the default... ugh.
>
> Nice table, my results look:
>
>      100                         25.64
>      150                         26.55
>      200                        151.97
>      250                        378.44
>
> and if 300 is default, then
>
>      300                       1025.44

PR c++/7086 looks like it's the same problem.

Richard.

--
Richard Guenther <richard.guenther@uni-tuebingen.de>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

* Re: 3.3 compile time regression (22400%)
  2003-03-18 15:07   ` Richard Guenther
  2003-03-18 15:25     ` Michael Matz
@ 2003-03-18 15:50     ` Karel Gardas
  2003-03-18 16:31       ` Richard Guenther
  1 sibling, 1 reply; 41+ messages in thread
From: Karel Gardas @ 2003-03-18 15:50 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Michael Matz, gcc

On Tue, 18 Mar 2003, Richard Guenther wrote:

> max-inline-insns-single      compile time
>    100                          7.59
>    150                          9.10
>    200                         10.82
>    250                         81.66
>    300                       2237.98
>
> And 300 is the default... ugh.

Nice table; my results look like this:

     100                         25.64
     150                         26.55
     200                        151.97
     250                        378.44

and if 300 is default, then

     300                       1025.44
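
To put the two tables side by side, a small Python sketch (the names
RICHARD, KAREL and slowdown are invented for this illustration) that
expresses each measurement as a multiple of the time at
max-inline-insns-single=100. The values are copied from the two tables
above; nothing else is measured.

```python
# Compile times in seconds, keyed by max-inline-insns-single.
RICHARD = {100: 7.59, 150: 9.10, 200: 10.82, 250: 81.66, 300: 2237.98}
KAREL = {100: 25.64, 150: 26.55, 200: 151.97, 250: 378.44, 300: 1025.44}


def slowdown(times, baseline=100):
    """Return {limit: time / time at baseline} for one table."""
    base = times[baseline]
    return {limit: seconds / base for limit, seconds in sorted(times.items())}


if __name__ == "__main__":
    for label, table in (("Richard", RICHARD), ("Karel", KAREL)):
        for limit, factor in slowdown(table).items():
            print(f"{label:8} {limit:4d} {factor:8.1f}x")
```

At the default of 300 the factor comes out at roughly 295x for the first
table and roughly 40x for the second.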

Cheers,

Karel
--
Karel Gardas                  kgardas@objectsecurity.com
ObjectSecurity Ltd.           http://www.objectsecurity.com




* Re: 3.3 compile time regression (22400%)
  2003-03-18 15:19       ` Karel Gardas
  2003-03-18 15:26         ` Richard Guenther
@ 2003-03-18 15:33         ` Richard Guenther
  1 sibling, 0 replies; 41+ messages in thread
From: Richard Guenther @ 2003-03-18 15:33 UTC (permalink / raw)
  To: Karel Gardas; +Cc: Michael Matz, gcc

On Tue, 18 Mar 2003, Karel Gardas wrote:

> On Tue, 18 Mar 2003, Karel Gardas wrote:
>
> > Please wait a moment, I've just found that I have deleted all gcc3.3/3.4
> > builds. I'm building right now...
> >

[snipped results showing that -fno-default-inline helps]

Just a thought: maybe this is due to libstdc++ changes and lazy use of
in-class definitions for large methods that used to be out of line in 3.2?

Can we set different inlining limits for libstdc++ only? I suspect not.

Just to limit the search: I'm only using <vector> and <iostream>, but from
the compile time distribution in my sources I'd point at <vector> as the
culprit.

Maybe this helps,

Richard.

--
Richard Guenther <richard.guenther@uni-tuebingen.de>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

* Re: 3.3 compile time regression (22400%)
  2003-03-18 15:19       ` Karel Gardas
@ 2003-03-18 15:26         ` Richard Guenther
  2003-03-18 15:33         ` Richard Guenther
  1 sibling, 0 replies; 41+ messages in thread
From: Richard Guenther @ 2003-03-18 15:26 UTC (permalink / raw)
  To: Karel Gardas; +Cc: Michael Matz, gcc

On Tue, 18 Mar 2003, Karel Gardas wrote:

> On Tue, 18 Mar 2003, Karel Gardas wrote:
>
> > Please wait a moment, I've just found that I have deleted all gcc3.3/3.4
> > builds. I'm building right now...
> >
>
> Here are results:

Nice - so it's probably the same problem and there is no need for me to
investigate further. But it makes gcc3.3 much less usable than 3.2 :/

> thinkpad:~/arch/devel/gcc33-expand-test/orb$ time c++ -ftime-report
> -fno-default-inline -I../include  -O2  -Wall -fpermissive   -DPIC -fPIC
> -c security/csiv2_impl.cc -o security/csiv2_impl.pic.o
> /home/karel/usr/local/gcc3.3.x/include/c++/3.3/bits/stl_alloc.h:652:
> warning: inline
>    function `std::allocator<_Alloc>::allocator() [with _Tp = char]' used
> but
>    never defined

Just in case anybody is wondering about these, they're a regression
against 3.2 and I filed a PR already (c++/10047).

Richard.

--
Richard Guenther <richard.guenther@uni-tuebingen.de>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

* Re: 3.3 compile time regression (22400%)
  2003-03-18 15:07   ` Richard Guenther
@ 2003-03-18 15:25     ` Michael Matz
  2003-03-18 15:09       ` Richard Guenther
  2003-03-18 15:12       ` Richard Guenther
  2003-03-18 15:50     ` Karel Gardas
  1 sibling, 2 replies; 41+ messages in thread
From: Michael Matz @ 2003-03-18 15:25 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc

Hi,

On Tue, 18 Mar 2003, Richard Guenther wrote:

> max-inline-insns-single
> Several parameters control the tree inliner used in gcc. This number sets
> the maximum number of instructions (counted in gcc's internal
> representation) in a single function that the tree inliner will consider
> for inlining. This only affects functions declared inline and methods
> implemented in a class declaration (C++). The default value is 300.

Your 3.3 branch is much too old.  This was fixed/introduced on 2003-03-06.


Ciao,
Michael.

* Re: 3.3 compile time regression (22400%)
  2003-03-18 15:12       ` Richard Guenther
@ 2003-03-18 15:20         ` Michael Matz
  0 siblings, 0 replies; 41+ messages in thread
From: Michael Matz @ 2003-03-18 15:20 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc

Hi,

On Tue, 18 Mar 2003, Richard Guenther wrote:

> I also do not see differences in the current 3.3 sources - in fact,
> params.def reads:
>
> DEFPARAM (PARAM_MAX_INLINE_INSNS_SINGLE,
>           "max-inline-insns-single",

% cvs status params.def
File: params.def        Status: Up-to-date

   Working revision:    1.18.2.2
   Repository revision: 1.18.2.2        /cvs/gcc/gcc/gcc/params.def,v
   Sticky Tag:          gcc-3_3-branch (branch: 1.18.2)
% grep max-inline params.def
   "max-inline-insns" parameter) is exceeded, the acceptable size
          "max-inline-insns-single",
   class declaration in C++) given by the "max-inline-insns-single"
          "max-inline-insns-auto",
   This is done by a linear function, see "max-inline-slope" parameter.
   function limit (set by the "max-inline-insns-single" parameter) or
          "max-inline-insns",
   "max-inline-insns" parameter), a linear function is used to
          "max-inline-slope",
          "max-inline-insns-rtl",


Ciao,
Michael.

* Re: 3.3 compile time regression (22400%)
  2003-03-18 15:09       ` Richard Guenther
@ 2003-03-18 15:19         ` Michael Matz
  0 siblings, 0 replies; 41+ messages in thread
From: Michael Matz @ 2003-03-18 15:19 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc

Hi,

On Tue, 18 Mar 2003, Richard Guenther wrote:

> It is not (its from today), this is from the online-documentation.

Ahh, I see.  In that case, the current documentation is below.  But your
chart already showed the effect quite nicely.  OTOH, if it's really the
same problem Karel is seeing, fiddling with the inlining defaults would
just paper over the problem, but it might be enough for you.  Simply set
the parameter to a value where compilation is fast enough for you.  I'm
currently looking into Karel's problem, but I'm not sure I'll come up
with something; it doesn't look easy after a first look.
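
As a rough illustration of picking a value that is fast enough, here is a
Python sketch.  pick_limit and param_flag are hypothetical helper names,
the table is the chart quoted earlier in this thread, and the option is
written in GCC's usual "--param name=value" form.

```python
# Measured compile times in seconds per max-inline-insns-single value
# (numbers from the chart quoted in this thread).
CHART = {100: 7.59, 150: 9.10, 200: 10.82, 250: 81.66, 300: 2237.98}


def pick_limit(chart, budget_seconds):
    """Largest measured limit whose compile time fits the budget, or None."""
    fitting = [limit for limit, secs in chart.items() if secs <= budget_seconds]
    return max(fitting) if fitting else None


def param_flag(limit):
    """Format the limit as a GCC --param option."""
    return f"--param max-inline-insns-single={limit}"


if __name__ == "__main__":
    limit = pick_limit(CHART, budget_seconds=60.0)
    print(param_flag(limit))
```

With a budget of 60 seconds this selects 200, the largest measured value
that stays under the budget.  It only chooses among values that were
actually measured and does not interpolate between them.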


Ciao,
Michael.


@item max-inline-insns-single
Several parameters control the tree inliner used in gcc.
This number sets the maximum number of instructions (counted in gcc's
internal representation) in a single function that the tree inliner
will consider for inlining.  This only affects functions declared
inline and methods implemented in a class declaration (C++).
The default value is 300.

@item max-inline-insns-auto
When you use @option{-finline-functions} (included in @option{-O3}),
a lot of functions that would otherwise not be considered for inlining
by the compiler will be investigated.  To those functions, a different
(more restrictive) limit compared to functions declared inline can
be applied.
The default value is 300.

@item max-inline-insns
The tree inliner does decrease the allowable size for single functions
to be inlined after we already inlined the number of instructions
given here by repeated inlining.  This number should be a factor of
two or more larger than the single function limit.
Higher numbers result in better runtime performance, but incur higher
compile-time resource (CPU time, memory) requirements and result in
larger binaries.  Very high values are not advisable, as too large
binaries may adversely affect runtime performance.
The default value is 600.

@item max-inline-slope
After exceeding the maximum number of inlined instructions by repeated
inlining, a linear function is used to decrease the allowable size
for single functions.  The slope of that function is the negative
reciprocal of the number specified here.
The default value is 32.
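
Read literally, the last two items describe a limit that stays flat up to
max-inline-insns and then falls off linearly.  A minimal Python sketch of
that reading follows.  It is derived only from the documentation text
above, not from the inliner's source, so the integer rounding, the clamp
at zero, and the function name allowed_single_size are assumptions.

```python
def allowed_single_size(inlined_so_far, max_single=300, max_insns=600, slope=32):
    """Per-function size limit after inlined_so_far instructions were inlined.

    Defaults are the documented ones: max-inline-insns-single=300,
    max-inline-insns=600, max-inline-slope=32.
    """
    if inlined_so_far <= max_insns:
        # Below the repeated-inlining threshold the full limit applies.
        return max_single
    excess = inlined_so_far - max_insns
    # Slope is the negative reciprocal of max-inline-slope: the limit
    # drops by one instruction for every `slope` instructions of excess.
    # Clamping at zero is an assumption of this sketch.
    return max(0, max_single - excess // slope)


if __name__ == "__main__":
    for inlined in (0, 600, 632, 3800, 10200):
        print(inlined, allowed_single_size(inlined))
```

Under these assumptions and with the defaults, going 3200 instructions
past the 600 mark lowers the per-function limit from 300 to 200.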


* Re: 3.3 compile time regression (22400%)
  2003-03-18 14:50     ` Karel Gardas
@ 2003-03-18 15:19       ` Karel Gardas
  2003-03-18 15:26         ` Richard Guenther
  2003-03-18 15:33         ` Richard Guenther
  0 siblings, 2 replies; 41+ messages in thread
From: Karel Gardas @ 2003-03-18 15:19 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Michael Matz, gcc

On Tue, 18 Mar 2003, Karel Gardas wrote:

> Please wait a moment, I've just found that I have deleted all gcc3.3/3.4
> builds. I'm building right now...
>

Here are results:

thinkpad:~/arch/devel/gcc33-expand-test/orb$ c++ -v
Reading specs from
/home/karel/usr/local/gcc3.3.x/lib/gcc-lib/i686-pc-linux-gnu/3.3/specs
Configured with: /home/karel/cvs/gcc/gcc-3_3-branch/configure
--prefix=/home/karel/usr/local/gcc3.3.x --enable-shared --enable-threads
--enable-languages=c++ --disable-checking --enable-__cxa_atexit
Thread model: posix
gcc version 3.3 20030318 (prerelease)
thinkpad:~/arch/devel/gcc33-expand-test/orb$


thinkpad:~/arch/devel/gcc33-expand-test/orb$ time c++ -ftime-report
-I../include  -O2  -Wall -fpermissive   -DPIC -fPIC  -c
security/csiv2_impl.cc -o security/csiv2_impl.pic.o

Execution times (seconds)
 garbage collection    :   7.79 ( 1%) usr   0.17 ( 2%) sys   9.50 ( 1%) wall
 cfg construction      :   2.61 ( 0%) usr   0.12 ( 1%) sys   3.00 ( 0%) wall
 cfg cleanup           :  25.80 ( 3%) usr   0.06 ( 1%) sys  27.38 ( 2%) wall
 trivially dead code   :   1.84 ( 0%) usr   0.02 ( 0%) sys   2.00 ( 0%) wall
 life analysis         :   2.25 ( 0%) usr   0.13 ( 1%) sys   2.12 ( 0%) wall
 life info update      :   1.02 ( 0%) usr   0.06 ( 1%) sys   1.25 ( 0%) wall
 preprocessing         :   0.37 ( 0%) usr   0.11 ( 1%) sys   1.25 ( 0%) wall
 lexical analysis      :   0.42 ( 0%) usr   0.13 ( 1%) sys   0.12 ( 0%) wall
 parser                :  11.90 ( 1%) usr   0.51 ( 5%) sys  11.38 ( 1%) wall
 name lookup           :   3.15 ( 0%) usr   0.47 ( 5%) sys   4.12 ( 0%) wall
 expand                : 919.40 (90%) usr   2.53 (26%) sys 975.88 (87%) wall
 varconst              :   0.13 ( 0%) usr   0.01 ( 0%) sys   0.00 ( 0%) wall
 integration           :   1.03 ( 0%) usr   0.01 ( 0%) sys   1.00 ( 0%) wall
 jump                  :  10.39 ( 1%) usr   0.54 ( 6%) sys  23.62 ( 2%) wall
 CSE                   :   5.11 ( 0%) usr   0.11 ( 1%) sys   4.75 ( 0%) wall
 global CSE            :   4.18 ( 0%) usr   1.03 (11%) sys  14.12 ( 1%) wall
 loop analysis         :   7.54 ( 1%) usr   3.04 (31%) sys  11.00 ( 1%) wall
 CSE 2                 :   1.54 ( 0%) usr   0.06 ( 1%) sys   1.75 ( 0%) wall
 branch prediction     :   1.40 ( 0%) usr   0.06 ( 1%) sys   3.00 ( 0%) wall
 flow analysis         :   0.52 ( 0%) usr   0.06 ( 1%) sys   0.25 ( 0%) wall
 combiner              :   1.28 ( 0%) usr   0.06 ( 1%) sys   2.12 ( 0%) wall
 if-conversion         :   0.26 ( 0%) usr   0.00 ( 0%) sys   0.50 ( 0%) wall
 regmove               :   1.11 ( 0%) usr   0.01 ( 0%) sys   1.38 ( 0%) wall
 mode switching        :   0.19 ( 0%) usr   0.00 ( 0%) sys   0.38 ( 0%) wall
 local alloc           :   1.71 ( 0%) usr   0.21 ( 2%) sys   1.88 ( 0%) wall
 global alloc          :   2.82 ( 0%) usr   0.05 ( 1%) sys   2.88 ( 0%) wall
 reload CSE regs       :   2.37 ( 0%) usr   0.09 ( 1%) sys   2.25 ( 0%) wall
 flow 2                :   0.57 ( 0%) usr   0.03 ( 0%) sys   1.50 ( 0%) wall
 if-conversion 2       :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.25 ( 0%) wall
 peephole 2            :   0.14 ( 0%) usr   0.00 ( 0%) sys   0.38 ( 0%) wall
 rename registers      :   0.45 ( 0%) usr   0.00 ( 0%) sys   0.38 ( 0%) wall
 scheduling 2          :   1.42 ( 0%) usr   0.04 ( 0%) sys   2.00 ( 0%) wall
 reorder blocks        :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 shorten branches      :   0.35 ( 0%) usr   0.00 ( 0%) sys   0.50 ( 0%) wall
 final                 :   0.44 ( 0%) usr   0.01 ( 0%) sys   0.62 ( 0%) wall
 symout                :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 rest of compilation   :   3.77 ( 0%) usr   0.04 ( 0%) sys   4.00 ( 0%) wall
 TOTAL                 :1025.44             9.77          1118.75

real    18m39.924s
user    17m6.210s
sys     0m9.860s


and now with -fno-default-inline


thinkpad:~/arch/devel/gcc33-expand-test/orb$ time c++ -ftime-report
-fno-default-inline -I../include  -O2  -Wall -fpermissive   -DPIC -fPIC
-c security/csiv2_impl.cc -o security/csiv2_impl.pic.o
/home/karel/usr/local/gcc3.3.x/include/c++/3.3/bits/stl_alloc.h:652:
warning: inline
   function `std::allocator<_Alloc>::allocator() [with _Tp = char]' used
but
   never defined
/home/karel/usr/local/gcc3.3.x/include/c++/3.3/bits/stl_alloc.h:656:
warning: inline
   function `std::allocator<_Alloc>::~allocator() [with _Tp = char]' used
but
   never defined
/home/karel/usr/local/gcc3.3.x/include/c++/3.3/bits/stl_alloc.h:656:
warning: inline
   function `std::allocator<_Alloc>::~allocator() [with _Tp = char]' used
but
   never defined
/home/karel/usr/local/gcc3.3.x/include/c++/3.3/bits/stl_alloc.h:388:
warning: inline
   function `static void* std::__default_alloc_template<__threads,
   __inst>::allocate(unsigned int) [with bool __threads = true, int __inst
= 0]
   ' used but never defined
/home/karel/usr/local/gcc3.3.x/include/c++/3.3/bits/stl_alloc.h:429:
warning: inline
   function `static void std::__default_alloc_template<__threads,
   __inst>::deallocate(void*, unsigned int) [with bool __threads = true,
int
   __inst = 0]' used but never defined
/home/karel/usr/local/gcc3.3.x/include/c++/3.3/bits/basic_string.h:325:
warning: inline
   function `static std::basic_string<_CharT, _Traits, _Alloc>::_Rep&
   std::basic_string<_CharT, _Traits, _Alloc>::_S_empty_rep() [with _CharT
=
   char, _Traits = std::char_traits<char>, _Alloc = std::allocator<char>]'
used
   but never defined
/home/karel/usr/local/gcc3.3.x/include/c++/3.3/bits/basic_string.h:355:
warning: inline
   function `std::basic_string<_CharT, _Traits, _Alloc>::~basic_string()
[with
   _CharT = char, _Traits = std::char_traits<char>, _Alloc =
   std::allocator<char>]' used but never defined
/home/karel/usr/local/gcc3.3.x/include/c++/3.3/bits/basic_string.h:358:
warning: inline
   function `std::basic_string<_CharT, _Traits, _Alloc>&
   std::basic_string<_CharT, _Traits, _Alloc>::operator=(const
   std::basic_string<_CharT, _Traits, _Alloc>&) [with _CharT = char,
_Traits =
   std::char_traits<char>, _Alloc = std::allocator<char>]' used but never
   defined
/home/karel/usr/local/gcc3.3.x/include/c++/3.3/bits/basic_string.h:361:
warning: inline
   function `std::basic_string<_CharT, _Traits, _Alloc>&
   std::basic_string<_CharT, _Traits, _Alloc>::operator=(const _CharT*)
[with
   _CharT = char, _Traits = std::char_traits<char>, _Alloc =
   std::allocator<char>]' used but never defined
/home/karel/usr/local/gcc3.3.x/include/c++/3.3/bits/basic_string.h:411:
warning: inline
   function `typename _Alloc::size_type std::basic_string<_CharT, _Traits,
   _Alloc>::length() const [with _CharT = char, _Traits =
   std::char_traits<char>, _Alloc = std::allocator<char>]' used but never
   defined
/home/karel/usr/local/gcc3.3.x/include/c++/3.3/bits/basic_string.h:441:
warning: inline
   function `typename _Alloc::reference std::basic_string<_CharT, _Traits,
   _Alloc>::operator[](typename _Alloc::size_type) [with _CharT = char,
_Traits
   = std::char_traits<char>, _Alloc = std::allocator<char>]' used but
never
   defined
/home/karel/usr/local/gcc3.3.x/include/c++/3.3/bits/basic_string.h:471:
warning: inline
   function `std::basic_string<_CharT, _Traits, _Alloc>&
   std::basic_string<_CharT, _Traits, _Alloc>::operator+=(_CharT) [with
_CharT
   = char, _Traits = std::char_traits<char>, _Alloc =
std::allocator<char>]'
   used but never defined
/home/karel/usr/local/gcc3.3.x/include/c++/3.3/bits/basic_string.h:801:
warning: inline
   function `const _CharT* std::basic_string<_CharT, _Traits,
_Alloc>::c_str()
   const [with _CharT = char, _Traits = std::char_traits<char>, _Alloc =
   std::allocator<char>]' used but never defined
/home/karel/usr/local/gcc3.3.x/include/c++/3.3/bits/basic_string.h:911:
warning: inline
   function `int std::basic_string<_CharT, _Traits, _Alloc>::compare(const
   std::basic_string<_CharT, _Traits, _Alloc>&) const [with _CharT = char,
   _Traits = std::char_traits<char>, _Alloc = std::allocator<char>]' used
but
   never defined
/home/karel/usr/local/gcc3.3.x/include/c++/3.3/bits/basic_string.h:215:
warning: inline
   function `_CharT* std::basic_string<_CharT, _Traits,
   _Alloc>::_Rep::_M_refcopy() [with _CharT = char, _Traits =
   std::char_traits<char>, _Alloc = std::allocator<char>]' used but never
   defined
/home/karel/usr/local/gcc3.3.x/include/c++/3.3/bits/basic_string.h:228:
warning: inline
   function `std::basic_string<_CharT, _Traits,
   _Alloc>::_Alloc_hider::_Alloc_hider(_CharT*, const _Alloc&) [with
_CharT =
   char, _Traits = std::char_traits<char>, _Alloc = std::allocator<char>]'
used
   but never defined
/home/karel/usr/local/gcc3.3.x/include/c++/3.3/bits/basic_ios.h:347:
warning: inline
   function `_CharT std::basic_ios<_CharT, _Traits>::fill(_CharT) [with
_CharT
   = char, _Traits = std::char_traits<char>]' used but never defined
/home/karel/usr/local/gcc3.3.x/include/c++/3.3/ostream:178: warning:
inline
   function `std::basic_ostream<_CharT, _Traits>&
std::basic_ostream<_CharT,
   _Traits>::operator<<(short int) [with _CharT = char, _Traits =
   std::char_traits<char>]' used but never defined
/home/karel/usr/local/gcc3.3.x/include/c++/3.3/ostream:189: warning:
inline
   function `std::basic_ostream<_CharT, _Traits>&
std::basic_ostream<_CharT,
   _Traits>::operator<<(short unsigned int) [with _CharT = char, _Traits =
   std::char_traits<char>]' used but never defined

Execution times (seconds)
 garbage collection    :   1.45 ( 4%) usr   0.00 ( 0%) sys   1.62 ( 4%) wall
 cfg construction      :   0.55 ( 1%) usr   0.01 ( 1%) sys   0.62 ( 1%) wall
 cfg cleanup           :   0.43 ( 1%) usr   0.02 ( 1%) sys   0.38 ( 1%) wall
 trivially dead code   :   0.57 ( 1%) usr   0.01 ( 1%) sys   0.25 ( 1%) wall
 life analysis         :   1.89 ( 5%) usr   0.00 ( 0%) sys   1.38 ( 3%) wall
 life info update      :   0.58 ( 1%) usr   0.00 ( 0%) sys   0.50 ( 1%) wall
 preprocessing         :   0.42 ( 1%) usr   0.06 ( 3%) sys   1.25 ( 3%) wall
 lexical analysis      :   0.43 ( 1%) usr   0.07 ( 4%) sys   0.50 ( 1%) wall
 parser                :  12.19 (29%) usr   0.57 (32%) sys  14.12 (32%) wall
 name lookup           :   3.03 ( 7%) usr   0.52 (30%) sys   2.75 ( 6%) wall
 expand                :   2.46 ( 6%) usr   0.04 ( 2%) sys   3.12 ( 7%) wall
 varconst              :   0.15 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall
 integration           :   0.36 ( 1%) usr   0.00 ( 0%) sys   0.25 ( 1%) wall
 jump                  :   0.91 ( 2%) usr   0.09 ( 5%) sys   1.62 ( 4%) wall
 CSE                   :   2.33 ( 6%) usr   0.02 ( 1%) sys   2.38 ( 5%) wall
 global CSE            :   1.17 ( 3%) usr   0.04 ( 2%) sys   0.75 ( 2%) wall
 loop analysis         :   0.20 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 CSE 2                 :   0.76 ( 2%) usr   0.01 ( 1%) sys   0.88 ( 2%) wall
 branch prediction     :   0.96 ( 2%) usr   0.01 ( 1%) sys   1.50 ( 3%) wall
 flow analysis         :   0.11 ( 0%) usr   0.02 ( 1%) sys   0.12 ( 0%) wall
 combiner              :   0.63 ( 2%) usr   0.00 ( 0%) sys   0.50 ( 1%) wall
 if-conversion         :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 regmove               :   0.27 ( 1%) usr   0.00 ( 0%) sys   0.25 ( 1%) wall
 mode switching        :   0.17 ( 0%) usr   0.01 ( 1%) sys   0.12 ( 0%) wall
 local alloc           :   1.03 ( 2%) usr   0.03 ( 2%) sys   0.88 ( 2%) wall
 global alloc          :   1.89 ( 5%) usr   0.02 ( 1%) sys   1.75 ( 4%) wall
 reload CSE regs       :   1.33 ( 3%) usr   0.04 ( 2%) sys   1.62 ( 4%) wall
 flow 2                :   0.25 ( 1%) usr   0.02 ( 1%) sys   0.75 ( 2%) wall
 if-conversion 2       :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 peephole 2            :   0.20 ( 0%) usr   0.02 ( 1%) sys   0.50 ( 1%) wall
 rename registers      :   0.65 ( 2%) usr   0.00 ( 0%) sys   0.50 ( 1%) wall
 scheduling 2          :   1.75 ( 4%) usr   0.04 ( 2%) sys   2.25 ( 5%) wall
 reorder blocks        :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall
 shorten branches      :   0.29 ( 1%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 final                 :   0.81 ( 2%) usr   0.02 ( 1%) sys   0.75 ( 2%) wall
 symout                :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall
 rest of compilation   :   0.91 ( 2%) usr   0.06 ( 3%) sys   0.50 ( 1%) wall
 TOTAL                 :  41.42             1.76            44.75

real    0m45.857s
user    0m42.460s
sys     0m1.850s



Which is much better IMHO. FYI, here is how gcc3.2.2 looks:


thinkpad:~/arch/devel/gcc33-expand-test/orb$ time c++ -ftime-report
-I../include  -O2  -Wall -fpermissive   -DPIC -fPIC  -c
security/csiv2_impl.cc -o security/csiv2_impl.pic.o

Execution times (seconds)
 garbage collection    :   7.56 (21%) usr   0.01 ( 1%) sys   8.12 (21%) wall
 cfg construction      :   0.42 ( 1%) usr   0.00 ( 0%) sys   1.00 ( 3%) wall
 cfg cleanup           :   0.33 ( 1%) usr   0.00 ( 0%) sys   0.38 ( 1%) wall
 life analysis         :   0.99 ( 3%) usr   0.02 ( 3%) sys   0.88 ( 2%) wall
 life info update      :   0.16 ( 0%) usr   0.00 ( 0%) sys   0.25 ( 1%) wall
 preprocessing         :   0.37 ( 1%) usr   0.10 (14%) sys   0.38 ( 1%) wall
 lexical analysis      :   0.52 ( 1%) usr   0.18 (25%) sys   1.00 ( 3%) wall
 parser                :  11.44 (32%) usr   0.27 (38%) sys  11.75 (31%) wall
 expand                :   2.18 ( 6%) usr   0.02 ( 3%) sys   2.00 ( 5%) wall
 varconst              :   0.19 ( 1%) usr   0.00 ( 0%) sys   0.62 ( 2%) wall
 integration           :   0.58 ( 2%) usr   0.02 ( 3%) sys   0.25 ( 1%) wall
 jump                  :   0.43 ( 1%) usr   0.01 ( 1%) sys   0.25 ( 1%) wall
 CSE                   :   3.97 (11%) usr   0.00 ( 0%) sys   3.62 (10%) wall
 global CSE            :   0.41 ( 1%) usr   0.00 ( 0%) sys   0.50 ( 1%) wall
 loop analysis         :   0.23 ( 1%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall
 CSE 2                 :   1.44 ( 4%) usr   0.00 ( 0%) sys   1.75 ( 5%) wall
 flow analysis         :   0.16 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 combiner              :   0.43 ( 1%) usr   0.02 ( 3%) sys   0.75 ( 2%) wall
 regmove               :   0.18 ( 1%) usr   0.00 ( 0%) sys   0.38 ( 1%) wall
 mode switching        :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall
 local alloc           :   0.41 ( 1%) usr   0.00 ( 0%) sys   0.50 ( 1%) wall
 global alloc          :   0.75 ( 2%) usr   0.01 ( 1%) sys   0.50 ( 1%) wall
 reload CSE regs       :   0.58 ( 2%) usr   0.00 ( 0%) sys   1.25 ( 3%) wall
 flow 2                :   0.13 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall
 if-conversion 2       :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 peephole 2            :   0.11 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall
 rename registers      :   0.27 ( 1%) usr   0.00 ( 0%) sys   0.25 ( 1%) wall
 scheduling 2          :   0.65 ( 2%) usr   0.04 ( 6%) sys   0.62 ( 2%) wall
 reorder blocks        :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 shorten branches      :   0.09 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 final                 :   0.26 ( 1%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall
 symout                :   0.00 ( 0%) usr   0.01 ( 1%) sys   0.00 ( 0%) wall
 rest of compilation   :   0.49 ( 1%) usr   0.00 ( 0%) sys   0.25 ( 1%) wall
 TOTAL                 :  35.90             0.71            37.88

real    0m38.234s
user    0m36.180s
sys     0m0.750s
thinkpad:~/arch/devel/gcc33-expand-test/orb$


thinkpad:~/arch/devel/gcc33-expand-test/orb$ c++ -v
Reading specs from
/home/karel/usr/local/gcc3.2.2/lib/gcc-lib/i686-pc-linux-gnu/3.2.2/specs
Configured with: ../gcc-3_2-branch/configure
--prefix=/home/karel/usr/local/gcc3.2.2 --enable-shared --enable-threads
--enable-languages=c++ --enable-__cxa_atexit
Thread model: posix
gcc version 3.2.2
thinkpad:~/arch/devel/gcc33-expand-test/orb$


Cheers,

Karel
--
Karel Gardas                  kgardas@objectsecurity.com
ObjectSecurity Ltd.           http://www.objectsecurity.com

* Re: 3.3 compile time regression (22400%)
  2003-03-18 15:25     ` Michael Matz
  2003-03-18 15:09       ` Richard Guenther
@ 2003-03-18 15:12       ` Richard Guenther
  2003-03-18 15:20         ` Michael Matz
  1 sibling, 1 reply; 41+ messages in thread
From: Richard Guenther @ 2003-03-18 15:12 UTC (permalink / raw)
  To: Michael Matz; +Cc: gcc

On Tue, 18 Mar 2003, Michael Matz wrote:

> Hi,
>
> On Tue, 18 Mar 2003, Richard Guenther wrote:
>
> > max-inline-insns-single
> > Several parameters control the tree inliner used in gcc. This number sets
> > the maximum number of instructions (counted in gcc's internal
> > representation) in a single function that the tree inliner will consider
> > for inlining. This only affects functions declared inline and methods
> > implemented in a class declaration (C++). The default value is 300.
>
> Your 3.3 branch is much too old.  This was fixed/introduced on 2003-03-06.

I also do not see differences in the current 3.3 sources - in fact,
params.def reads:

/* The single function inlining limit. This is the maximum size
   of a function counted in internal gcc instructions (not in
   real machine instructions) that is eligible for inlining
   by the tree inliner.
   The default value is 300.
   Only functions marked inline (or methods defined in the class
   definition for C++) are affected by this, unless you set the
   -finline-functions (included in -O3) compiler option.
   There are more restrictions to inlining: If inlined functions
   call other functions, the already inlined instructions are
   counted and once the recursive inline limit (see
   "max-inline-insns" parameter) is exceeded, the acceptable size
   gets decreased.  */
DEFPARAM (PARAM_MAX_INLINE_INSNS_SINGLE,
          "max-inline-insns-single",
          "The maximum number of instructions in a single function eligible for inlining",
          300)

Richard.

--
Richard Guenther <richard.guenther@uni-tuebingen.de>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 3.3 compile time regression (22400%)
  2003-03-18 15:25     ` Michael Matz
@ 2003-03-18 15:09       ` Richard Guenther
  2003-03-18 15:19         ` Michael Matz
  2003-03-18 15:12       ` Richard Guenther
  1 sibling, 1 reply; 41+ messages in thread
From: Richard Guenther @ 2003-03-18 15:09 UTC (permalink / raw)
  To: Michael Matz; +Cc: gcc

On Tue, 18 Mar 2003, Michael Matz wrote:

> Hi,
>
> On Tue, 18 Mar 2003, Richard Guenther wrote:
>
> > max-inline-insns-single
> > Several parameters control the tree inliner used in gcc. This number sets
> > the maximum number of instructions (counted in gcc's internal
> > representation) in a single function that the tree inliner will consider
> > for inlining. This only affects functions declared inline and methods
> > implemented in a class declaration (C++). The default value is 300.
>
> Your 3.3 branch is much too old.  This was fixed/introduced on 2003-03-06.

It is not (it's from today); this is from the online documentation. I do
not have a sufficiently recent makeinfo to build the documentation myself.

Sorry if I caused confusion this way,

Richard.

--
Richard Guenther <richard.guenther@uni-tuebingen.de>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 3.3 compile time regression (22400%)
  2003-03-18 13:21 ` Michael Matz
  2003-03-18 14:25   ` Richard Guenther
@ 2003-03-18 15:07   ` Richard Guenther
  2003-03-18 15:25     ` Michael Matz
  2003-03-18 15:50     ` Karel Gardas
  1 sibling, 2 replies; 41+ messages in thread
From: Richard Guenther @ 2003-03-18 15:07 UTC (permalink / raw)
  To: Michael Matz; +Cc: gcc

On Tue, 18 Mar 2003, Michael Matz wrote:

> Hi Richard,
>
> On Tue, 18 Mar 2003, Richard Guenther wrote:
>
> > The picture does change if I specify -fno-default-inline,
>
> This probably means that the in-class definitions of functions are the
> culprit (they are equivalent to being declared inline).  You might
> want to play with "--param bla=value", where "bla" is
> max-inline-insns-single (300) and max-inline-insns-auto (300).  The
> numbers are the default.  Try to make them smaller to limit the inlining.
> The first parameter should apply to these functions, but I'm not
> sure it does.

From the docs I read

max-inline-insns-single
Several parameters control the tree inliner used in gcc. This number sets
the maximum number of instructions (counted in gcc's internal
representation) in a single function that the tree inliner will consider
for inlining. This only affects functions declared inline and methods
implemented in a class declaration (C++). The default value is 300.

so I suspect gcc cannot distinguish between inline and non-inline methods
defined in a class :( In fact, passing --param max-inline-insns-auto=0
doesn't change compile time.
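
A minimal sketch of how such runs could be scripted, assuming a POSIX shell.
The helper names (inline_cmd, timed_compile), the compiler name g++ and the
file testcase.ii are placeholders for illustration; only the optimization
flags and the two --param names are taken from this thread.

```sh
# Build the compile command for the given inliner limits.
# $1 = compiler, $2 = preprocessed source,
# $3 = max-inline-insns-single, $4 = max-inline-insns-auto.
# All four values are caller-supplied placeholders.
inline_cmd() {
    printf '%s -O2 -fomit-frame-pointer -funroll-loops' "$1"
    printf ' --param max-inline-insns-single=%s' "$3"
    printf ' --param max-inline-insns-auto=%s' "$4"
    printf ' -c %s -o /dev/null\n' "$2"
}

# Run one compile and report the elapsed wall-clock seconds.
# `date +%s` has one-second resolution, which is enough for a rough comparison.
timed_compile() {
    cmd=$(inline_cmd "$@")
    start=$(date +%s)
    $cmd || return 1   # relies on word splitting; paths must not contain spaces
    end=$(date +%s)
    echo "single=$3 auto=$4: $((end - start))s"
}

# Example (not run here):
#   for n in 100 150 200 250 300; do timed_compile g++ testcase.ii "$n" 300; done
```

Keeping the command builder separate from the timing wrapper makes it easy to
print the exact command line when reporting results.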

Interesting though is the max-inline-insns-single vs. compile time chart:

max-inline-insns-single      compile time
   100                          7.59
   150                          9.10
   200                         10.82
   250                         81.66
   300                       2237.98

And 300 is the default... ugh.
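
To make the jump easier to see, here is a small sketch that prints the
slowdown factor between consecutive rows of the chart above. The function
name growth_ratios is made up for illustration; the numbers are the ones
from the chart.

```sh
# Print the slowdown factor between consecutive rows of
# "limit seconds" pairs read from standard input.
growth_ratios() {
    awk 'NR > 1 { printf "%s -> %s: x%.2f\n", prev_limit, $1, $2 / prev_time }
         { prev_limit = $1; prev_time = $2 }'
}

# Feed in the measurements from the chart.
growth_ratios <<'EOF'
100 7.59
150 9.10
200 10.82
250 81.66
300 2237.98
EOF
```

The first two steps cost roughly 1.2x each, the step from 200 to 250 about
7.5x, and the step from 250 to 300 about 27x, so the growth is far from
linear in the limit.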

Richard.

--
Richard Guenther <richard.guenther@uni-tuebingen.de>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 3.3 compile time regression (22400%)
  2003-03-18 14:25   ` Richard Guenther
@ 2003-03-18 14:50     ` Karel Gardas
  2003-03-18 15:19       ` Karel Gardas
  0 siblings, 1 reply; 41+ messages in thread
From: Karel Gardas @ 2003-03-18 14:50 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Michael Matz, gcc

On Tue, 18 Mar 2003, Richard Guenther wrote:

> > Also you might want to create a profilable gcc, and find the hotspot
> > functions.  expand really shouldn't take _that_ much more time than all
> > the other passes, even if inlining goes crazy (because normally also some
> > other passes should take much time then).
>
> Yes, but first I'll wait for Karel to report whether specifying
> -fno-default-inline helps for him.

Please wait a moment, I've just found that I have deleted all gcc3.3/3.4
builds. I'm building right now...

Karel
--
Karel Gardas                  kgardas@objectsecurity.com
ObjectSecurity Ltd.           http://www.objectsecurity.com

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 3.3 compile time regression (22400%)
  2003-03-18 13:22   ` Michael Matz
@ 2003-03-18 14:43     ` Karel Gardas
  0 siblings, 0 replies; 41+ messages in thread
From: Karel Gardas @ 2003-03-18 14:43 UTC (permalink / raw)
  To: Michael Matz; +Cc: Richard Guenther, gcc

On Tue, 18 Mar 2003, Michael Matz wrote:

> Hi Karel,
>
> On Tue, 18 Mar 2003, Karel Gardas wrote:
>
> > http://gcc.gnu.org/ml/gcc/2003-03/msg00083.html - flat profile
> > provided (cc1plus)
>
> Do you still have the call graph profile?  I would be interested to see
> the call structure for the things involving the fixup_var_refs() stuff (no
> need for the full tree).

I only have the complete text file that was the output of running
gprof ./cc1plus. Is that what you need? I have one for both gcc3.3 and gcc3.4.

Let me know, and I'll send them to you privately.

Cheers,

Karel
--
Karel Gardas                  kgardas@objectsecurity.com
ObjectSecurity Ltd.           http://www.objectsecurity.com

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 3.3 compile time regression (22400%)
  2003-03-18 13:21 ` Michael Matz
@ 2003-03-18 14:25   ` Richard Guenther
  2003-03-18 14:50     ` Karel Gardas
  2003-03-18 15:07   ` Richard Guenther
  1 sibling, 1 reply; 41+ messages in thread
From: Richard Guenther @ 2003-03-18 14:25 UTC (permalink / raw)
  To: Michael Matz; +Cc: gcc

On Tue, 18 Mar 2003, Michael Matz wrote:

> Hi Richard,
>
> On Tue, 18 Mar 2003, Richard Guenther wrote:
>
> > The picture does change if I specify -fno-default-inline,
>
> This probably means that the in-class definitions of functions are the
> culprit (they are equivalent to being declared inline).  You might
> want to play with "--param bla=value", where "bla" is
> max-inline-insns-single (300) and max-inline-insns-auto (300).  The
> numbers are the default.  Try to make them smaller to limit the inlining.
> The first parameter should apply to these functions, but I'm not
> sure it does.

Heh - I usually use --param max-inline-insns=10000000 to get good code
output from gcc..., but I'll try.

> Also you might want to create a profilable gcc, and find the hotspot
> functions.  expand really shouldn't take _that_ much more time than all
> the other passes, even if inlining goes crazy (because normally also some
> other passes should take much time then).

Yes, but first I'll wait for Karel to report whether specifying
-fno-default-inline helps for him.

Richard.

--
Richard Guenther <richard.guenther@uni-tuebingen.de>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 3.3 compile time regression (22400%)
  2003-03-18 12:33 ` Karel Gardas
@ 2003-03-18 13:22   ` Michael Matz
  2003-03-18 14:43     ` Karel Gardas
  0 siblings, 1 reply; 41+ messages in thread
From: Michael Matz @ 2003-03-18 13:22 UTC (permalink / raw)
  To: Karel Gardas; +Cc: Richard Guenther, gcc

Hi Karel,

On Tue, 18 Mar 2003, Karel Gardas wrote:

> http://gcc.gnu.org/ml/gcc/2003-03/msg00083.html - flat profile
> provided (cc1plus)

Do you still have the call graph profile?  I would be interested to see
the call structure for the things involving the fixup_var_refs() stuff (no
need for the full tree).
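
In case it helps, a sketch of how just that part of the call graph could be
extracted with GNU gprof, assuming the profiled cc1plus binary and its
gmon.out are still around. The helper name callgraph_for and the file names
are placeholders; -b suppresses the explanatory text, and -q followed by a
symbol name restricts the call graph to that symbol and its children.

```sh
# Print only the call-graph entries around one function.
# $1 = profiled binary, $2 = profile data file, $3 = function name.
callgraph_for() {
    gprof -b -q"$3" "$1" "$2"
}

# Example (not run here):
#   callgraph_for ./cc1plus gmon.out fixup_var_refs > fixup_var_refs.callgraph.txt
```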


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 3.3 compile time regression (22400%)
  2003-03-18 12:28 Richard Guenther
  2003-03-18 12:33 ` Karel Gardas
@ 2003-03-18 13:21 ` Michael Matz
  2003-03-18 14:25   ` Richard Guenther
  2003-03-18 15:07   ` Richard Guenther
  2003-03-18 16:35 ` Albert Chin
  2 siblings, 2 replies; 41+ messages in thread
From: Michael Matz @ 2003-03-18 13:21 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc

Hi Richard,

On Tue, 18 Mar 2003, Richard Guenther wrote:

> The picture does change if I specify -fno-default-inline,

This probably means that the in-class definitions of functions are the
culprit (they are equivalent to being declared inline).  You might
want to play with "--param bla=value", where "bla" is
max-inline-insns-single (300) and max-inline-insns-auto (300).  The
numbers are the default.  Try to make them smaller to limit the inlining.
The first parameter should apply to these functions, but I'm not
sure it does.

Also you might want to create a profilable gcc, and find the hotspot
functions.  expand really shouldn't take _that_ much more time than all
the other passes, even if inlining goes crazy (because normally also some
other passes should take much time then).


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 3.3 compile time regression (22400%)
  2003-03-18 12:28 Richard Guenther
@ 2003-03-18 12:33 ` Karel Gardas
  2003-03-18 13:22   ` Michael Matz
  2003-03-18 13:21 ` Michael Matz
  2003-03-18 16:35 ` Albert Chin
  2 siblings, 1 reply; 41+ messages in thread
From: Karel Gardas @ 2003-03-18 12:33 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc

On Tue, 18 Mar 2003, Richard Guenther wrote:

> Hi!
>
> I'm experiencing a huge compile-time regression with today's 3.3 compared
> to 3.2 when compiling the POOMA library. -ftime-report shows that the
> culprit is expand:
>
> gcc-3.2:
>
>  expand           :   0.83 ( 8%) usr   0.04 ( 8%) sys   0.94 ( 9%) wall
>
> gcc-3.3:
>
>  expand           :2151.80 (96%) usr  11.35 (44%) sys2181.19 (95%) wall
>
> This is with -O2 -fomit-frame-pointer -funroll-loops -ftime-report
>

Isn't this the same as in the case of MICO?

http://gcc.gnu.org/ml/gcc/2003-02/msg02020.html  - report
http://gcc.gnu.org/ml/gcc/2003-03/msg00083.html  - flat profile provided (cc1plus)

and here is the answer from Zack Weinberg:

http://gcc.gnu.org/ml/gcc/2003-03/msg00101.html

> The picture does change if I specify -fno-default-inline, in this case

That's interesting. I'll give it a try.

Cheers,

Karel
--
Karel Gardas                  kgardas@objectsecurity.com
ObjectSecurity Ltd.           http://www.objectsecurity.com

^ permalink raw reply	[flat|nested] 41+ messages in thread

* 3.3 compile time regression (22400%)
@ 2003-03-18 12:28 Richard Guenther
  2003-03-18 12:33 ` Karel Gardas
                   ` (2 more replies)
  0 siblings, 3 replies; 41+ messages in thread
From: Richard Guenther @ 2003-03-18 12:28 UTC (permalink / raw)
  To: gcc

Hi!

I'm experiencing a huge compile-time regression with today's 3.3 compared
to 3.2 when compiling the POOMA library. -ftime-report shows that the
culprit is expand:

gcc-3.2:

 expand           :   0.83 ( 8%) usr   0.04 ( 8%) sys   0.94 ( 9%) wall

gcc-3.3:

 expand           :2151.80 (96%) usr  11.35 (44%) sys2181.19 (95%) wall

This is with -O2 -fomit-frame-pointer -funroll-loops -ftime-report

The picture does change if I specify -fno-default-inline; in this case
compile times are comparable (expand time shrinks to 1.33s).
The complete build is then also faster with gcc 3.3 (3m17) than with
gcc 3.2 (4m5).
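
For anyone who wants to repeat the comparison, a sketch of how the expand
row can be pulled out of the -ftime-report output (which GCC prints on
stderr). The function name expand_usr_seconds, the compiler name g++ and
the file testcase.ii are placeholders; the flags are the ones listed above.

```sh
# Extract the user-time seconds of the "expand" row from -ftime-report
# output read on standard input. The pattern also copes with rows where
# the number directly follows the colon, as in the gcc-3.3 line above.
expand_usr_seconds() {
    sed -n 's/^ *expand *: *\([0-9.][0-9.]*\).*/\1/p'
}

# Example (not run here):
#   g++ -O2 -fomit-frame-pointer -funroll-loops -ftime-report \
#       -c testcase.ii -o /dev/null 2>&1 | expand_usr_seconds
#   # then the same command again with -fno-default-inline added
```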

I can provide a multi-megabyte .ii file as testcase, but maybe somebody
already has an idea what is happening.

Richard.

--
Richard Guenther <richard.guenther@uni-tuebingen.de>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2003-03-25 14:18 UTC | newest]

Thread overview: 41+ messages
     [not found] <Pine.LNX.4.44.0303181300580.22005-100000@bellatrix.tat.physik.uni-tuebingen.de>
2003-03-18 17:04 ` 3.3 compile time regression (22400%) Steven Bosscher
2003-03-18 17:31   ` Richard Guenther
2003-03-22 15:13 ` Steven Bosscher
2003-03-22 17:03 ` Steven Bosscher
2003-03-23  4:30   ` Janis Johnson
2003-03-23 22:14     ` Richard Guenther
2003-03-24  3:26       ` Richard Guenther
2003-03-24  8:47         ` Steven Bosscher
2003-03-24 10:06           ` Richard Guenther
2003-03-24 10:50           ` 3.3 Bootstrap failure (was Re: 3.3 compile time regression (22400%)) Richard Guenther
2003-03-24 15:21           ` 3.3 compile time regression (22400%) Richard Guenther
2003-03-25  2:27     ` Janis Johnson
2003-03-25 15:11       ` Richard Guenther
2003-03-19 23:19 S. Bosscher
  -- strict thread matches above, loose matches on Subject: below --
2003-03-18 22:57 John David Anglin
2003-03-18 23:54 ` Albert Chin
2003-03-19  0:29 ` Michael Matz
2003-03-19  0:40   ` John David Anglin
2003-03-19  0:40   ` Steven Bosscher
2003-03-19  1:20     ` John David Anglin
2003-03-20  6:23     ` Albert Chin
2003-03-19 20:12   ` John David Anglin
2003-03-18 12:28 Richard Guenther
2003-03-18 12:33 ` Karel Gardas
2003-03-18 13:22   ` Michael Matz
2003-03-18 14:43     ` Karel Gardas
2003-03-18 13:21 ` Michael Matz
2003-03-18 14:25   ` Richard Guenther
2003-03-18 14:50     ` Karel Gardas
2003-03-18 15:19       ` Karel Gardas
2003-03-18 15:26         ` Richard Guenther
2003-03-18 15:33         ` Richard Guenther
2003-03-18 15:07   ` Richard Guenther
2003-03-18 15:25     ` Michael Matz
2003-03-18 15:09       ` Richard Guenther
2003-03-18 15:19         ` Michael Matz
2003-03-18 15:12       ` Richard Guenther
2003-03-18 15:20         ` Michael Matz
2003-03-18 15:50     ` Karel Gardas
2003-03-18 16:31       ` Richard Guenther
2003-03-18 16:35 ` Albert Chin
