public inbox for gcc@gcc.gnu.org
* Re: cc1 hog
@ 1997-10-01 15:35 Mike Stump
  1997-10-07 11:41 ` Jeffrey A Law
  0 siblings, 1 reply; 13+ messages in thread
From: Mike Stump @ 1997-10-01 15:35 UTC (permalink / raw)
  To: egcs

The right way to handle this is to introduce a new dejagnu reporting
type, and use it, then add utilities to monitor and track those
values.

Something like:

PERF: 400000000 gcc.c-torture/compile/900313-1.c,  -O1 compvmsize
PERF: 196.32 gcc.c-torture/compile/900313-1.c,  -O1 comptime
PERF: 47.1 gcc.c-torture/execute/900409-1.c compilation,  -O0 runtime

or more generally:

PERF: %f %s

Where %f is a number (a floating-point double, say) and %s is the name
of the test; lower numbers are better.  The unit is arbitrary and
depends on the name of the test (and is unchanging for a specific
test).  The units used above are maxstack+maxdata in bytes, compile
time in milliseconds, and runtime in milliseconds.

One could then imagine an analysis tool that compares two runs and
reports, test by test, the % increase or decrease...  The harder part
is figuring out how to scale the numbers (sibling importance ranking),
but given a hand-tuned ranking (possibly nonlinear), we could then come
up with an objective this-is-better/this-is-worse answer to a basic
question, such as whether a change improves things or hurts them.  In
fact, if you also scale across host-target combos (ix86xi86 is more
important than rompxspur), then you can collect tons of perf
information from everyone, crunch it, and reduce this black magic we
call maintaining a compiler into more of a science.

We could then objectively answer questions like whether that bool in
cpp should be a char or an int, instead of having experts make
educated guesses.
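
The comparison tool described above could be sketched along these
lines.  This is a minimal illustration in Python assuming the
hypothetical "PERF: %f %s" line format; the function names are
illustrative, not an existing harness utility:

```python
# Sketch of a two-run PERF comparison (hypothetical format: "PERF: <value> <test>").
# Lower values are better; the unit depends on the test name, as described above.

def parse_perf(lines):
    """Map test name -> measured value, ignoring non-PERF lines."""
    results = {}
    for line in lines:
        if line.startswith("PERF: "):
            value, name = line[len("PERF: "):].split(" ", 1)
            results[name.strip()] = float(value)
    return results

def compare(old_lines, new_lines):
    """Yield (test name, percent change) for tests present in both runs.
    Positive percentages mean the new run is worse (the number grew)."""
    old, new = parse_perf(old_lines), parse_perf(new_lines)
    for name in sorted(old.keys() & new.keys()):
        yield name, 100.0 * (new[name] - old[name]) / old[name]
```

A weighted roll-up (the hand-tuned ranking) could then reduce those
per-test percentages to a single better/worse score.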

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: cc1 hog
  1997-10-01 15:35 cc1 hog Mike Stump
@ 1997-10-07 11:41 ` Jeffrey A Law
  1997-10-07 23:14   ` Joel Sherrill
  1997-10-07 23:14   ` Torbjorn Granlund
  0 siblings, 2 replies; 13+ messages in thread
From: Jeffrey A Law @ 1997-10-07 11:41 UTC (permalink / raw)
  To: Mike Stump; +Cc: egcs

  In message < 199710012235.PAA01505@kankakee.wrs.com >you write:
  > The right way to handle this is to introduce a new dejagnu reporting
  > type, and use it, then add utilities to monitor and track those
  > values.
  > 
  > Something like:
  > 
  > PERF: 400000000 gcc.c-torture/compile/900313-1.c,  -O1 compvmsize
  > PERF: 196.32 gcc.c-torture/compile/900313-1.c,  -O1 comptime
  > PERF: 47.1 gcc.c-torture/execute/900409-1.c compilation,  -O0 runtime
  > 
  > or more generally:
  > 
  > PERF: %f %s
Yuk.  I think this is far more complicated than it needs to be, and
it's probably unmanageable.

I'd be happy with something that just capped the amount of memory/time
any particular test needed -- the vast majority of c-torture tests are
small.

jeff


* Re: cc1 hog
  1997-10-07 11:41 ` Jeffrey A Law
@ 1997-10-07 23:14   ` Joel Sherrill
  1997-10-08 21:19     ` Jeffrey A Law
  1997-10-07 23:14   ` Torbjorn Granlund
  1 sibling, 1 reply; 13+ messages in thread
From: Joel Sherrill @ 1997-10-07 23:14 UTC (permalink / raw)
  To: Jeffrey A Law; +Cc: Mike Stump, egcs

On Tue, 7 Oct 1997, Jeffrey A Law wrote:

> 
>   In message < 199710012235.PAA01505@kankakee.wrs.com >you write:
>   > The right way to handle this is to introduce a new dejagnu reporting
>   > type, and use it, then add utilities to monitor and track those
>   > values.
>   > 
>   > Something like:
>   > 
>   > PERF: 400000000 gcc.c-torture/compile/900313-1.c,  -O1 compvmsize
>   > PERF: 196.32 gcc.c-torture/compile/900313-1.c,  -O1 comptime
>   > PERF: 47.1 gcc.c-torture/execute/900409-1.c compilation,  -O0 runtime
>   > 
>   > or more generally:
>   > 
>   > PERF: %f %s
> Yuk.  I think this is far more complicated than it needs to be, and
> it's probably unmanageable.
> 
> I'd be happy with something that just capped the amount of memory/time
> any particular test needed -- the vast majority of c-torture tests are
> small.

The "universal" cap is useful and important.  It may be enough for
c-torture, but other freely available test suites may not be as good a
fit.

I have spent a lot of time running the Ada Compiler Validation suite over
the past year or so on CPU simulators.   There is a fairly large
difference between the longest and shortest tests in this suite.  I have
always regretted that I have had to use a single time/instruction limit
for all tests in the suite.

OTOH it does work and is simpler to manage. :)

--joel



* Re: cc1 hog
  1997-10-07 11:41 ` Jeffrey A Law
  1997-10-07 23:14   ` Joel Sherrill
@ 1997-10-07 23:14   ` Torbjorn Granlund
  1 sibling, 0 replies; 13+ messages in thread
From: Torbjorn Granlund @ 1997-10-07 23:14 UTC (permalink / raw)
  To: law; +Cc: Mike Stump, egcs

I have been meaning to bring this up for a long time.

The current testing framework is good for catching regressions that cause
miscompilations and compiler crashes.  But regressions in code quality will
not be tested for.

I have seen countless examples over the years when a certain optimization is
not effective because some change more or less completely disabled it.  I
actually believe GCC would become much better if we spent a larger fraction
of our time studying a set of small code samples to make sure they give
reasonable code.  But perhaps we could make something semi-automatic?

I don't think trying to set tight time limits for the c-torture/execute
tests would work in practice.  That would be unmanageable.  And as Jeff
points out, most tests are tiny and take zero time.

Instead we could introduce a new test category, c-torture/speed.  Either we
could maintain a database of timing results, or, perhaps run these tests
using two compilers.  One `old' compiler and one `new'.  The test framework
would flag whenever the new compiler generates worse code than the old.
Simple and maintenance-free!

The only problem with the latter approach would be accurate-enough timing.
Some CPUs have great features for cycle-exact timing (alpha, perhaps
pentium, and sparcv9, but only under Linux since slowaris hides the
register), while on other systems we would have to stick to `getrusage' or
`clock'.
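
The two-compiler scheme could look something like the sketch below;
`time_child` and `flag_regression` are illustrative names, and
getrusage is the portable (coarse) timing fallback mentioned above, not
a cycle-exact counter:

```python
import resource
import subprocess

def time_child(cmd):
    """Run cmd and return the CPU seconds it consumed, measured via
    getrusage(RUSAGE_CHILDREN) -- coarse, but available everywhere."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    subprocess.run(cmd, check=True)
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return ((after.ru_utime - before.ru_utime)
            + (after.ru_stime - before.ru_stime))

def flag_regression(old_secs, new_secs, tolerance=0.05):
    """True when code from the `new' compiler runs slower than code
    from the `old' one by more than the noise tolerance."""
    return new_secs > old_secs * (1.0 + tolerance)
```

The harness would build each c-torture/speed test with both compilers,
time both binaries with `time_child`, and report whenever
`flag_regression` fires.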

Torbjorn


* Re: cc1 hog
  1997-10-07 23:14   ` Joel Sherrill
@ 1997-10-08 21:19     ` Jeffrey A Law
  1997-10-09  9:26       ` Joel Sherrill
  0 siblings, 1 reply; 13+ messages in thread
From: Jeffrey A Law @ 1997-10-08 21:19 UTC (permalink / raw)
  To: Joel Sherrill; +Cc: Mike Stump, egcs

  In message < Pine.BSF.3.96.971007153646.19846L-100000@vespucci.advicom.net >you
 write:
  > The "universal" cap is useful and important.  It may be enough for
  > c-torture but other freely available test suites may not be as good a
  > fit.
So true.

  > I have spent a lot of time running the Ada Compiler Validation suite over
  > the past year or so on CPU simulators.   There is a fairly large
  > difference between the longest and shortest tests in this suite.  I have
  > always regretted that I have had to use a single time/instruction limit
  > for all tests in the suite.
Simulators certainly complicate the problem.

We're currently using a pretty gross hack --- we set slow_simulator or
some such, which causes some of the bigger tests in libstdc++/libio to
be scaled down or skipped.

jeff


* Re: cc1 hog
  1997-10-08 21:19     ` Jeffrey A Law
@ 1997-10-09  9:26       ` Joel Sherrill
  0 siblings, 0 replies; 13+ messages in thread
From: Joel Sherrill @ 1997-10-09  9:26 UTC (permalink / raw)
  To: law; +Cc: Mike Stump, egcs

On Wed, 8 Oct 1997, Jeffrey A Law wrote:

>   In message < Pine.BSF.3.96.971007153646.19846L-100000@vespucci.advicom.net >you
>  write:
>   > The "universal" cap is useful and important.  It may be enough for
>   > c-torture but other freely available test suites may not be as good a
>   > fit.
> So true.

It is difficult to set a cap for computationally heavy tests.  The
longest-running (wall-time) tests in the Ada test suites tend to do
"delays", so you can say with some confidence that test X is supposed
to run for about Y seconds.  But I can't say this about the numeric
tests.

>   > I have spent a lot of time running the Ada Compiler Validation suite over
>   > the past year or so on CPU simulators.   There is a fairly large
>   > difference between the longest and shortest tests in this suite.  I have
>   > always regretted that I have had to use a single time/instruction limit
>   > for all tests in the suite.
> Simulators certainly complicate the problem.
> 
> We're currently using a pretty gross hack --- we set slow_simulator or
> some such, which causes some of the bigger tests in libstdc++/libio to
> be scaled down or skipped.

I have thought of identifying a couple of limits and having a table
that says which tests get the high or low limit.  But that takes
test-management time.

--joel



* Re: cc1 hog
  1997-10-01 12:12   ` Robert Lipe
@ 1997-10-01 14:23     ` Joe Buck
  0 siblings, 0 replies; 13+ messages in thread
From: Joe Buck @ 1997-10-01 14:23 UTC (permalink / raw)
  To: Robert Lipe; +Cc: egcs

[ ulimit ]

> Of course, deciding "how big is too big, while still allowing 'big enough'"
> is an exercise left for the astute reader.  I know that I arbitrarily picked
> a number and found that many mysterious c-torture failures were actually
> because the number I picked was too low.

Ah, but if a test successfully completes with a ulimit set one week, and
the next week requires triple the memory, isn't this, if not quite a
regression, something that should be looked at?  Until a few months ago
I ran a Linux box with 12M RAM and 16M swap.  I would have been really
annoyed if that machine could no longer compile the Linux kernel.




* Re: cc1 hog
  1997-10-01 12:40 ` Jim Wilson
@ 1997-10-01 14:02   ` Joe Buck
  0 siblings, 0 replies; 13+ messages in thread
From: Joe Buck @ 1997-10-01 14:02 UTC (permalink / raw)
  To: Jim Wilson; +Cc: neal, egcs

 
> There is one particular testcase in compile (961203-1.c) that may cause
> all VM to be consumed before cc1 fails.

Understood (that one usually dies with infinite recursion, so stack
limits catch it).  One thing to watch for: in the old days, Linux boxes
typically came configured without any virtual memory limits, so Linux
might respond to a 961203-1.c failure by entering a thrashing mode that
finally gets resolved by killing processes at random.  Some users may
still be configured this way ("I don't need no stinking limits!").

Thus it's important for make check to set some kind of virtual memory
limit if none is set; if this is too difficult to do, we at least need
to make sure that we warn people before running the tests (though the
appropriate limit or ulimit command isn't hard).
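
Setting such a limit programmatically is straightforward.  Here is a
sketch of how a harness could impose one before spawning the compiler;
the function name and the default cap are arbitrary examples, not a
recommended value:

```python
import resource
import subprocess

def run_capped(cmd, max_vm_bytes=512 * 1024 * 1024):
    """Run cmd under an address-space limit so a runaway cc1 is killed
    by the OS instead of dragging the machine into thrashing."""
    def set_limit():
        # Runs in the child between fork and exec, so the limit
        # applies only to the spawned compiler, not to the harness.
        resource.setrlimit(resource.RLIMIT_AS, (max_vm_bytes, max_vm_bytes))
    return subprocess.run(cmd, preexec_fn=set_limit).returncode
```

A nonzero return code then shows up as an ordinary test failure, which
is exactly what the limit/ulimit approach achieves by hand.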

> make check takes forever because of the sheer amount of work that is being
> done.

Yes, but if many tests result in a working set that is larger than real
memory, "forever" starts getting more literal.


* Re: cc1 hog
  1997-10-01 10:23 Neal Becker
  1997-10-01 11:14 ` Joe Buck
  1997-10-01 12:39 ` Jim Wilson
@ 1997-10-01 12:40 ` Jim Wilson
  1997-10-01 14:02   ` Joe Buck
  2 siblings, 1 reply; 13+ messages in thread
From: Jim Wilson @ 1997-10-01 12:40 UTC (permalink / raw)
  To: Neal Becker; +Cc: egcs

There is one particular testcase in compile (961203-1.c) that may cause all VM
to be consumed before cc1 fails.  I suspect you happened to look at the cc1
process while this testcase was being compiled.  If so, then this result is not
typical, and hence not particularly useful, because it does not apply to
any other testcase.

make check takes forever because of the sheer amount of work that is being
done.  Running the compiler 20000+ times while interpreting shell scripts
and TCL scripts obviously takes a lot of time.

Jim


* Re: cc1 hog
  1997-10-01 10:23 Neal Becker
  1997-10-01 11:14 ` Joe Buck
@ 1997-10-01 12:39 ` Jim Wilson
  1997-10-01 12:40 ` Jim Wilson
  2 siblings, 0 replies; 13+ messages in thread
From: Jim Wilson @ 1997-10-01 12:39 UTC (permalink / raw)
  To: Neal Becker; +Cc: egcs

There is one particular testcase in compile that may cause all VM to be
consumed before cc1 fails.  I suspect you happened to look at the cc1 process
while this testcase was being compiled.  If so, then this result is not
typical, and hence not particularly useful, because it does not apply to
any other testcase.

make check takes forever because of the sheer amount of work that is being
done.  Running the compiler 20000+ times while interpreting shell scripts
and TCL scripts obviously takes a lot of time.

Jim


* Re: cc1 hog
  1997-10-01 11:14 ` Joe Buck
@ 1997-10-01 12:12   ` Robert Lipe
  1997-10-01 14:23     ` Joe Buck
  0 siblings, 1 reply; 13+ messages in thread
From: Robert Lipe @ 1997-10-01 12:12 UTC (permalink / raw)
  To: egcs

> > Running /src/egcs-970929/gcc/testsuite/gcc.c-torture/compile/compile.exp ...
> >   PID USERNAME PRI NICE   SIZE   RES STATE   TIME   WCPU    CPU COMMAND
> >  3538 neal      -5    0 83388K 42732K sleep   0:12 18.27% 16.34% cc1
> 
> Perhaps we need to enhance dejagnu to add memory limits, and count tests
> as failing if they exceed a certain amount of memory.  Otherwise we won't
> immediately notice when a change suddenly explodes the amount of memory

Because OpenServer handles the 961203 "I'm going to eat every byte of VM
I can find" failure, er, ungracefully, I found that just running runtest
under ksh, so I could use ulimit to cap VM usage, core file size, and
stack size, was a big help.  If the compiler got too big, the OS killed
it and it showed up as a runtest failure -- exactly what I wanted.

Of course, deciding "how big is too big, while still allowing 'big enough'"
is an exercise left for the astute reader.  I know that I arbitrarily
picked a number and found that many mysterious c-torture failures were
actually because the number I picked was too low.

RJL


* Re: cc1 hog
  1997-10-01 10:23 Neal Becker
@ 1997-10-01 11:14 ` Joe Buck
  1997-10-01 12:12   ` Robert Lipe
  1997-10-01 12:39 ` Jim Wilson
  1997-10-01 12:40 ` Jim Wilson
  2 siblings, 1 reply; 13+ messages in thread
From: Joe Buck @ 1997-10-01 11:14 UTC (permalink / raw)
  To: Neal Becker; +Cc: egcs

> 970929 hppa1.1 hpux9.05
> 
> Running /src/egcs-970929/gcc/testsuite/gcc.c-torture/compile/compile.exp ...
>   PID USERNAME PRI NICE   SIZE   RES STATE   TIME   WCPU    CPU COMMAND
>  3538 neal      -5    0 83388K 42732K sleep   0:12 18.27% 16.34% cc1
> 
> No wonder make check takes *forever*.

Perhaps we need to enhance dejagnu to add memory limits, and count tests
as failing if they exceed a certain amount of memory.  Otherwise we won't
immediately notice when a change suddenly explodes the amount of memory
required.




* cc1 hog
@ 1997-10-01 10:23 Neal Becker
  1997-10-01 11:14 ` Joe Buck
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Neal Becker @ 1997-10-01 10:23 UTC (permalink / raw)
  To: egcs

970929 hppa1.1 hpux9.05

Running /src/egcs-970929/gcc/testsuite/gcc.c-torture/compile/compile.exp ...
  PID USERNAME PRI NICE   SIZE   RES STATE   TIME   WCPU    CPU COMMAND
 3538 neal      -5    0 83388K 42732K sleep   0:12 18.27% 16.34% cc1

No wonder make check takes *forever*.


end of thread, other threads:[~1997-10-09  9:26 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
1997-10-01 15:35 cc1 hog Mike Stump
1997-10-07 11:41 ` Jeffrey A Law
1997-10-07 23:14   ` Joel Sherrill
1997-10-08 21:19     ` Jeffrey A Law
1997-10-09  9:26       ` Joel Sherrill
1997-10-07 23:14   ` Torbjorn Granlund
  -- strict thread matches above, loose matches on Subject: below --
1997-10-01 10:23 Neal Becker
1997-10-01 11:14 ` Joe Buck
1997-10-01 12:12   ` Robert Lipe
1997-10-01 14:23     ` Joe Buck
1997-10-01 12:39 ` Jim Wilson
1997-10-01 12:40 ` Jim Wilson
1997-10-01 14:02   ` Joe Buck
