public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Re: profiling wierdness
@ 1998-09-25 19:05 Mike Stump
  0 siblings, 0 replies; 2+ messages in thread
From: Mike Stump @ 1998-09-25 19:05 UTC (permalink / raw)
  To: egcs, steffend

> Date: Wed, 23 Sep 1998 15:54:24 -0600
> From: Dave Steffen <steffend@helicon.physics.colostate.edu>

This is benchmarking 101 stuff...  Not appropriate for egcs,
and I probably shouldn't respond...

> 	I've got some numerical code I'm trying (desperately) to speed
> up. I'm compiling with 

> 	c++ -ansi -pg ....

> and (right now) am using no optimization, so I can tell what
> improvements I'm getting because of improving the algorithm.

Benchmarking and trying to speed up -O0 programs I don't think is as
useful or meaningful as speeding up -O3 programs.  I'd recompile.

> 	So I run my program:
> helicon: kubo -XYHxyh -e ".1" -O "-5 5 .1" -P .66
> 	And I profile:
> helicon: gprof kubo | c++filt > kubo.prof
> 	And I get
>   %   cumulative   self              self     total           
>  time   seconds   seconds    calls  us/call  us/call  name    
>  56.86      0.29     0.29 19000000     0.02     0.02  double 
> dot_product<Sparse_
> vector_STLvec<double>, double>(Sparse_vector_STLvec<double> const &, double 
> cons
> t *)
>  15.69      0.37     0.08 19001940     0.00     0.00  vector<ai<int, double>, 

You cannot meaningfully time routines this way when they run for too
little time.  It appears this is too little to measure.  Try numbers
for 100 seconds of run, then 50, then 25, then 8, then 4, 2, 1...  The
point where your numbers start to diverge and become unpredictable is
the point that you can't measure past with this method.

If you can't make your program consume more time, it isn't meaningful
to speed up the program.  :-)

> 	And in case you're wondering, 'time'ing the run gives results
> like "0.11user 47.19system 0:47.30elapsed 99%CPU", and these numbers
> are very consistent. Also, the code executes correctly and generates
> identical output for all the above runs.

That means that measuring the time it takes that way _is_ valid and it
is long enough.

> 	SO: I'm very confused. Does anyone know what's going on? Is
> there any way for me to get reasonably accurate profiling
> information?

You can either learn to lengthen runs or learn to use different
techniques.  Personally, I like to use tick counters on the chips to
measure runtimes, from within the software.  If you do this, then
you'll find that you can tell the difference between running 3 machine
instructions or 4, no other method will do that, unless you put them
in a loop and do it more than once.  I believe that your chip have
tock counters you can use for this type of timing, you just have to
learn how to use them.

^ permalink raw reply	[flat|nested] 2+ messages in thread

* profiling wierdness
@ 1998-09-23 15:55 Dave Steffen
  0 siblings, 0 replies; 2+ messages in thread
From: Dave Steffen @ 1998-09-23 15:55 UTC (permalink / raw)
  To: egcs

Hi Folks,

First for the trivia: 

platform is a dual Pentium-II system running Linux (Redhat 5.1)
compiler is egcs-1.1b (Reading specs from /opt/egcs/lib/gcc-lib/i686-pc-linux-g
nu/egcs-2.91.57/specs gcc version egcs-2.91.57 19980901 (egcs-1.1 release))

Now for the problem:

	I've got some numerical code I'm trying (desperately) to speed
up. I'm compiling with 

	c++ -ansi -pg ....

and (right now) am using no optimization, so I can tell what
improvements I'm getting because of improving the algorithm.

	So I run my program:
helicon: kubo -XYHxyh -e ".1" -O "-5 5 .1" -P .66
	And I profile:
helicon: gprof kubo | c++filt > kubo.prof
	And I get
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  us/call  us/call  name    
 56.86      0.29     0.29 19000000     0.02     0.02  double 
dot_product<Sparse_
vector_STLvec<double>, double>(Sparse_vector_STLvec<double> const &, double 
cons
t *)
 15.69      0.37     0.08 19001940     0.00     0.00  vector<ai<int, double>, 
__default_alloc_template<true, 0> >::end(void) const

	and so forth, for a total run time of .51 seconds. Which I
don't believe, so I run my program again, and profile again:

helicon: kubo -XYHxyh -e ".1" -O "-5 5 .1" -P .66
helicon: gprof kubo | c++filt > kubo.prof

	This time I get  %   cumulative   self              self     total           
 time   seconds   seconds    calls  us/call  us/call  name    
 66.67      0.02     0.02        9  2222.22  2222.22  C_matrix<double>
::set(double const &)
 33.33      0.03     0.01     1440     6.94     6.94  ai<int, double>::ai(int 
co nst &, double const &)

	and so forth, for for a total run time of .03 secons, which I
don't believe either. Note that the #1 user of time is a different
function. So I repeat the process; this time I get

  %   cumulative   self              self     total           
 time   seconds   seconds    calls  us/call  us/call  name    
 44.21      0.42     0.42 19000000     0.02     0.04  double 
dot_product<Sparse_
vector_STLvec<double>, double>(Sparse_vector_STLvec<double> const &, double 
const *)
 15.79      0.57     0.15 30600640     0.00     0.00  ai<int, double>
::index(void) const

	and so forth, for a total run time of .95 seconds.

Hmm...

	Explicity deleting 'gmon.out' between runs doesn't seem to
help; on three runs, gprof gives me total run times of .36, .00
(apparently nothing executed?), and .01.

	And in case you're wondering, 'time'ing the run gives results
like "0.11user 47.19system 0:47.30elapsed 99%CPU", and these numbers
are very consistent. Also, the code executes correctly and generates
identical output for all the above runs.

	Compiling with "-p" instead of "-pg" gives similar results:
.99,.93, and .01 seconds for 3 runs. Compiling with "-profile" results
in a seg fault when I execute the program (not when compiling, though;
the compilation appears to be successful.)

	SO: I'm very confused. Does anyone know what's going on? Is
there any way for me to get reasonably accurate profiling information?

Any help would be greatly appreciated. Thanks.




^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~1998-09-25 19:05 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
1998-09-25 19:05 profiling wierdness Mike Stump
  -- strict thread matches above, loose matches on Subject: below --
1998-09-23 15:55 Dave Steffen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).