public inbox for gcc@gcc.gnu.org
* Why is dynamic_cast so dam slow?
From: Kevin Atkinson @ 1999-12-26 13:15 UTC
  To: gcc

When I was profiling one of my programs I discovered that a good deal of
the CPU time was spent in dcast or type_info calls.  I have attached the
relevant parts of the gprof output.  I have several questions:

1) Why is it so slow?

2) Why does there even need to be a function call for dynamic_cast?
Can't gcc get the relevant information by looking at the vtables in an
inline function?

3) Will this be improved with the new API?

Thanks again.  For now I solved the problem by storing the necessary
information directly in the base object, checking against it, and then
doing a static_cast instead of calling dynamic_cast.  My program ran a
good 20% faster once the time-consuming dynamic_casts were eliminated.
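
A minimal sketch of the workaround described above, written in modern C++ rather than the gcc 2.95 dialect. The class names, the `Kind` enum, and `as_writable` are hypothetical and are not taken from the aspell sources; the idea is only to show a type tag stored in the base object, a check against it, and a static_cast.

```cpp
#include <cassert>

// Hypothetical hierarchy; the names are illustrative, not aspell's real classes.
enum class Kind { ReadOnly, Writable };

class WordSet {
public:
    explicit WordSet(Kind k) : kind_(k) {}
    virtual ~WordSet() = default;
    Kind kind() const { return kind_; }
private:
    Kind kind_;  // type tag stored directly in the base object
};

class ReadOnlyWS : public WordSet {
public:
    ReadOnlyWS() : WordSet(Kind::ReadOnly) {}
};

class WritableWS : public WordSet {
public:
    WritableWS() : WordSet(Kind::Writable) {}
    int size() const { return 42; }
};

// Check the stored tag, then static_cast; returns nullptr on a mismatch.
inline WritableWS* as_writable(WordSet* ws) {
    return (ws != nullptr && ws->kind() == Kind::Writable)
               ? static_cast<WritableWS*>(ws)
               : nullptr;
}

// Small self-check comparing the tag-based cast with dynamic_cast.
inline void demo_tag_cast() {
    WritableWS w;
    ReadOnlyWS r;
    WordSet* pw = &w;
    WordSet* pr = &r;
    assert(as_writable(pw) == dynamic_cast<WritableWS*>(pw));
    assert(as_writable(pr) == nullptr);
    assert(as_writable(nullptr) == nullptr);
}
```

This only stands in for dynamic_cast when the set of classes is closed and the base is inherited non-virtually; static_cast cannot downcast through a virtual base, and a class derived from `WritableWS` with a different tag would be rejected.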

This was compiled with gcc 2.95.1 with "-O2 -g -pg".  If this is due to
a KNOWN bug that you KNOW was fixed in gcc 2.95.2, let me know and I
will upgrade and try again.

-- 
Kevin Atkinson
kevinatk@home.com
http://metalab.unc.edu/kevina/
 Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  ms/call  ms/call  name    
  8.22      0.44     0.44                             __class_type_info::dcast(type_info const &, int, void *, type_info const *, void *) const
  7.85      0.86     0.42    22419     0.02     0.02  autil::edit_distance(char const *, char const *, char const *, char const *, autil::EditDistanceWeights const &)
  5.23      1.14     0.28                             type_info::operator==(type_info const &) const
  4.86      1.40     0.26   667237     0.00     0.00  basic_string<char, string_char_traits<char>, __default_alloc_template<true, 0> >::replace(unsigned int, unsigned int, char const *, unsigned int)
  4.86      1.66     0.26   102137     0.00     0.02  aspell_default_suggest::Working::try_sound(basic_string<char, string_char_traits<char>, __default_alloc_template<true, 0> > const &, double)
  4.67      1.91     0.25   204274     0.00     0.00  aspell_default_writable_wl::WritableWS::words_w_soundslike(char const *) const
  4.67      2.16     0.25   144713     0.00     0.00  autil::FastEditDistanceStructure<autil::FastEditDistanceDoNothing, aspell_default_suggest::ToLower>::FastEditDistanceStructure(char const *, char const *, autil::FastEditDistanceDoNothing const &, aspell_default_suggest::ToLower const &)
  4.30      2.39     0.23   614484     0.00     0.00  basic_string<char, string_char_traits<char>, __default_alloc_template<true, 0> >::compare(basic_string<char, string_char_traits<char>, __default_alloc_template<true, 0> > const &, unsigned int, unsigned int) const
  3.93      2.60     0.21                             __si_type_info::dcast(type_info const &, int, void *, type_info const *, void *) const
  3.18      2.77     0.17   102137     0.00     0.00  autil::VectorHashTable<aspell_default_readonly_ws::ReadOnlyWS::SoundslikeLookupParms>::find_item(char const *const &) const
  2.99      2.93     0.16  1231135     0.00     0.00  aspell_default_suggest::compare(aspell_default_suggest::ScoreWordSound const &, aspell_default_suggest::ScoreWordSound const &)
  2.99      3.09     0.16   289426     0.00     0.00  aspell_default_suggest::ScoreWordSound::~ScoreWordSound(void)
  2.62      3.23     0.14   147107     0.00     0.00  slist<aspell_default_suggest::ScoreWordSound, allocator<aspell_default_suggest::ScoreWordSound> >::merge(slist<aspell_default_suggest::ScoreWordSound, allocator<aspell_default_suggest::ScoreWordSound> > &)
  2.43      3.36     0.13                             __dynamic_cast
  2.34      3.48     0.12                             __builtin_new
  2.24      3.60     0.12   246736     0.00     0.00  autil::MakeMultiVirEmulation<aspell::ws::InnerParms, aspell::ws::OuterParms>::next(void)
  1.68      3.69     0.09  1924083     0.00     0.00  autil::ClonePtr<autil::VirEmulation<char const *> >::~ClonePtr(void)
  1.68      3.79     0.09                             __user_type_info::dcast(type_info const &, int, void *, type_info const *, void *) const
  1.50      3.87     0.08   817096     0.00     0.00  autil::ClonePtr<autil::VirEmulation<char const *> >::operator=(autil::ClonePtr<autil::VirEmulation<char const *> > const &)
  1.50      3.94     0.08   252068     0.00     0.00  basic_string<char, string_char_traits<char>, __default_alloc_template<true, 0> >::replace(unsigned int, unsigned int, unsigned int, char)

index % time    self  children    called     name
                                                 <spontaneous>
[14]     8.2    0.44    0.00                 __class_type_info::dcast(type_info const &, int, void *, type_info const *, void *) const [14]
-----------------------------------------------
                                                 <spontaneous>
[19]     5.2    0.28    0.00                 type_info::operator==(type_info const &) const [19]
-----------------------------------------------
                                                 <spontaneous>
[24]     4.5    0.13    0.11                 __dynamic_cast [24]
                0.04    0.00  425226/425226      aspell::BasicWordSet type_info function [48]
                0.03    0.00  833777/833779      aspell::WordSet type_info function [55]
                0.01    0.00  209834/209834      aspell_default_readonly_ws::ReadOnlyWS type_info function [70]
                0.01    0.00  204275/204275      aspell_default_writable_repl::WritableReplS type_info function [71]
                0.01    0.00  419668/419668      aspell_default_writable_wl::WritableWS type_info function [79]
                0.01    0.00  408549/408549      aspell::BasicReplacementSet type_info function [80]
                0.00    0.00       2/2           aspell::WritableWordSet type_info function [756]
-----------------------------------------------
                                                 <spontaneous>
[26]     3.9    0.21    0.00                 __si_type_info::dcast(type_info const &, int, void *, type_info const *, void *) const [26]
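
As a rough cross-check of the figures above, the self-time percentages of the RTTI-related rows in the flat profile can be added up. The sketch below assumes that every `type_info::operator==` call originates from a dynamic_cast, which the profile cannot confirm because those entries are marked `<spontaneous>`, and it leaves out the per-class "type_info function" entries called from `__dynamic_cast`. The struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Self-time percentages copied from the flat profile above.
struct ProfileRow { const char* name; double percent; };

constexpr ProfileRow kRttiRows[] = {
    {"__class_type_info::dcast", 8.22},
    {"type_info::operator==",    5.23},
    {"__si_type_info::dcast",    3.93},
    {"__dynamic_cast",           2.43},
    {"__user_type_info::dcast",  1.68},
};

// Sum of the listed rows, in percent of total sampled time.
inline double rtti_self_percent() {
    double total = 0.0;
    for (const ProfileRow& row : kRttiRows) total += row.percent;
    return total;
}
```

The rows add up to 21.49% of the sampled time, which is in line with the roughly 20% speedup reported in the message.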


* Re: Why is dynamic_cast so dam slow?
From: Martin v. Loewis @ 1999-12-26 16:01 UTC
  To: kevinatk; +Cc: gcc

> 1) Why is it so slow?

It needs to walk the entire class hierarchy, potentially dealing with
the following issues:
- multiple inheritance, and adjustment of the pointer by offsets
- virtual inheritance, and finding the vbase pointer inside the object
- ambiguities (i.e. multiple appearances of the same base in an object)
  need to produce a failure. That means that the entire class hierarchy
  must be traversed.
- access control (conversion to private bases) needs to lead to
  failures. Therefore, private and protected inheritance must be
  available at run time as well.
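
Three of the cases listed above (pointer adjustment under multiple inheritance, ambiguity, and access control) can be observed directly. The sketch below uses hypothetical classes invented for illustration and modern C++ syntax; virtual inheritance is left out to keep it short.

```cpp
#include <cassert>

// Hypothetical classes, chosen only to exercise the cases listed above.
struct Base  { virtual ~Base() = default; int b = 0; };
struct Left  : Base { int l = 0; };
struct Right : Base { int r = 0; };
struct Other { virtual ~Other() = default; int o = 0; };

// Multiple inheritance with two separate Base subobjects (no virtual bases).
struct Multi : Left, Right, Other {};

// Other is a private base here, so casts that start from it must fail.
struct Hidden : Left, private Other {
    Other* as_other() { return this; }  // conversion is legal inside the class
};

inline void demo_dynamic_cast_cases() {
    Multi m;
    Other* o = &m;

    // Cross-cast: succeeds, and the result is adjusted to the Left subobject.
    Left* l = dynamic_cast<Left*>(o);
    assert(l == static_cast<Left*>(&m));
    assert(static_cast<void*>(l) != static_cast<void*>(o));

    // Ambiguity: Multi contains two Base subobjects, so the cast yields null.
    assert(dynamic_cast<Base*>(o) == nullptr);

    // Access control: Other is a private base of Hidden, so both casts fail.
    Hidden h;
    Other* ho = h.as_other();
    assert(dynamic_cast<Left*>(ho) == nullptr);
    assert(dynamic_cast<Hidden*>(ho) == nullptr);
}
```

None of these outcomes can be decided from the static types alone, which is why the runtime has to inspect the inheritance graph of the most derived object.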

I don't know whether that justifies it being "so" slow, as that is a
relative judgement.

> 2) Why does there even need to be a function call for dynamic_cast?
> Can't gcc get the relevant information by looking at the vtables in an
> inline function?

No.

Or: make a proposal. Also, I doubt that the function calls make it
slow.  Please note that there are virtual calls involved as well,
which specialize the "simple" cases for speed (i.e. single inheritance).

> 3) Will this be improved with the new API?

The ABI itself is not slow or fast; implementations of it are.  I
think it has provisions for some speed improvements (e.g. the vtable
points directly to the type information, instead of to a function
returning type information).
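
For the narrower case where the object is expected to be exactly of the target class, a check can be written in portable C++ by comparing `typeid` results and then using static_cast. This is a sketch under that assumption, not a description of what gcc or the new ABI does; `Shape`, `Circle`, `Ring`, and `exact_cast` are hypothetical names, and whether it is actually faster depends on the implementation.

```cpp
#include <cassert>
#include <typeinfo>

struct Shape  { virtual ~Shape() = default; };
struct Circle : Shape { double radius = 1.0; };
struct Ring   : Circle { double inner = 0.5; };

// Exact-type cast: succeeds only when the dynamic type is precisely T.
// static_cast is valid here because T derives non-virtually from Shape.
template <class T>
T* exact_cast(Shape* s) {
    return (s != nullptr && typeid(*s) == typeid(T)) ? static_cast<T*>(s)
                                                     : nullptr;
}

inline void demo_exact_cast() {
    Circle c;
    Ring r;
    assert(exact_cast<Circle>(&c) == &c);
    // A Ring is-a Circle, but its dynamic type is not exactly Circle.
    assert(exact_cast<Circle>(&r) == nullptr);
    assert(dynamic_cast<Circle*>(static_cast<Shape*>(&r)) == &r);
}
```

Unlike dynamic_cast, this rejects objects of classes derived from the target, so it is a replacement only where that behaviour is acceptable.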

> Thanks again.  For now I solved the problem by storing the necessary
> information directly in the base object, checking against it, and then
> doing a static_cast instead of calling dynamic_cast.  My program ran a
> good 20% faster once the time-consuming dynamic_casts were eliminated.

IMHO, if you have to convert to derived classes frequently and in many
places, something is wrong with the application design. This is, of
course, off-topic here.

Regards,
Martin


* Re: Why is dynamic_cast so dam slow?
From: Nathan Sidwell @ 1999-12-28  3:23 UTC
  To: Kevin Atkinson; +Cc: gcc

Kevin Atkinson wrote:
> [stuff about dynamic_cast]
I see Martin has answered these.

> This was compiled with gcc 2.95.1 with "-O2 -g -pg".  If this is due to
> a KNOWN bug that you KNOW was fixed in gcc 2.95.2, let me know and I
> will upgrade and try again.
The current CVS tree has a different implementation of dynamic_cast, which
a) fixes bugs
b) should be faster in common cases

Would you like to try your benchmarks on the CVS compiler?

nathan
-- 
Dr Nathan Sidwell :: Computer Science Department :: Bristol University
Never hand someone a gun unless you are sure where they will point it
nathan@acm.org  http://www.cs.bris.ac.uk/~nathan/  nathan@cs.bris.ac.uk


