* Why is dynamic_cast so dam slow?
From: Kevin Atkinson @ 1999-12-26 13:15 UTC
To: gcc
When I was profiling one of my programs I discovered that a good deal of
the CPU time was spent in dcast or type_info calls. I have attached the
relevant parts of the gprof output. I have several questions:
1) Why is it so slow?
2) Why does there even need to be a function call for dynamic_cast?
Can't gcc get the relevant information from looking at the vtables in an
inline function?
3) Will this be improved with the new ABI?
Thanks again. For now I solved the problem by storing the necessary
information directly in the base object and simply checking with it and
then doing a static_cast instead of calling a dynamic_cast. My program
ran a good 20% faster by eliminating the time-consuming dynamic_casts.
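A minimal sketch of the kind of workaround described above, assuming a small hypothetical hierarchy. The names Kind, WordSetBase, WritableSet, BasicSet and as_writable are illustrative only and are not taken from the actual program; the real code may well look different.

```cpp
#include <cassert>

// Hypothetical hierarchy; the names are illustrative, not the real classes.
enum Kind { KIND_BASIC, KIND_WRITABLE };

class WordSetBase {
public:
    explicit WordSetBase(Kind k) : kind_(k) {}
    virtual ~WordSetBase() {}
    Kind kind() const { return kind_; }
private:
    Kind kind_;   // the "necessary information", set once by the derived constructor
};

class WritableSet : public WordSetBase {
public:
    WritableSet() : WordSetBase(KIND_WRITABLE), added(0) {}
    void add() { ++added; }
    int added;
};

class BasicSet : public WordSetBase {
public:
    BasicSet() : WordSetBase(KIND_BASIC) {}
};

// Tag check followed by static_cast. Returns null when the object is not a
// WritableSet. This only works because the inheritance here is single and
// non-virtual, so static_cast is a valid downcast.
inline WritableSet* as_writable(WordSetBase* p) {
    return (p && p->kind() == KIND_WRITABLE) ? static_cast<WritableSet*>(p)
                                             : nullptr;
}
```

The check is a single integer comparison that the compiler can inline, but the tag has to be kept correct by hand for every class in the hierarchy.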
This was compiled with gcc 2.95.1 with "-O2 -g -pg". If this is due to
a KNOWN bug that you KNOW was fixed in gcc 2.95.2, let me know and I
will upgrade and try again.
--
Kevin Atkinson
kevinatk@home.com
http://metalab.unc.edu/kevina/
Flat profile:
Each sample counts as 0.01 seconds.
     % cumulative    self              self    total
  time   seconds  seconds    calls  ms/call  ms/call  name
  8.22      0.44     0.44                             __class_type_info::dcast(type_info const &, int, void *, type_info const *, void *) const
  7.85      0.86     0.42    22419     0.02     0.02  autil::edit_distance(char const *, char const *, char const *, char const *, autil::EditDistanceWeights const &)
  5.23      1.14     0.28                             type_info::operator==(type_info const &) const
  4.86      1.40     0.26   667237     0.00     0.00  basic_string<char, string_char_traits<char>, __default_alloc_template<true, 0> >::replace(unsigned int, unsigned int, char const *, unsigned int)
  4.86      1.66     0.26   102137     0.00     0.02  aspell_default_suggest::Working::try_sound(basic_string<char, string_char_traits<char>, __default_alloc_template<true, 0> > const &, double)
  4.67      1.91     0.25   204274     0.00     0.00  aspell_default_writable_wl::WritableWS::words_w_soundslike(char const *) const
  4.67      2.16     0.25   144713     0.00     0.00  autil::FastEditDistanceStructure<autil::FastEditDistanceDoNothing, aspell_default_suggest::ToLower>::FastEditDistanceStructure(char const *, char const *, autil::FastEditDistanceDoNothing const &, aspell_default_suggest::ToLower const &)
  4.30      2.39     0.23   614484     0.00     0.00  basic_string<char, string_char_traits<char>, __default_alloc_template<true, 0> >::compare(basic_string<char, string_char_traits<char>, __default_alloc_template<true, 0> > const &, unsigned int, unsigned int) const
  3.93      2.60     0.21                             __si_type_info::dcast(type_info const &, int, void *, type_info const *, void *) const
  3.18      2.77     0.17   102137     0.00     0.00  autil::VectorHashTable<aspell_default_readonly_ws::ReadOnlyWS::SoundslikeLookupParms>::find_item(char const *const &) const
  2.99      2.93     0.16  1231135     0.00     0.00  aspell_default_suggest::compare(aspell_default_suggest::ScoreWordSound const &, aspell_default_suggest::ScoreWordSound const &)
  2.99      3.09     0.16   289426     0.00     0.00  aspell_default_suggest::ScoreWordSound::~ScoreWordSound(void)
  2.62      3.23     0.14   147107     0.00     0.00  slist<aspell_default_suggest::ScoreWordSound, allocator<aspell_default_suggest::ScoreWordSound> >::merge(slist<aspell_default_suggest::ScoreWordSound, allocator<aspell_default_suggest::ScoreWordSound> > &)
  2.43      3.36     0.13                             __dynamic_cast
  2.34      3.48     0.12                             __builtin_new
  2.24      3.60     0.12   246736     0.00     0.00  autil::MakeMultiVirEmulation<aspell::ws::InnerParms, aspell::ws::OuterParms>::next(void)
  1.68      3.69     0.09  1924083     0.00     0.00  autil::ClonePtr<autil::VirEmulation<char const *> >::~ClonePtr(void)
  1.68      3.79     0.09                             __user_type_info::dcast(type_info const &, int, void *, type_info const *, void *) const
  1.50      3.87     0.08   817096     0.00     0.00  autil::ClonePtr<autil::VirEmulation<char const *> >::operator=(autil::ClonePtr<autil::VirEmulation<char const *> > const &)
  1.50      3.94     0.08   252068     0.00     0.00  basic_string<char, string_char_traits<char>, __default_alloc_template<true, 0> >::replace(unsigned int, unsigned int, unsigned int, char)
index % time    self  children  called         name
                                                   <spontaneous>
[14]     8.2    0.44      0.00                 __class_type_info::dcast(type_info const &, int, void *, type_info const *, void *) const [14]
-----------------------------------------------
                                                   <spontaneous>
[19]     5.2    0.28      0.00                 type_info::operator==(type_info const &) const [19]
-----------------------------------------------
                                                   <spontaneous>
[24]     4.5    0.13      0.11                 __dynamic_cast [24]
                0.04      0.00  425226/425226      aspell::BasicWordSet type_info function [48]
                0.03      0.00  833777/833779      aspell::WordSet type_info function [55]
                0.01      0.00  209834/209834      aspell_default_readonly_ws::ReadOnlyWS type_info function [70]
                0.01      0.00  204275/204275      aspell_default_writable_repl::WritableReplS type_info function [71]
                0.01      0.00  419668/419668      aspell_default_writable_wl::WritableWS type_info function [79]
                0.01      0.00  408549/408549      aspell::BasicReplacementSet type_info function [80]
                0.00      0.00            2/2      aspell::WritableWordSet type_info function [756]
-----------------------------------------------
                                                   <spontaneous>
[26]     3.9    0.21      0.00                 __si_type_info::dcast(type_info const &, int, void *, type_info const *, void *) const [26]
* Re: Why is dynamic_cast so dam slow?
From: Martin v. Loewis @ 1999-12-26 16:01 UTC
To: kevinatk; +Cc: gcc
> 1) Why is it so slow?
It needs to walk the entire class hierarchy, potentially dealing with
the following issues:
- multiple inheritance, and adjustment of the pointer by offsets
- virtual inheritance, and finding the vbase pointer inside the object
- ambiguities (i.e. multiple appearances of the same base in an object)
  need to produce a failure. That means that the entire class hierarchy
  must be traversed.
- access control (conversion to private bases) needs to lead to
  failures. Therefore, private and protected inheritance must be
  available at run time as well.
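The cases in the list above can be illustrated with a small sketch. All class and function names here (Base, Left, Right, Other, Both, Secret, Vault and the three helpers) are hypothetical and exist only for this example; the behaviour shown is what the C++ standard requires of dynamic_cast, not a description of how any particular runtime implements it.

```cpp
#include <cassert>

struct Base  { virtual ~Base() {} int b = 0; };
struct Left  : Base { int l = 0; };
struct Right : Base { int r = 0; };
struct Other { virtual ~Other() {} };

// Both contains two distinct Base subobjects (one via Left, one via Right).
struct Both : Left, Right, Other {};

struct Secret { virtual ~Secret() {} };
// Private inheritance: outside code may not treat a Vault as a Secret.
struct Vault : private Secret {
    Secret* as_secret() { return this; }  // the conversion is allowed inside the class
};

// Cross-cast between sibling bases: the runtime has to find the Left
// subobject in the complete object and adjust the pointer by an offset.
inline Left* right_to_left(Right* r) { return dynamic_cast<Left*>(r); }

// Base appears twice in Both, so the target is ambiguous and the cast
// has to fail, yielding a null pointer.
inline Base* other_to_base(Other* o) { return dynamic_cast<Base*>(o); }

// Secret is a private base of Vault, so this downcast has to fail too.
inline Vault* secret_to_vault(Secret* s) { return dynamic_cast<Vault*>(s); }
```

None of these outcomes can be decided from the static types alone, which is why the inheritance graph, including access information, has to be available at run time.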
I don't know whether that justifies it being "so" slow, as that is a
relative judgement.
> 2) Why does there even need to be a function call for dynamic_cast?
> Can't gcc get the relevant information from looking at the vtables in an
> inline function?
No.
Or: make a proposal. Also, I doubt that the function calls make it
slow. Please note that there are virtual calls involved as well,
which specialize the "simple" cases for speed (i.e. single inheritance).
> 3) Will this be improved with the new ABI?
The ABI itself is not slow or fast; implementations of it are. I
think it has provisions for some speed improvements (e.g. vtable
points directly to type information, instead of to a function
returning type information).
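When only an exact type match is needed, the type information can also be compared directly with typeid. The following is a sketch under that assumption, using hypothetical classes Shape, Circle and Dot and a hypothetical helper exact_circle; it is not a general replacement for dynamic_cast, because it rejects objects of classes derived from the target.

```cpp
#include <cassert>
#include <typeinfo>

struct Shape  { virtual ~Shape() {} };
struct Circle : Shape { double radius = 1.0; };
struct Dot    : Circle {};

// Exact-type test: a single type_info comparison, no hierarchy walk.
// Unlike dynamic_cast, it returns null for a Dot, which is derived from
// Circle. The static_cast is valid because Shape is a non-virtual base.
inline Circle* exact_circle(Shape* s) {
    return (s && typeid(*s) == typeid(Circle)) ? static_cast<Circle*>(s)
                                               : nullptr;
}
```

Whether this is actually faster depends on how the implementation compares type_info objects, so it should be measured rather than assumed.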
> Thanks again. For now I solved the problem by storing the necessary
> information directly in the base object and simply checking with it and
> then doing a static_cast instead of calling a dynamic_cast. My program
> ran a good 20% faster by eliminating the time-consuming dynamic_casts.
IMHO, if you have to convert to derived classes frequently and in many
places, something is wrong with the application design. This is, of
course, off-topic here.
Regards,
Martin
* Re: Why is dynamic_cast so dam slow?
From: Nathan Sidwell @ 1999-12-28 3:23 UTC
To: Kevin Atkinson; +Cc: gcc
Kevin Atkinson wrote:
> [stuff about dynamic_cast]
I see Martin has answered these.
> This was compiled with gcc 2.95.1 with "-O2 -g -pg". If this is due to
> a KNOWN bug that you KNOW was fixed in gcc 2.95.2, let me know and I
> will upgrade and try again.
The current CVS tree has a different implementation of dynamic_cast which
a) fixes bugs
b) should be faster in common cases
Would you like to try your benchmarks on the CVS compiler?
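A micro-benchmark along the following lines could be used for such a comparison. It is only a sketch: the hierarchy (Node, Leaf, Inner), the helper names and the use of std::chrono are assumptions made for this example, not the benchmark from the original program, and the timings it gives will vary with compiler, flags and hierarchy shape.

```cpp
#include <chrono>
#include <cstddef>
#include <memory>
#include <vector>

struct Node {
    explicit Node(bool leaf) : is_leaf(leaf) {}
    virtual ~Node() {}
    bool is_leaf;   // hand-maintained tag, standing in for the workaround
};
struct Leaf  : Node { Leaf()  : Node(true)  {} };
struct Inner : Node { Inner() : Node(false) {} };

typedef std::vector<std::unique_ptr<Node>> Nodes;

// Build a mixed population: every third object is a Leaf.
inline Nodes make_nodes(std::size_t n) {
    Nodes v;
    v.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        if (i % 3 == 0) v.emplace_back(new Leaf());
        else            v.emplace_back(new Inner());
    }
    return v;
}

// Count the Leaf objects using dynamic_cast.
inline std::size_t count_dynamic(const Nodes& v) {
    std::size_t hits = 0;
    for (const auto& p : v)
        if (dynamic_cast<Leaf*>(p.get())) ++hits;
    return hits;
}

// Count the Leaf objects using the stored tag.
inline std::size_t count_tagged(const Nodes& v) {
    std::size_t hits = 0;
    for (const auto& p : v)
        if (p->is_leaf) ++hits;
    return hits;
}

// Wall-clock time of one call to f(v), in microseconds; the count computed
// by f is stored in result so the work cannot be discarded.
template <class F>
long long time_us(F f, const Nodes& v, std::size_t& result) {
    auto start = std::chrono::steady_clock::now();
    result = f(v);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Calling time_us once with count_dynamic and once with count_tagged on the same large vector, and checking that both report the same count, gives a rough comparison between two compiler versions.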
nathan
--
Dr Nathan Sidwell :: Computer Science Department :: Bristol University
Never hand someone a gun unless you are sure where they will point it
nathan@acm.org http://www.cs.bris.ac.uk/~nathan/ nathan@cs.bris.ac.uk
Thread overview: 6+ messages
1999-12-26 13:15 Why is dynamic_cast so dam slow? Kevin Atkinson
1999-12-26 16:01 ` Martin v. Loewis
1999-12-31 23:54 ` Martin v. Loewis
1999-12-28 3:23 ` Nathan Sidwell
1999-12-31 23:54 ` Nathan Sidwell
1999-12-31 23:54 ` Kevin Atkinson