public inbox for gcc@gcc.gnu.org
* performance of exception handling
@ 2020-05-11  8:14 Thomas Neumann
  2020-05-11 10:40 ` Florian Weimer
From: Thomas Neumann @ 2020-05-11  8:14 UTC
  To: gcc

Hi,

I want to improve the performance of C++ exception handling, and I would
like to get some feedback on how to tackle that.

Currently, exception handling scales poorly due to global mutexes when
throwing. This can be seen with a small demo script here:
https://repl.it/repls/DeliriousPrivateProfiler
Using a thread count >1 is much slower than running single-threaded.
This global locking is particularly painful on a machine with more than a
hundred cores, as mutexes are expensive there and contention becomes
much more likely due to the high degree of parallelism.
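
The linked demo is not reproduced here. As a rough sketch of that kind of test, assuming nothing beyond the standard library, the following throws and catches exceptions in a configurable number of threads. The names run_throwing_threads and seconds_for are made up for illustration and are not taken from the demo.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <stdexcept>
#include <thread>
#include <vector>

// Each worker throws and catches `iterations` exceptions.
// Returns the total number of exceptions caught across all workers.
std::size_t run_throwing_threads(unsigned threads, std::size_t iterations) {
    assert(threads > 0);
    std::atomic<std::size_t> caught{0};
    auto worker = [&caught, iterations] {
        std::size_t local = 0;
        for (std::size_t i = 0; i < iterations; ++i) {
            try {
                throw std::runtime_error("demo");
            } catch (const std::runtime_error&) {
                ++local;  // every throw goes through the unwinder
            }
        }
        caught.fetch_add(local, std::memory_order_relaxed);
    };
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t) pool.emplace_back(worker);
    for (auto& th : pool) th.join();
    return caught.load();
}

// Wall-clock seconds needed for the run above (hypothetical helper).
double seconds_for(unsigned threads, std::size_t iterations) {
    auto start = std::chrono::steady_clock::now();
    run_throwing_threads(threads, iterations);
    std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Comparing seconds_for(1, n) with seconds_for(k, n) for growing k is one way to observe the scaling behaviour described above. Each thread does the same amount of work, so with perfect scaling the wall-clock time would stay roughly flat.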

Of course, conventional wisdom is not to use exceptions when exceptions
can occur somewhat frequently. But I think that is a silly argument; see
the WG21 paper P0709 for a detailed discussion. In particular, there is
no technical reason why they have to be slow. It is just the current
implementation that is slow.

In the current gcc implementation on Linux the bottleneck is
_Unwind_Find_FDE, or more precisely, the function dl_iterate_phdr,
which is called for every frame and iterates over all shared
libraries while holding a global lock.
That is inherently slow, both due to global locking and due to the data
structures involved.
And it is not easy to speed that up with, e.g., a thread local cache, as
glibc has no mechanism to notify us if a shared library is added or removed.
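
To illustrate the kind of walk involved, here is a small sketch that uses dl_iterate_phdr to visit every loaded object and check for a PT_GNU_EH_FRAME program header. It assumes Linux with glibc. The EhFrameScan struct and the function names are hypothetical, and this is not the code libgcc actually runs.

```cpp
#include <link.h>   // dl_iterate_phdr, dl_phdr_info, PT_GNU_EH_FRAME (glibc)
#include <cassert>
#include <cstddef>

// Hypothetical result type: what one pass over the loaded objects saw.
struct EhFrameScan {
    std::size_t objects = 0;            // loaded objects visited
    std::size_t with_eh_frame_hdr = 0;  // objects carrying PT_GNU_EH_FRAME
};

// Called once per loaded object (main program, shared libraries, vDSO).
static int scan_callback(dl_phdr_info* info, std::size_t /*size*/, void* data) {
    assert(data != nullptr);
    auto* scan = static_cast<EhFrameScan*>(data);
    ++scan->objects;
    for (int i = 0; i < info->dlpi_phnum; ++i) {
        if (info->dlpi_phdr[i].p_type == PT_GNU_EH_FRAME) {
            ++scan->with_eh_frame_hdr;
            break;
        }
    }
    return 0;  // zero means: continue with the next object
}

// One full iteration over all loaded objects.
EhFrameScan scan_loaded_objects() {
    EhFrameScan scan;
    dl_iterate_phdr(scan_callback, &scan);
    return scan;
}
```

An unwinder that performs a walk like this for every frame pays for the whole iteration each time, which is the cost described above.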

We therefore need a way to locate the exception frames that is
independent from glibc. One way to achieve that would be to explicitly
register exception frames with __register_frame_info_bases in a
constructor function (and deregister them in a destructor function).
Of course, probing explicitly registered frames currently uses a global
lock, too, but that implementation is provided by libgcc, and we can
change that to something better, allowing for lock-free reads.
In libgcc explicitly registered frames take precedence over the
dl_iterate_phdr mechanism, which means that we could mix future code
that does call __register_frame_info_bases explicitly with code that
does not. Code that does register will unwind faster than code that does
not, but both can coexist in one process.
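
As a sketch of what lock-free reads could look like, the following keeps registered address ranges in an immutable sorted table that is published through an atomic pointer. All names here (FrameRange, FrameRegistry) are hypothetical. This is not the libgcc data structure, and it simplifies memory reclamation by keeping every old table alive until the registry itself is destroyed.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstdint>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical entry: the PC range covered by one registered object.
struct FrameRange {
    std::uintptr_t begin;
    std::uintptr_t end;    // exclusive
    const void* eh_frame;  // stand-in for that object's unwind data
};

class FrameRegistry {
    using Table = std::vector<FrameRange>;  // sorted by begin
    std::atomic<const Table*> current_{nullptr};
    std::mutex write_mutex_;  // serializes writers only
    std::vector<std::unique_ptr<const Table>> tables_;  // snapshots kept alive

    // Caller holds write_mutex_ (or is the constructor).
    void publish(Table next) {
        tables_.push_back(std::make_unique<const Table>(std::move(next)));
        current_.store(tables_.back().get(), std::memory_order_release);
    }

public:
    FrameRegistry() { publish(Table{}); }

    // Writers copy the current table, modify the copy, and publish it.
    void add(FrameRange r) {
        assert(r.begin < r.end);
        std::lock_guard<std::mutex> lock(write_mutex_);
        Table next(*current_.load(std::memory_order_relaxed));
        auto pos = std::upper_bound(
            next.begin(), next.end(), r,
            [](const FrameRange& a, const FrameRange& b) {
                return a.begin < b.begin;
            });
        next.insert(pos, r);
        publish(std::move(next));
    }

    void remove(std::uintptr_t begin) {
        std::lock_guard<std::mutex> lock(write_mutex_);
        Table next(*current_.load(std::memory_order_relaxed));
        next.erase(std::remove_if(next.begin(), next.end(),
                                  [begin](const FrameRange& r) {
                                      return r.begin == begin;
                                  }),
                   next.end());
        publish(std::move(next));
    }

    // Readers take no lock: one acquire load, then a binary search.
    // Returns nullptr when no registered range contains pc.
    const void* find(std::uintptr_t pc) const {
        const Table* t = current_.load(std::memory_order_acquire);
        auto it = std::upper_bound(
            t->begin(), t->end(), pc,
            [](std::uintptr_t p, const FrameRange& r) { return p < r.begin; });
        if (it == t->begin()) return nullptr;
        --it;
        return pc < it->end ? it->eh_frame : nullptr;
    }
};
```

In this sketch a nullptr result from find corresponds to the case where the code was not explicitly registered, so a real unwinder would then fall back to the dl_iterate_phdr path. A constructor function would call add and the matching destructor function would call remove. A production version would also need a safe way to free old tables once no reader can still be using them.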

Does that sound like a viable strategy to speed up exception handling? I
would be willing to contribute code for that, but I first wanted to know
if you are interested and if the strategy makes sense. Also, my
implementation makes use of atomics, which I hope are available on all
platforms that use unwind-dw2-fde.c, but I am not sure.

Thomas


Thread overview: 29+ messages
2020-05-11  8:14 performance of exception handling Thomas Neumann
2020-05-11 10:40 ` Florian Weimer
2020-05-11 13:59   ` Thomas Neumann
2020-05-11 14:22     ` Florian Weimer
2020-05-11 15:14     ` size of exception handling (Was: performance of exception handling) Moritz Strübe
2020-05-12  7:20       ` Freddie Chopin
2020-05-12  7:47         ` Oleg Endo
2020-05-13  9:13           ` Jonathan Wakely
2020-05-12  9:16         ` size of exception handling Florian Weimer
2020-05-12  9:44           ` Freddie Chopin
2020-05-12 11:11             ` Jonathan Wakely
2020-05-12 11:17             ` Moritz Strübe
2020-05-12 11:29               ` Florian Weimer
2020-05-12 12:01                 ` Moritz Strübe
2020-05-12 11:07         ` size of exception handling (Was: performance of exception handling) Jonathan Wakely
2020-05-12 20:56           ` Freddie Chopin
2020-05-12 22:39             ` Jonathan Wakely
2020-05-12 22:48               ` Jonathan Wakely
2020-05-13  8:04                 ` David Brown
2020-05-12  9:03       ` size of exception handling Florian Weimer
2020-05-11 14:36   ` performance " David Edelsohn
2020-05-11 14:52     ` Florian Weimer
2020-05-11 15:12       ` David Edelsohn
2020-05-11 15:24         ` Florian Weimer
2020-05-12  6:08     ` Thomas Neumann
2020-05-12  7:15       ` Richard Biener
2020-05-12  7:30         ` Thomas Neumann
2020-05-12  9:01       ` Richard Sandiford
2020-05-13  1:13         ` Thomas Neumann
