public inbox for gcc@gcc.gnu.org
From: Florian Weimer <fweimer@redhat.com>
To: Thomas Neumann via Gcc <gcc@gcc.gnu.org>
Cc: Thomas Neumann <tneumann@users.sourceforge.net>
Subject: Re: performance of exception handling
Date: Mon, 11 May 2020 12:40:40 +0200
Message-ID: <874ksmg3fb.fsf@oldenburg2.str.redhat.com>
In-Reply-To: <0bbdaab7-c083-e14e-6227-27713dab9657@users.sourceforge.net> (Thomas Neumann via Gcc's message of "Mon, 11 May 2020 10:14:36 +0200")

* Thomas Neumann via Gcc:

> Currently, exception handling scales poorly due to global mutexes when
> throwing. This can be seen with a small demo script here:
> https://repl.it/repls/DeliriousPrivateProfiler
> Using a thread count >1 is much slower than running single threaded.
> This global locking is particularly painful on a machine with more than a
> hundred cores, as mutexes are expensive there and contention becomes
> much more likely due to the high degree of parallelism.
>
> Of course conventional wisdom is not to use exceptions when exceptions
> can occur somewhat frequently. But I think that is a silly argument, see
> the WG21 paper P0709 for a detailed discussion.

Link: <https://wg21.link/P0709>

I'm not sure if your summary is correct.

The claim in the paper that program bugs should not result in catchable
exceptions also does not match my limited experience with
application servers: They tend to install an exception handler of last
resort to catch unexpected exceptions (“bugs”) from processed requests
and log them, instead of letting them terminate the entire application
server.

> In particular since there is no technical reason why they have to be
> slow, it is just the current implementation that is slow.

I agree, the present state is not inherently due to the exception
handling model, it's a consequence of the current implementation.

> In the current gcc implementation on Linux the bottleneck is
> _Unwind_Find_FDE, or more precisely, the function dl_iterate_phdr,
> that is called for every frame and that iterates over all shared
> libraries while holding a global lock.
> That is inherently slow, both due to global locking and due to the data
> structures involved.

In particular, the libgcc unwinder relies on the global lock to protect
its own cache, so we cannot remove the lock from glibc.

> And it is not easy to speed that up with, e.g., a thread local cache,
> as glibc has no mechanism to notify us if a shared library is added or
> removed.

It is of course possible to change glibc.

My currently preferred solution is to move the entire code
that locates the relevant FDE table into glibc.  This is all the code in
_Unwind_IteratePhdrCallback up to the first read_encoded_value_with_base
call.  The callback mechanism would be gone, so _Unwind_Find_FDE
would call __dl_ehframe_find (see below) and then perform the remaining
processing from _Unwind_IteratePhdrCallback.

The glibc interface would look like this:

/* Data returned by __dl_ehframe_find.  */
struct dl_ehframe_info
{
  /* The link map of the object which contains the address.  */
  const struct link_map *dlehf_map;

  /* A pointer to its dynamic section.  This is a null pointer in
     statically linked applications.  */
  const ElfW(Dyn) *dlehf_dynamic;

  /* A pointer to the start of the PT_GNU_EH_FRAME segment for the
     object.  This is a null pointer if the object does not contain
     such a segment.  */
  const void *dlehf_ehframe;

  /* The size of the segment, or zero if not present.  */
  size_t dlehf_ehframe_size;

  /* Text and data base for the DWARF information in the segment.  */
  ElfW(Addr) dlehf_text_base;
  ElfW(Addr) dlehf_data_base;
};

/* Find the PT_GNU_EH_FRAME segment of the object which contains
   ADDRESS and write information about it to *RESULT.  Return -1 if
   nothing was found, or 0 on success.  (*RESULT can be written to on
   failure, too.)  */
int __dl_ehframe_find (ElfW(Addr) __address,
                       struct dl_ehframe_info *__result)
  __THROW __nonnull ((2)) __attribute_warn_unused_result__;

It is the responsibility of the glibc implementation of __dl_ehframe_find
to provide proper synchronization with the dynamic loader.  We can start
out with a lock-based implementation, as we have it today, and optimize
it later.
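
To make the lock-based starting point concrete, here is a minimal sketch
built on the existing dl_iterate_phdr, which already walks the loaded
objects under the loader's lock.  It is not the proposed glibc code:
sketch_ehframe_find, struct sketch_ehframe_info and the other sketch_*
names are made up for illustration, and the result structure is a
trimmed-down stand-in for struct dl_ehframe_info (no link map, no
text/data base).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <link.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct dl_ehframe_info.  */
struct sketch_ehframe_info
{
  const ElfW(Dyn) *dynamic;   /* Dynamic section, or NULL.  */
  const void *ehframe;        /* PT_GNU_EH_FRAME segment, or NULL.  */
  size_t ehframe_size;        /* Size of that segment, or zero.  */
};

/* State shared between the lookup function and its callback.  */
struct sketch_lookup
{
  ElfW(Addr) address;
  struct sketch_ehframe_info *result;
  int found;
};

/* Called by dl_iterate_phdr once per loaded object.  */
static int
sketch_callback (struct dl_phdr_info *info, size_t size, void *data)
{
  struct sketch_lookup *lookup = data;
  const ElfW(Phdr) *eh_frame = NULL;
  const ElfW(Phdr) *dynamic = NULL;
  int match = 0;

  (void) size;
  for (size_t i = 0; i < info->dlpi_phnum; ++i)
    {
      const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
      if (ph->p_type == PT_LOAD)
        {
          /* Does this loadable segment contain the address?  */
          ElfW(Addr) start = info->dlpi_addr + ph->p_vaddr;
          if (lookup->address >= start
              && lookup->address - start < ph->p_memsz)
            match = 1;
        }
      else if (ph->p_type == PT_GNU_EH_FRAME)
        eh_frame = ph;
      else if (ph->p_type == PT_DYNAMIC)
        dynamic = ph;
    }
  if (!match)
    return 0;                 /* Not this object, keep iterating.  */

  lookup->result->dynamic = NULL;
  lookup->result->ehframe = NULL;
  lookup->result->ehframe_size = 0;
  if (dynamic != NULL)
    lookup->result->dynamic
      = (const ElfW(Dyn) *) (info->dlpi_addr + dynamic->p_vaddr);
  if (eh_frame != NULL)
    {
      lookup->result->ehframe
        = (const void *) (info->dlpi_addr + eh_frame->p_vaddr);
      lookup->result->ehframe_size = eh_frame->p_memsz;
    }
  lookup->found = 1;
  return 1;                   /* Stop the iteration.  */
}

/* Return 0 and fill in *RESULT if an object contains ADDRESS,
   -1 otherwise.  */
int
sketch_ehframe_find (ElfW(Addr) address, struct sketch_ehframe_info *result)
{
  assert (result != NULL);
  struct sketch_lookup lookup = { address, result, 0 };
  dl_iterate_phdr (sketch_callback, &lookup);
  return lookup.found ? 0 : -1;
}
```

A caller passes a code address (for example a return address found
during unwinding) and checks the return value before looking at the
result.  Because the synchronization is hidden behind the function,
its internals could later be replaced without touching callers.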

Based on prior discussions, this works because unwinding with a corrupt
stack or a stack containing unmapped objects is already undefined today,
so the live stack keeps all pointers returned from __dl_ehframe_find
valid.

The cache as it exists today would be removed from libgcc, but we
probably want to add a small cache that avoids the need to call into
glibc while unwinding through the same object (in which case we probably
should add boundary information to struct dl_ehframe_info).
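
A rough sketch of what such a small cache could look like, assuming the
lookup result is extended with the object's address range.  All names
here (sketch_object_info, sketch_unwind_cache, sketch_cached_find) are
hypothetical, and sketch_fake_lookup merely stands in for the call into
glibc so that the example is self-contained.  The cache is scoped to a
single unwind operation, so it relies on the same argument as above: the
live stack keeps the cached object mapped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lookup result with boundary information added.  */
struct sketch_object_info
{
  uintptr_t start;        /* First address covered by the object.  */
  uintptr_t end;          /* One past the last covered address.  */
  const void *ehframe;    /* PT_GNU_EH_FRAME segment, or NULL.  */
};

/* One-entry cache; zero-initialize it at the start of each unwind.  */
struct sketch_unwind_cache
{
  bool valid;
  struct sketch_object_info info;
};

typedef int (*sketch_lookup_fn) (uintptr_t address,
                                 struct sketch_object_info *result);

/* Return 0 and fill in *RESULT, calling SLOW_LOOKUP only if ADDRESS is
   outside the object remembered in *CACHE.  Return -1 if not found.  */
int
sketch_cached_find (struct sketch_unwind_cache *cache, uintptr_t address,
                    sketch_lookup_fn slow_lookup,
                    struct sketch_object_info *result)
{
  assert (cache != NULL && result != NULL);
  if (cache->valid
      && address >= cache->info.start && address < cache->info.end)
    {
      *result = cache->info;  /* Same object as the previous frame.  */
      return 0;
    }
  if (slow_lookup (address, result) != 0)
    return -1;
  cache->info = *result;
  cache->valid = true;
  return 0;
}

/* Stand-in for the call into glibc: pretends that two objects are
   mapped at [0x1000, 0x2000) and [0x2000, 0x3000), and counts calls.  */
static int sketch_slow_calls;

static int
sketch_fake_lookup (uintptr_t address, struct sketch_object_info *result)
{
  ++sketch_slow_calls;
  if (address < 0x1000 || address >= 0x3000)
    return -1;
  result->start = address < 0x2000 ? 0x1000 : 0x2000;
  result->end = result->start + 0x1000;
  result->ehframe = NULL;
  return 0;
}
```

Consecutive frames in the same object then cost one range check each,
and only a frame in a different object triggers another slow lookup.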

The advantage of doing it this way is that it does not require
recompiling and relinking objects.

Thanks,
Florian

