public inbox for gcc-help@gcc.gnu.org
* Strange exception handling behaviour with dlopen()
From: Nemeth, Laszlo02 @ 2020-11-17 17:40 UTC
  To: gcc-help

Hello,

I have run into strange aborts caused by terminate(). They seem to happen either because an
exception is thrown from a function that is treated as "noexcept", or because no exception
handler is found. All of them are related to dlopen().

(I narrowed down this code from a bigger signal handling solution.)

1.
Let's start with a shared library, which throws a float during its static initialization phase:

// Compile with:
// g++ -shared -fPIC throw_in_static_init.cpp -o throw_in_static_init.so
#include <iostream>

static int throw314()
{
    std::cout << "throw314() called\n" << std::flush;
    throw 3.14f;
}

static int throwDuringInitialization = throw314();

2.
Load this library at run time with dlopen() and try to handle the exception:

// Compile with:
// g++ -Wl,-rpath,$PWD catch_load_exception_ABORTS_in_place.cpp -ldl
// ABORTS with terminate()
#include <iostream>
#include <dlfcn.h>

int main() {

    try {
        void* handle = dlopen("throw_in_static_init.so", RTLD_LAZY);
        std::cout << "Lib loading: " << (handle ? "successful" : "failed") << "\n";
    } catch (float f) {
        std::cout << "Exception caught in main function: " << f << std::endl;
    }

    std::cout << "main() returns\n";
    return 0;
}

Result:
throw314() called
terminate called after throwing an instance of 'float'
Aborted (core dumped)

I checked this code with the godbolt.org Compiler Explorer and found that if I comment out the line

std::cout << "Lib loading: " << (handle ? "successful" : "failed") << "\n";

then no exception handling code is generated.
So maybe dlopen() is treated as "noexcept", and because an exception is thrown through it anyway,
terminate() gets called?

Let's move on:

3.
// Compile with:
// g++ -Wl,-rpath,$PWD catch_load_exception_OK_via_function.cpp -ldl
// Works as expected.
#include <iostream>
#include <dlfcn.h>

void loadFailing()
{
    void* handle = dlopen("throw_in_static_init.so", RTLD_LAZY);
    std::cout << "Lib loading: " << (handle ? "successful" : "failed") << "\n";
}

int main() {

    try {
        loadFailing();
    } catch (float f) {
        std::cout << "Exception caught in main function: " << f << std::endl;
    }

    std::cout << "main() returns\n";
    return 0;
}

Result:
throw314() called
Exception caught in main function: 3.14
main() returns

WOW! It works now!

But why? How did dlopen() become a possible-throwing function now?

Let's move on and give it a small twist.

4.
// Compile with:
// g++ -Wl,-rpath,$PWD catch_load_exception_OK_in_place_w_guard_object.cpp -ldl
// Works as expected.
#include <iostream>
#include <dlfcn.h>

struct Guard
{
    Guard()
    {
        std::cout << "Guard() called\n";
    }
};

int main() {

    try {
        Guard g;
        
        void* handle = dlopen("throw_in_static_init.so", RTLD_LAZY);
        std::cout << "Lib loading: " << (handle ? "successful" : "failed") << "\n";
    } catch (float f) {
        std::cout << "Exception caught in main function: " << f << std::endl;
    }

    std::cout << "main() returns\n";
    return 0;
}

Result:
Guard() called
throw314() called
Exception caught in main function: 3.14
main() returns

Works again, though dlopen() was called in the "try" block, and not through a function.
So why?

5.
Let's add a local variable:

// Compile with:
// g++ -Wl,-rpath,$PWD catch_load_exception_ABORTS_in_place_w_guard_object_w_awkward.cpp -ldl
// ABORTS with terminate()
#include <iostream>
#include <dlfcn.h>
#include <unordered_map>

struct Guard
{
    Guard()
    {
        std::cout << "Guard() called\n";
    }
};

int main() {

    try {
        Guard g;
        
        std::unordered_map<int, int> awkward;
        
        void* handle = dlopen("throw_in_static_init.so", RTLD_LAZY);
        std::cout << "Lib loading: " << (handle ? "successful" : "failed") << "\n";
    } catch (float f) {
        std::cout << "Exception caught in main function: " << f << std::endl;
    }

    std::cout << "main() returns\n";
    return 0;
}

Result:
Guard() called
throw314() called
terminate called after throwing an instance of 'float'
Aborted (core dumped)

Fails again!

What did this "awkward" variable do to the code to make it fail now?
According to Compiler Explorer, exception handling code was generated for the try-catch block.

Can anybody please enlighten me, what is going on here?

Tested with:
gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
gcc (Ubuntu 8.4.0-1ubuntu1~18.04) 8.4.0

Any help appreciated!
Regards,
--
Leslie
Software Developer
Deep Machine Learning Realization (AMS ADAS BUD ENG RL)
Advanced Driver Assistance Systems, Autonomous Mobility and Safety
Budapest, Continental Hungary Ltd.


* Re: Strange exception handling behaviour with dlopen()
From: David Hagood @ 2020-11-17 18:56 UTC
  To: gcc-help


You are squarely into "undefined behavior" territory here:


> static int throw314()
> {
>      std::cout << "throw314() called\n" << std::flush;
>      throw 3.14f;
> }
>
> static int throwDuringInitialization = throw314();

You are throwing from a constructor. That's STRONGLY discouraged - it 
leads to undefined behavior, as you cannot tell how much of an object 
has been constructed, and thus how much can be safely torn down. It's 
far better to have a function to be called after the constructor has 
completed, and let that function throw if needed, so that you can be 
sure all objects are fully constructed (and thus can be destructed) 
before the exception can cause problems.


> int main() {
>
>      try {
>          void* handle = dlopen("throw_in_static_init.so", RTLD_LAZY);
>          std::cout << "Lib loading: " << (handle ? "successfull" : "failed") << "\n";
>      } catch (float f) {
>          std::cout << "Exception caught in main function: " << f << std::endl;
>      }

It's also a really bad idea to throw an exception from within a C 
function (dlopen). Ideally, any function with extern "C" linkage should 
avoid any C++ specific concepts, like exceptions. The compiler won't 
have done all the homework in dlopen for things like unwinding the stack 
in an exception, since exceptions don't exist in C.


If you need your shared library to do some initialization that 
potentially could fail, you should probably define a C++ function within 
the shared library that does the exception, and will be called after the 
dlopen completes successfully.


I suggest you get the Scott Meyers books "Effective C++" and "More 
Effective C++" and read them thoroughly.


* Re: Strange exception handling behaviour with dlopen()
From: Stefan Ring @ 2020-11-17 19:16 UTC
  To: gcc-help

On Tue, Nov 17, 2020 at 7:56 PM David Hagood via Gcc-help
<gcc-help@gcc.gnu.org> wrote:
>
> You are squarely into "undefined behavior" territory here:

I have to admit, I enjoyed reading the question and learning about the
associated surprises, though ;).

* Re: Strange exception handling behaviour with dlopen()
From: Jonathan Wakely @ 2020-11-17 19:17 UTC
  To: David Hagood; +Cc: gcc-help

On Tue, 17 Nov 2020 at 18:58, David Hagood via Gcc-help
<gcc-help@gcc.gnu.org> wrote:
>
> You are squarely into "undefined behavior" territory here:
>
>
> > static int throw314()
> > {
> >      std::cout << "throw314() called\n" << std::flush;
> >      throw 3.14f;
> > }
> >
> > static int throwDuringInitialization = throw314();
>
> You are throwing from a constructor. That's STRONGLY discouraged - it

No it isn't.

> leads to undefined behavior,

No it doesn't.

> as you cannot tell how much of an object
> has been constructed, and thus how much can be safely torn down. It's

The C++ language is quite clear. The base class and member subobjects
that have already been constructed will get destroyed in reverse order
of construction.

> far better to have a function to be called after the constructor has
> completed, and let that function throw if needed, so that you can be
> sure all objects are fully constructed (and thus can be destructed)
> before the exception can cause problems.

Two-stage initialization is strongly discouraged. It creates
partially-formed objects that can't be used. Far better to use
exceptions in a constructor to report failure. That's one of the main
reasons exceptions were added to C++ in the first place.

But that's irrelevant here, since the object being constructed is an int.


>
>
> > int main() {
> >
> >      try {
> >          void* handle = dlopen("throw_in_static_init.so", RTLD_LAZY);
> >          std::cout << "Lib loading: " << (handle ? "successfull" : "failed") << "\n";
> >      } catch (float f) {
> >          std::cout << "Exception caught in main function: " << f << std::endl;
> >      }
>
> It's also a really bad idea to throw an exception from within a C
> function (dlopen). Ideally, any function with extern "C" linkage should
> avoid any C++ specific concepts, like exceptions. The compiler won't
> have done all the homework in dlopen for things like unwinding the stack
> in an exception, since exceptions don't exist in C.

Nothing on the stack of dlopen has a destructor, so unwinding is a
no-op. GCC tries to ensure it works in general.


> If you need your shared library to do some initialization that
> potentially could fail, you should probably define a C++ function within
> the shared library that does the exception, and will be called after the
> dlopen completes successfully.
>
>
> I suggest you get the Scott Meyer books "Effective C++" and “More
> Effective C++” and read them thoroughly.

I don't think either of them covers dlopen, does it?

Item 10 in More Effective C++ specifically deals with what happens
when constructors throw, and doesn't say to avoid it.

* Re: Strange exception handling behaviour with dlopen()
From: David Hagood @ 2020-11-17 19:25 UTC
  To: Jonathan Wakely; +Cc: gcc-help


I'm sorry, you have the right to your opinion, but I disagree, and so do 
many of the people on the ISO standards committee for the language, as 
do many people doing safety critical and mission critical work. While 
the behavior of the compiler generated code is specified in the case of 
a throw in a constructor, the problem is that the compiler has no way to 
understand the user generated code in the destructor, and thus can take 
incorrect action. Several software architecture standards codify 
two-phase construction and destruction for precisely that reason.

* Re: Strange exception handling behaviour with dlopen()
From: Jonathan Wakely @ 2020-11-17 20:12 UTC
  To: David Hagood; +Cc: gcc-help

On Tue, 17 Nov 2020 at 19:25, David Hagood <david.hagood@gmail.com> wrote:
>
> I'm sorry, you have the right to your opinion, but I disagree, and so do
> many of the people on the ISO standards committee for the language, as

How many of them have you asked? You know you're talking to the chair
of the ISO standards committee's Library Working Group? Have you
checked what the C++ Core Guidelines say about this?
https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#Rnr-two-phase-init

> do many people doing safety critical and mission critical work. While
> the behavior of the compiler generated code is specified in the case of
> a throw in a constructor, the problem is that the compiler has no way to
> understand the user generated code in the destructor,

The destructor won't be run because the object hasn't been constructed
yet. I strongly suggest that you take your own advice and read More
Effective C++, specifically Item 10.

> and thus can take
> incorrect action. Several software architecture standards codify
> two-phase construction and destruction for precisely that reason.

* Re: Strange exception handling behaviour with dlopen()
From: Florian Weimer @ 2020-11-17 20:44 UTC
  To: Jonathan Wakely via Gcc-help

* Jonathan Wakely via Gcc-help:

> On Tue, 17 Nov 2020 at 18:58, David Hagood via Gcc-help
> <gcc-help@gcc.gnu.org> wrote:
>>
>> You are squarely into "undefined behavior" territory here:
>>
>>
>> > static int throw314()
>> > {
>> >      std::cout << "throw314() called\n" << std::flush;
>> >      throw 3.14f;
>> > }
>> >
>> > static int throwDuringInitialization = throw314();
>>
>> You are throwing from a constructor. That's STRONGLY discouraged - it
>
> No it isn't.
>
>> leads to undefined behavior,
>
> No it doesn't.

The code throws from an *ELF* constructor (although it uses a C++
constructor to achieve this).  This is currently not supported by the
glibc implementation.

It's possible to correct this.  A really clean solution would require
that we move the unwinder into glibc.  Maybe it is possible to get the
desired effect by using the unwinder that was just loaded to catch the
exception, but that seems rather tricky.

* Re: Strange exception handling behaviour with dlopen()
From: Florian Weimer @ 2020-11-22 17:32 UTC
  To: Nemeth, Laszlo02; +Cc: gcc-help

* Laszlo Nemeth:

>>The code throws from an *ELF* constructor (although it uses a C++
>>constructor to achieve this).  This is currently not supported by the
>>glibc implementation.
>
> Can you please Florian elaborate what this "not supported" means?

dlopen is marked as not throwing.  This is easy enough to change.  But
the dynamic linker does not deallocate resources (including locks) if
an ELF constructor throws an exception.  And that is difficult to
correct.

>>It's possible to correct this.  A really clean solution would require
>>that we move the unwinder into glibc.  Maybe it is possible to get the
>>desired effect by using the unwinder that was just loaded to catch the
>>exception, but that seems rather tricky.
>
> Florian, can you please tell what that means?
>
> Is there any reliable way to resolve this with the current gcc and glibc ?

No, it requires glibc changes, and possibly libgcc_s changes.

* Strange exception handling behaviour with dlopen()
From: Nemeth, Laszlo02 @ 2020-11-18 11:54 UTC
  To: gcc-help

> On Tue Nov 17 2020 at 20:44, Florian Weimer via Gcc-help <gcc-help@gcc.gnu.org> wrote:
>
>* Jonathan Wakely via Gcc-help:
>
>> On Tue, 17 Nov 2020 at 18:58, David Hagood via Gcc-help
>> <gcc-help@gcc.gnu.org> wrote:
>>>
>>> You are squarely into "undefined behavior" territory here:
>>>
>>>
>>> > static int throw314()
>>> > {
>>> >      std::cout << "throw314() called\n" << std::flush;
>>> >      throw 3.14f;
>>> > }
>>> >
>>> > static int throwDuringInitialization = throw314();
>>>
>>> You are throwing from a constructor. That's STRONGLY discouraged - it
>>
>> No it isn't.
>>
>>> leads to undefined behavior,
>>
>> No it doesn't.

Thanks Jonathan for stepping in and clarifying this :)
David, of course Jonathan is right: throwing an exception is basically the only way to report
an error from a ctor (besides logging). The standard clearly supports and describes this.

This wasn't the issue here.

>
>The code throws from an *ELF* constructor (although it uses a C++
>constructor to achieve this).  This is currently not supported by the
>glibc implementation.
>

Florian, can you please elaborate on what "not supported" means here?
In some of the cases I presented, it works. Why?

My examples are only a bare-bones version of a larger error handling solution (an attempt, at least),
which tries to handle hardware exceptions by "translating" signals into C++ exceptions.

Others have come up with similar solutions before; it's not my idea. See e.g.:
https://www.deadalnix.me/2012/03/24/get-an-exception-from-a-segfault-on-linux-x86-and-x86_64-using-some-black-magic/

One of the points at which such a signal can be raised is the static initialization phase.
E.g. a division by zero or a segmentation fault may occur in the static initialization code of a plugin library.

This example draft is closer to what I actually do:

// Plugin library example of an erroneous code, compile with:
// g++ -shared -fPIC raise_SIGFPE_in_static_init.cpp -o raise_SIGFPE_in_static_init.so
#include <iostream>

static struct DivisionByZeroInCtor
{
  DivisionByZeroInCtor()
  {
    std::cout << "DivisionByZeroInCtor() called\n" << std::flush;
    int five = 5, zero = 0;
    int crash = five / zero;
  }
} divisionByZeroDuringInitialization;

// Main program, compile with:
// g++ -fnon-call-exceptions -Wl,-rpath,$PWD signal_translator_OK_in_place_load_w_signal_guard.cpp -ldl
// Works as expected.

#include <stdexcept>
#include <iostream>
#include <signal.h>
#include <dlfcn.h>

// Translate a signal into an exception
void signalTranslator(int sig)
{
    std::cout << "signalTranslator() called with signal=" << sig << "\n" << std::flush;
    throw std::runtime_error("Exception thrown from Signal Translator");
}

class SignalGuard
{
public:
    SignalGuard() {
        struct sigaction recovery_action;
        recovery_action.sa_handler = &signalTranslator;
        sigemptyset(&recovery_action.sa_mask);
        recovery_action.sa_flags = SA_NODEFER;
        sigaction(SIGFPE, &recovery_action, NULL);
    }
};

int main() {

    try {
        SignalGuard g;

        void* handle = dlopen("raise_SIGFPE_in_static_init.so", RTLD_LAZY);
        std::cout << "Lib loading: " << (handle ? "successful" : "failed") << "\n";

    } catch (std::exception &e) {
        std::cout << "Exception caught in main function: " << e.what() << std::endl;
    }

    std::cout << "main() returns\n";
    return 0;
}
// This code actually works as expected.

This "signal translator" scheme basically works in every other case for us, except in this
problematic case of dlopen() loading a plugin library that raises a signal during its static
init phase.

The crucial point of this solution is that it assumes any C function can throw, since any
code can raise a signal, which is then translated into a C++ exception. So I assume that the
compiler generates proper code for this situation.

We have an equivalent, properly working solution on Windows (not with POSIX signals). This is
possible (partly) because MSVC supports assuming that C code may throw.
See https://docs.microsoft.com/en-us/cpp/build/reference/eh-exception-handling-model?view=msvc-160#arguments

It would be good to have a working solution with Linux/gcc, too...

>It's possible to correct this.  A really clean solution would require
>that we move the unwinder into glibc.  Maybe it is possible to get the
>desired effect by using the unwinder that was just loaded to catch the
>exception, but that seems rather tricky.

Florian, can you please explain what that means?

Is there any reliable way to resolve this with the current gcc and glibc?

Am I right that if gcc could assume that certain (or any) C functions can throw, it would be able to generate correct unwinding code?

Regards
--
László
ADAS Software Developer, Budapest, Hungary, +36-20-2230357
Less code - less problem, no code - no problem

(Sorry for being out-of-sync with the first mail-thread, I subscribed to the mail-list too late...)

