public inbox for libc-help@sourceware.org
* dlopen: Segfault due to overwriting .so file after it was loaded and loading it again
@ 2020-12-04 10:39 Wendeborn, Jonathan
  2020-12-04 10:44 ` AW: " Wendeborn, Jonathan
  0 siblings, 1 reply; 6+ messages in thread
From: Wendeborn, Jonathan @ 2020-12-04 10:39 UTC (permalink / raw)
  To: libc-help

Hi,

I am a C++ developer, but I usually program and debug on Windows (so please excuse any wrong terms). I am now compiling my program on Linux (gcc 9.3.0 on Debian Bullseye with Boost 1.70) for the first time and get a segfault in my unit tests.
Luckily I was able to write a reproducer and boil it down to my code overwriting the .so file after it had been loaded (and unloaded):

#include <boost/filesystem/operations.hpp>
#include <boost/dll/shared_library.hpp>
#include <iostream>

void doit() {
    boost::filesystem::copy_file("~/project/target/references/bin/libSomething.so", "~/project/build/bin/linux-x86_64-gcc9-debug/libSomething.so", boost::filesystem::copy_option::overwrite_if_exists);

    boost::dll::shared_library l;
    std::cout << "pre load" << std::endl;
    l.load("./libSomething.so");
    std::cout << "loaded" << std::endl;
}
int main() {
    doit();
    doit();
    return 0;
}

Output:
pre load
loaded
pre load
loaded
Segmentation fault

When removing the copy_file() call everything is fine. The destructor ~shared_library() calls dlclose(), but I suspect the library stays loaded. Overwriting the file creates a new file node and my program wants to load the same library again (at the same location but with a different file node/handle).
This works on Windows because the library is really unloaded after ~shared_library() (otherwise copy_file() would fail as Windows does not support overwriting files in use anyway).
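
One way to check the suspicion that the library stays loaded after dlclose() is to ask the dynamic loader directly. The following is only a minimal sketch, assuming the RTLD_NOLOAD flag that glibc's dlopen() provides; the helper name is_still_loaded is hypothetical and is not part of Boost.DLL or of the reproducer above:

```cpp
#include <dlfcn.h>

// Hypothetical helper (an assumption for illustration, not a Boost.DLL API):
// reports whether an object with this name is currently mapped into the process.
bool is_still_loaded(const char* name) {
    // RTLD_NOLOAD tells the loader not to load anything new: dlopen()
    // returns a handle only if the object is already resident.
    void* handle = dlopen(name, RTLD_LAZY | RTLD_NOLOAD);
    if (handle == nullptr) {
        return false;
    }
    // The successful lookup took a reference of its own; drop it again.
    dlclose(handle);
    return true;
}
```

Calling is_still_loaded() with the same path that was passed to load(), after the shared_library object has gone out of scope, would show whether dlclose() really unmapped the object. Depending on the glibc version, linking may require -ldl.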

I debugged into dlopen() and think the error becomes visible in dl_lookup_x(): the second time around, the strtab and symtab pointers are not valid there. They still hold the quite small values from the beginning of elf_get_dynamic_info() (l.51); the l_addr offset that the second part of elf_get_dynamic_info() adds (l.104) was never applied.
I am going to rewrite my tests anyway (I will not copy the files at all anymore), but I thought this could be of interest to you.

Best regards,
Jonathan


^ permalink raw reply	[flat|nested] 6+ messages in thread

* AW: dlopen: Segfault due to overwriting .so file after it was loaded and loading it again
  2020-12-04 10:39 dlopen: Segfault due to overwriting .so file after it was loaded and loading it again Wendeborn, Jonathan
@ 2020-12-04 10:44 ` Wendeborn, Jonathan
  0 siblings, 0 replies; 6+ messages in thread
From: Wendeborn, Jonathan @ 2020-12-04 10:44 UTC (permalink / raw)
  To: libc-help

Sorry, it seems this mail sat in the outbox on my second PC for a while. Please ignore it.



* Re: AW: dlopen: Segfault due to overwriting .so file after it was loaded and loading it again
  2020-11-20 12:45   ` AW: " Wendeborn, Jonathan
@ 2020-11-20 12:48     ` Florian Weimer
  0 siblings, 0 replies; 6+ messages in thread
From: Florian Weimer @ 2020-11-20 12:48 UTC (permalink / raw)
  To: Wendeborn, Jonathan
  Cc: Wendeborn, Jonathan via Libc-help, Konstantin Kharlamov

* Jonathan Wendeborn:

> Thank you, that sounds reasonable. Indeed the library is marked
> NODELETE due to unique symbols. Is it forbidden in general to write
> executables which are loaded?

You need to follow a specific sequence: write the new version to a
temporary file, and after it has been written completely, rename that
version over the previous version.  This way, the old file is not
truncated, and mapped versions of it continue to work.

It won't address the dependency on a unique symbol, though, that's
unrelated.  It's not possible for now to unload such objects.


* AW: dlopen: Segfault due to overwriting .so file after it was loaded and loading it again
  2020-11-20 11:33 ` Florian Weimer
@ 2020-11-20 12:45   ` Wendeborn, Jonathan
  2020-11-20 12:48     ` Florian Weimer
  0 siblings, 1 reply; 6+ messages in thread
From: Wendeborn, Jonathan @ 2020-11-20 12:45 UTC (permalink / raw)
  To: Florian Weimer, Wendeborn, Jonathan via Libc-help, Konstantin Kharlamov

[-- Attachment #1: Type: text/plain, Size: 1422 bytes --]

Hi Florian,

Thank you, that sounds reasonable. Indeed the library is marked NODELETE due to unique symbols. Is it forbidden in general to write executables which are loaded?

I attached the outputs of strace and LD_DEBUG=all if you're interested.

Thanks!
Jonathan 

-----Original Message-----
From: Florian Weimer <fw@deneb.enyo.de>
Sent: Friday, 20 November 2020 12:33
To: Wendeborn, Jonathan via Libc-help <libc-help@sourceware.org>
Cc: Wendeborn, Jonathan <Jonathan.Wendeborn@bruker.com>
Subject: Re: dlopen: Segfault due to overwriting .so file after it was loaded and loading it again

* Wendeborn, Jonathan via Libc-help:

> The destructor ~shared_library() calls dlclose(), but I suspect the 
> library stays loaded. Overwriting the file creates a new file node and 
> my program wants to load the same library again (at the same location 
> but with a different file node/handle).

This crash happens if the file is truncated on disk and rewritten.
All mapped data is reset to zero if that happens and relocations are gone.
The kernel does that; there is nothing that glibc can do about it (Linux does not support MAP_COPY).

So you have two issues here: The library stays loaded (maybe it is marked as NODELETE; LD_DEBUG=all output logs that in recent glibc versions), and the file is rewritten in place (no “new file node” is created).
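
The difference between rewriting in place and creating a new file node can be observed through the inode number. The following is a small sketch, assuming a POSIX system and C++17; all three helper names are hypothetical and exist only for this illustration:

```cpp
#include <sys/stat.h>

#include <filesystem>
#include <fstream>
#include <stdexcept>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: inode number that the path currently refers to.
ino_t inode_of(const fs::path& p) {
    struct stat st {};
    if (::stat(p.c_str(), &st) != 0) {
        throw std::runtime_error("stat failed for " + p.string());
    }
    return st.st_ino;
}

// In-place rewrite: truncates the existing file and writes the new bytes
// into the same inode, so existing mappings of the file see the change.
void overwrite_in_place(const fs::path& p, const std::string& data) {
    std::ofstream out(p, std::ios::binary | std::ios::trunc);
    out << data;
}

// Replacement by rename: the new bytes go into a fresh inode, which then
// takes over the name. The ".tmp" suffix is an arbitrary choice.
void replace_via_rename(const fs::path& p, const std::string& data) {
    fs::path tmp = p;
    tmp += ".tmp";
    {
        std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
        out << data;
    }  // stream is flushed and closed here, before the rename
    fs::rename(tmp, p);
}
```

Comparing inode_of() before and after each operation shows the point: overwrite_in_place() leaves the inode number unchanged, while replace_via_rename() makes the path refer to a different inode.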

[-- Attachment #2: output.tar.gz --]
[-- Type: application/x-gzip, Size: 253917 bytes --]


* Re: AW: dlopen: Segfault due to overwriting .so file after it was loaded and loading it again
  2020-11-20  8:01   ` AW: " Wendeborn, Jonathan
@ 2020-11-20  8:15     ` Konstantin Kharlamov
  0 siblings, 0 replies; 6+ messages in thread
From: Konstantin Kharlamov @ 2020-11-20  8:15 UTC (permalink / raw)
  To: Wendeborn, Jonathan, libc-help

On Fri, 2020-11-20 at 08:01 +0000, Wendeborn, Jonathan wrote:
> Hi,
> 
> Thank you for your quick answer!
> I don't have Boost installed globally, so I had to adjust the command:
> g++9 -g3 -O0 -Wall -Wextra -Wsign-conversion -std=c++17 -fsanitize=address
> -o test2 test.cpp -I/home/Jonathan.Wendeborn/.boost/1.70/include/
> -L/home/Jonathan.Wendeborn/.boost/1.70/bin/boost/linux-x86_64-gcc9-debug
> -Wl,-Bstatic -lboost_filesystem -lboost_system -Wl,-Bdynamic -ldl
> 
> This is the output:
> ./test2
> pre load
> loaded
> pre load
> loaded
> AddressSanitizer:DEADLYSIGNAL
> =================================================================
> ==2872455==ERROR: AddressSanitizer: SEGV on unknown address 0x000000657726 (pc
> 0x000000657726 bp 0x000000000000 sp 0x7ffc91e7d0a8 T0)
> ==2872455==The signal is caused by a READ memory access.
> AddressSanitizer:DEADLYSIGNAL
> AddressSanitizer: nested bug in the same thread, aborting.
> 
> I didn't test my program with a different .so before, so I copied
> libboost_regex.so to libSomething.so and get a Segfault, too:
> AddressSanitizer:DEADLYSIGNAL
> =================================================================
> ==2872492==ERROR: AddressSanitizer: SEGV on unknown address 0x000000019bd0 (pc
> 0x000000019bd0 bp 0x7ffe22f7b9f0 sp 0x7ffe22f7b938 T0)
> ==2872492==The signal is caused by a READ memory access.

Hmm, it doesn't crash with libboost_regex.so for me either… Okay, could you please
provide an `strace` of the testcase when it crashes? Hopefully it will shed
some light on what's going on.



* AW: dlopen: Segfault due to overwriting .so file after it was loaded and loading it again
  2020-11-20  7:17 ` Konstantin Kharlamov
@ 2020-11-20  8:01   ` Wendeborn, Jonathan
  2020-11-20  8:15     ` Konstantin Kharlamov
  0 siblings, 1 reply; 6+ messages in thread
From: Wendeborn, Jonathan @ 2020-11-20  8:01 UTC (permalink / raw)
  To: Konstantin Kharlamov, libc-help

Hi,

Thank you for your quick answer!
I don't have Boost installed globally, so I had to adjust the command:
g++9 -g3 -O0 -Wall -Wextra -Wsign-conversion -std=c++17 -fsanitize=address  -o test2 test.cpp  -I/home/Jonathan.Wendeborn/.boost/1.70/include/  -L/home/Jonathan.Wendeborn/.boost/1.70/bin/boost/linux-x86_64-gcc9-debug  -Wl,-Bstatic -lboost_filesystem -lboost_system  -Wl,-Bdynamic -ldl

This is the output:
./test2
pre load
loaded
pre load
loaded
AddressSanitizer:DEADLYSIGNAL
=================================================================
==2872455==ERROR: AddressSanitizer: SEGV on unknown address 0x000000657726 (pc 0x000000657726 bp 0x000000000000 sp 0x7ffc91e7d0a8 T0)
==2872455==The signal is caused by a READ memory access.
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer: nested bug in the same thread, aborting.

I didn't test my program with a different .so before, so I copied libboost_regex.so to libSomething.so and get a Segfault, too:
AddressSanitizer:DEADLYSIGNAL
=================================================================
==2872492==ERROR: AddressSanitizer: SEGV on unknown address 0x000000019bd0 (pc 0x000000019bd0 bp 0x7ffe22f7b9f0 sp 0x7ffe22f7b938 T0)
==2872492==The signal is caused by a READ memory access.

It doesn't crash when doing this with libboost_system.so, though.

Best regards,
Jonathan


-----Original Message-----
From: Konstantin Kharlamov <hi-angel@yandex.ru>
Sent: Friday, 20 November 2020 08:18
To: Wendeborn, Jonathan <Jonathan.Wendeborn@bruker.com>; libc-help@sourceware.org
Subject: Re: dlopen: Segfault due to overwriting .so file after it was loaded and loading it again

On Fri, 2020-11-20 at 06:52 +0000, Wendeborn, Jonathan via Libc-help wrote:
> Hi,
>
> I am a C++ developer but usually programming and debugging on Windows
> (so please excuse any wrong terms). Now I'm compiling my program on
> Linux (gcc 9.3.0 on Debian Bullseye with Boost 1.70) for the first time
> and get a Segfault in my unit tests.
> Luckily I was able to write a reproducer and boil it down to my code
> overwriting the .so file after having it loaded (and unloaded):

I can't seem to reproduce it. I modified the paths in your testcase as follows:

    #include <boost/filesystem/operations.hpp>
    #include <boost/dll/shared_library.hpp>
    #include <iostream>

    void doit() {
        boost::filesystem::copy_file("/tmp/libSomething.so", "/tmp/libSomething2.so", boost::filesystem::copy_option::overwrite_if_exists);

        boost::dll::shared_library l;
        std::cout << "pre load" << std::endl;
        l.load("/tmp/libSomething2.so");
        std::cout << "loaded" << std::endl;
    }
    int main() {
        doit();
        doit();
        return 0;
    }

And I built it with

    g++ test.cpp -o a -g3 -O0 -Wall -Wextra -Wsign-conversion -std=c++17 -fsanitize=address -ldl -lboost_filesystem -lboost_system

Running it I get no segfault, just output:

    λ ./a
    pre load
    loaded
    pre load
    loaded

Please try placing the lib at `/tmp/libSomething.so` and running the app. Do you still see the crash?




end of thread, other threads:[~2020-12-04 10:44 UTC | newest]

Thread overview: 6+ messages
2020-12-04 10:39 dlopen: Segfault due to overwriting .so file after it was loaded and loading it again Wendeborn, Jonathan
2020-12-04 10:44 ` AW: " Wendeborn, Jonathan
  -- strict thread matches above, loose matches on Subject: below --
2020-11-20  6:52 Wendeborn, Jonathan
2020-11-20  7:17 ` Konstantin Kharlamov
2020-11-20  8:01   ` AW: " Wendeborn, Jonathan
2020-11-20  8:15     ` Konstantin Kharlamov
2020-11-20 11:33 ` Florian Weimer
2020-11-20 12:45   ` AW: " Wendeborn, Jonathan
2020-11-20 12:48     ` Florian Weimer
