public inbox for glibc-bugs@sourceware.org
From: "pieterjan.briers at gmail dot com" <sourceware-bugzilla@sourceware.org>
To: glibc-bugs@sourceware.org
Subject: [Bug dynamic-link/28937] New: New DSO dependency resolver buggyness with dlclose()
Date: Thu, 03 Mar 2022 12:48:15 +0000
Message-ID: <bug-28937-131@http.sourceware.org/bugzilla/>

https://sourceware.org/bugzilla/show_bug.cgi?id=28937

            Bug ID: 28937
           Summary: New DSO dependency resolver buggyness with dlclose()
           Product: glibc
           Version: 2.35
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: dynamic-link
          Assignee: unassigned at sourceware dot org
          Reporter: pieterjan.briers at gmail dot com
  Target Milestone: ---

Created attachment 14001
  --> https://sourceware.org/bugzilla/attachment.cgi?id=14001&action=edit
Both log files, since it seems I can't attach multiple files on bugzilla?

(I previously reported this on the Arch Linux bug tracker and it was decided to
be an upstream issue, https://bugs.archlinux.org/task/73967)

Description:

The new dependency resolver for dynamic library loading seems to cause some
sort of internal corruption/bugginess when combined with dlclose(). After a lot
of debugging, I traced some bug reports[1] we were getting in our game to it,
and we then found in retrospect that a second problem one of our maintainers
had been dealing with has the same cause. These reports all come from Arch
Linux (I think) because it already ships the latest glibc version.

The first problem we had is that dlclose()ing the result of dlopen(NULL) seemed
to cause specifically libbz2.so to erroneously get unloaded (in LD_DEBUG=libs).
This then caused future dynamic loading of other libs (specifically GTK) which
needed libbz2.so to break, and that caused the initial bug reports.

The full sequence of events, for context, can be best described with the
following C code and comments. Note that to trigger this I compiled it with
fluidsynth/ALSA/PortAudio available on an Arch Linux system, so I'm not sure
this will be easy to reproduce (I failed to produce a more minimal repro).
Compile instructions are just gcc main.c && ./a.out.

#include <dlfcn.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    void *lib;

    // Early on, we load freetype, which has a direct dependency on bzip2.
    lib = dlopen("libfreetype.so.6", RTLD_NOW);
    printf("#### freetype: %p\n", lib);

    // Then, much later (when the user opens the instrument menu), we load
    // fluidsynth.
    lib = dlopen("libfluidsynth.so.3", RTLD_NOW);
    printf("#### fluidsynth: %p\n", lib);

    // Using dlsym so we don't link directly against fluidsynth and it's
    // dynamically loaded.
    void *(*new_fluid_settings_fp)(void) = dlsym(lib, "new_fluid_settings");
    if (new_fluid_settings_fp == NULL) {
        // Bail out instead of calling through a NULL pointer.
        fprintf(stderr, "#### new_fluid_settings not found: %s\n", dlerror());
        return 1;
    }

    // And initialize fluidsynth by calling new_fluid_settings()
    printf("#### Initializing fluidsynth!\n");
    void *settings = new_fluid_settings_fp();
    (void)settings; // Unused; only the initialization side effects matter.

    // Fluidsynth will go through all its audio drivers, which ends up
    // test-initializing all its audio drivers (there's a lot), one of which
    // being ALSA (through PortAudio when I was debugging this, didn't test if
    // it happens directly to ALSA).
    // alsa-libs/libasound.so ends up doing many sequences of dlopen(NULL) ->
    // dlsym() -> dlclose() at conf.c line 4017[2] running config hooks. This
    // dlclose call, for some reason, results in libbz2.so getting erroneously
    // fini'd.

    printf("#### Initialized fluidsynth!\n");

    // GTK3 will fail to load (be NULL) if loaded after fluidsynth initialized,
    // because libbz2.so was erroneously unloaded earlier.
    // LD_DEBUG reports being unable to find the raw symbols provided by BZ2
    // (e.g. BZ2_hbMakeCodeLengths) which seems to imply it's only partially
    // corrupted?
    lib = dlopen("libgtk-3.so", RTLD_NOW);
    printf("#### GTK3: %p\n", lib);

    return 0;
}

When I run this on my system right now (glibc 2.35), GTK will fail to be
loaded, and LD_DEBUG=libs output shows libbz2.so getting unloaded during
FluidSynth initialization. If I then run it with
GLIBC_TUNABLES=glibc.rtld.dynamic_sort=1, this does not happen and GTK loads
fine.

I have attached the log of me running this with LD_DEBUG=libs as mine.log. I
understand there are tons of variables and distro-specific things at play that
make this hard to reproduce reliably, but I've seen it several times on Arch at
least (my personal testing + 2 bug reports on our engine repo).

The second problem appears to be an explicit hard assert error: "Inconsistency
detected by ld.so: dl-close.c: 277: _dl_close_worker: Assertion `imap->l_type
== lt_loaded && !imap->l_nodelete_active' failed!". One of our maintainers
started getting this "recently" (probably when Arch pushed the 2.35 glibc
package) and it caused her game to fail to start most of the time (not 100%
consistent). After I discovered the first problem, she tried
GLIBC_TUNABLES=glibc.rtld.dynamic_sort=1 and this problem also went away
completely. Attached is her LD_DEBUG=libs log as vera.log in case it's useful.
There is quite a lot happening in it, though, because the game is .NET and it
has to load graphics drivers and such.

[1]: https://github.com/space-wizards/RobustToolbox/issues/2563
[2]: https://github.com/alsa-project/alsa-lib/blob/1454b5f118a3b92663923fe105daecfeb7e20f1b/src/conf.c#L3998-L4020


Thread overview: 16+ messages
2022-03-03 12:48 pieterjan.briers at gmail dot com [this message]
2022-03-03 15:08 ` [Bug dynamic-link/28937] " fweimer at redhat dot com
2022-03-03 15:27 ` fweimer at redhat dot com
2022-03-03 22:55 ` pieterjan.briers at gmail dot com
2022-03-03 23:25 ` pieterjan.briers at gmail dot com
2022-03-04  1:07 ` pieterjan.briers at gmail dot com
2022-03-11 19:24 ` freswa at archlinux dot org
2022-04-11 23:19 ` woodard at redhat dot com
2022-04-12 10:29 ` fweimer at redhat dot com
2022-08-08 10:38 ` fweimer at redhat dot com
2022-08-08 19:59 ` sam at gentoo dot org
2022-08-14 20:06 ` pieterjan.briers at gmail dot com
2022-08-15  9:23 ` [Bug dynamic-link/28937] New DSO dependency sorter does not put new map first if in a cycle fweimer at redhat dot com
2022-09-20  9:07 ` fweimer at redhat dot com
2022-09-21  9:17 ` fweimer at redhat dot com
2023-08-22 10:49 ` fweimer at redhat dot com
