public inbox for glibc-cvs@sourceware.org
help / color / mirror / Atom feed
* [glibc] Detect ld.so and libc.so version inconsistency during startup
@ 2022-08-24 15:38 Florian Weimer
  0 siblings, 0 replies; only message in thread
From: Florian Weimer @ 2022-08-24 15:38 UTC (permalink / raw)
  To: glibc-cvs

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=6f85dbf102ad7982409ba0fe96886caeb6389fef

commit 6f85dbf102ad7982409ba0fe96886caeb6389fef
Author: Florian Weimer <fweimer@redhat.com>
Date:   Wed Aug 24 17:35:36 2022 +0200

    Detect ld.so and libc.so version inconsistency during startup
    
    The files NEWS, include/link.h, and sysdeps/generic/ldsodefs.h
    contribute to the version fingerprint used for detection.  The
    fingerprint can be further refined using the --with-extra-version-id
    configure argument.
    
    _dl_call_libc_early_init is replaced with _dl_lookup_libc_early_init.
    The new function is used store a pointer to libc.so's
    __libc_early_init function in the libc_map_early_init member of the
    ld.so namespace structure.  This function pointer can then be called
    directly, so the separate invocation function is no longer needed.
    
    The versioned symbol lookup needs the symbol versioning data
    structures, so the initialization of libc_map and libc_map_early_init
    is now done from _dl_check_map_versions, after this information
    becomes available.  (_dl_map_object_from_fd does not set this up
    in time, so the initialization code had to be moved from there.)
    This means that the separate initialization code can be removed from
    dl_main because _dl_check_map_versions covers all maps, including
    the initial executable loaded by the kernel.  The lookup still happens
    before relocation and the invocation of IFUNC resolvers, so IFUNC
    resolvers are protected from ABI mismatch.
    
    The __libc_early_init function pointer is not protected because
    so little code runs between the pointer write and the invocation
    (only dynamic linker code and IFUNC resolvers).
    
    Reviewed-by: Carlos O'Donell <carlos@redhat.com>

Diff:
---
 INSTALL                                            | 10 +++
 Makerules                                          | 14 ++++
 NEWS                                               |  7 +-
 config.make.in                                     |  1 +
 configure                                          | 11 +++
 configure.ac                                       |  5 ++
 elf/Makefile                                       |  2 +-
 elf/Versions                                       |  4 +-
 elf/dl-load.c                                      |  9 ---
 ...bc-early-init.c => dl-lookup_libc_early_init.c} | 23 +++---
 elf/dl-open.c                                      |  4 +-
 elf/dl-version.c                                   | 18 +++++
 elf/libc-early-init.h                              | 21 +++--
 elf/rtld.c                                         | 12 +--
 manual/install.texi                                |  9 +++
 scripts/libc_early_init_name.py                    | 89 ++++++++++++++++++++++
 sysdeps/generic/ldsodefs.h                         |  4 +
 17 files changed, 198 insertions(+), 45 deletions(-)

diff --git a/INSTALL b/INSTALL
index 659f75a97f..6470cd3d25 100644
--- a/INSTALL
+++ b/INSTALL
@@ -120,6 +120,16 @@ if 'CFLAGS' is specified it must enable optimization.  For example:
      compiler flags which target a later instruction set architecture
      (ISA).
 
+'--with-extra-version-id=STRING'
+     Use STRING as part of the fingerprint that is used by the dynamic
+     linker to detect an incompatible version of 'libc.so'.  For
+     example, STRING could be the full package version and release
+     string used by a distribution build of the GNU C Library.  This
+     way, concurrent process creation during a package update will fail
+     with an error message, _error while loading shared libraries:
+     /lib64/libc.so.6: ld.so/libc.so mismatch detected (upgrade in
+     progress?)_, rather than crashing mysteriously.
+
 '--with-timeoutfactor=NUM'
      Specify an integer NUM to scale the timeout of test programs.  This
      factor can be changed at run time using 'TIMEOUTFACTOR' environment
diff --git a/Makerules b/Makerules
index d1e139d03c..756c1f181c 100644
--- a/Makerules
+++ b/Makerules
@@ -112,6 +112,20 @@ before-compile := $(common-objpfx)first-versions.h \
 		  $(common-objpfx)ldbl-compat-choose.h $(before-compile)
 $(common-objpfx)first-versions.h: $(common-objpfx)versions.stmp
 $(common-objpfx)ldbl-compat-choose.h: $(common-objpfx)versions.stmp
+
+# libc_early_init_name.h provides the actual name of the
+# __libc_early_init function.  It is used as a protocol version marker
+# between ld.so and libc.so
+before-compile := $(common-objpfx)libc_early_init_name.h $(before-compile)
+libc_early_init_name-deps = \
+  $(..)NEWS $(..)sysdeps/generic/ldsodefs.h $(..)include/link.h
+$(common-objpfx)libc_early_init_name.h: $(..)scripts/libc_early_init_name.py \
+  $(common-objpfx)config.make $(libc_early_init_name-deps)
+	$(PYTHON) $(..)scripts/libc_early_init_name.py \
+	  --output=$@T \
+	  --extra-version-id="$(extra-version-id)" \
+	  $(libc_early_init_name-deps)
+	$(move-if-change) $@T $@
 endif # avoid-generated
 endif # $(build-shared) = yes
 
diff --git a/NEWS b/NEWS
index f9bef48a8f..9d3c8c5ed8 100644
--- a/NEWS
+++ b/NEWS
@@ -9,7 +9,12 @@ Version 2.37
 
 Major new features:
 
-  [Add new features here]
+* The dynamic loader now prints an error message, "ld.so/libc.so
+  mismatch detected (upgrade in progress?)" if it detects that the
+  version of libc.so it loaded comes from a different build of glibc.
+  The new configure option --with-extra-version-id can be used to
+  specify an arbitrary string that affects the computation of the
+  version fingerprint.
 
 Deprecated and removed features, and other changes affecting compatibility:
 
diff --git a/config.make.in b/config.make.in
index d7c416cbea..ecaffbfd4b 100644
--- a/config.make.in
+++ b/config.make.in
@@ -98,6 +98,7 @@ build-hardcoded-path-in-tests= @hardcoded_path_in_tests@
 build-pt-chown = @build_pt_chown@
 have-tunables = @have_tunables@
 pthread-in-libc = @pthread_in_libc@
+extra-version-id = @extra_version_id@
 
 # Build tools.
 CC = @CC@
diff --git a/configure b/configure
index ff2c406b3b..c576f9f133 100755
--- a/configure
+++ b/configure
@@ -760,6 +760,7 @@ with_headers
 with_default_link
 with_nonshared_cflags
 with_rtld_early_cflags
+with_extra_version_id
 with_timeoutfactor
 enable_sanity_checks
 enable_shared
@@ -1481,6 +1482,9 @@ Optional Packages:
                           build nonshared libraries with additional CFLAGS
   --with-rtld-early-cflags=CFLAGS
                           build early initialization with additional CFLAGS
+  --extra-version-id=STRING
+                          specify an extra version string to use in internal
+                          ABI checks
   --with-timeoutfactor=NUM
                           specify an integer to scale the timeout
   --with-cpu=CPU          select code for CPU variant
@@ -3397,6 +3401,13 @@ fi
 
 
 
+# Check whether --with-extra-version-id was given.
+if test "${with_extra_version_id+set}" = set; then :
+  withval=$with_extra_version_id; extra_version_id="$withval"
+fi
+
+
+
 # Check whether --with-timeoutfactor was given.
 if test "${with_timeoutfactor+set}" = set; then :
   withval=$with_timeoutfactor; timeoutfactor=$withval
diff --git a/configure.ac b/configure.ac
index eb5bc6a131..68baeee4d7 100644
--- a/configure.ac
+++ b/configure.ac
@@ -169,6 +169,11 @@ AC_ARG_WITH([rtld-early-cflags],
 	    [rtld_early_cflags=])
 AC_SUBST(rtld_early_cflags)
 
+AC_ARG_WITH([extra-version-id],
+	    AS_HELP_STRING([--extra-version-id=STRING],
+			   [specify an extra version string to use in internal ABI checks]),
+	    [extra_version_id="$withval"])
+
 AC_ARG_WITH([timeoutfactor],
 	    AS_HELP_STRING([--with-timeoutfactor=NUM],
 			   [specify an integer to scale the timeout]),
diff --git a/elf/Makefile b/elf/Makefile
index 3928a08787..bc68150a37 100644
--- a/elf/Makefile
+++ b/elf/Makefile
@@ -52,7 +52,6 @@ routines = \
 # The core dynamic linking functions are in libc for the static and
 # profiled libraries.
 dl-routines = \
-  dl-call-libc-early-init \
   dl-close \
   dl-debug \
   dl-debug-symbols \
@@ -65,6 +64,7 @@ dl-routines = \
   dl-load \
   dl-lookup \
   dl-lookup-direct \
+  dl-lookup_libc_early_init \
   dl-minimal-malloc \
   dl-misc \
   dl-object \
diff --git a/elf/Versions b/elf/Versions
index a9ff278de7..6260c0fe03 100644
--- a/elf/Versions
+++ b/elf/Versions
@@ -29,8 +29,8 @@ libc {
     __placeholder_only_for_empty_version_map;
   }
   GLIBC_PRIVATE {
-    # functions used in other libraries
-    __libc_early_init;
+    # A pattern is needed here because the suffix is dynamically generated.
+    __libc_early_init_*;
 
     # Internal error handling support.  Interposes the functions in ld.so.
     _dl_signal_exception; _dl_catch_exception;
diff --git a/elf/dl-load.c b/elf/dl-load.c
index 1ad0868dad..00e08b5500 100644
--- a/elf/dl-load.c
+++ b/elf/dl-load.c
@@ -31,7 +31,6 @@
 #include <sys/param.h>
 #include <sys/stat.h>
 #include <sys/types.h>
-#include <gnu/lib-names.h>
 
 /* Type for the buffer we put the ELF header and hopefully the program
    header.  This buffer does not really have to be too large.  In most
@@ -1466,14 +1465,6 @@ cannot enable executable stack as shared object requires");
     add_name_to_object (l, ((const char *) D_PTR (l, l_info[DT_STRTAB])
 			    + l->l_info[DT_SONAME]->d_un.d_val));
 
-  /* If we have newly loaded libc.so, update the namespace
-     description.  */
-  if (GL(dl_ns)[nsid].libc_map == NULL
-      && l->l_info[DT_SONAME] != NULL
-      && strcmp (((const char *) D_PTR (l, l_info[DT_STRTAB])
-		  + l->l_info[DT_SONAME]->d_un.d_val), LIBC_SO) == 0)
-    GL(dl_ns)[nsid].libc_map = l;
-
   /* _dl_close can only eventually undo the module ID assignment (via
      remove_slotinfo) if this function returns a pointer to a link
      map.  Therefore, delay this step until all possibilities for
diff --git a/elf/dl-call-libc-early-init.c b/elf/dl-lookup_libc_early_init.c
similarity index 66%
rename from elf/dl-call-libc-early-init.c
rename to elf/dl-lookup_libc_early_init.c
index ee9860e3ab..64bc287a05 100644
--- a/elf/dl-call-libc-early-init.c
+++ b/elf/dl-lookup_libc_early_init.c
@@ -1,4 +1,4 @@
-/* Invoke the early initialization function in libc.so.
+/* Find the address of the __libc_early_init function.
    Copyright (C) 2020-2022 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
@@ -16,26 +16,21 @@
    License along with the GNU C Library; if not, see
    <https://www.gnu.org/licenses/>.  */
 
-#include <assert.h>
 #include <ldsodefs.h>
 #include <libc-early-init.h>
 #include <link.h>
 #include <stddef.h>
 
-void
-_dl_call_libc_early_init (struct link_map *libc_map, _Bool initial)
+__typeof (__libc_early_init) *
+_dl_lookup_libc_early_init (struct link_map *libc_map)
 {
-  /* There is nothing to do if we did not actually load libc.so.  */
-  if (libc_map == NULL)
-    return;
-
   const ElfW(Sym) *sym
-    = _dl_lookup_direct (libc_map, "__libc_early_init",
-                         0x069682ac, /* dl_new_hash output.  */
+    = _dl_lookup_direct (libc_map, LIBC_EARLY_INIT_NAME_STRING,
+                         LIBC_EARLY_INIT_GNU_HASH,
                          "GLIBC_PRIVATE",
                          0x0963cf85); /* _dl_elf_hash output.  */
-  assert (sym != NULL);
-  __typeof (__libc_early_init) *early_init
-    = DL_SYMBOL_ADDRESS (libc_map, sym);
-  early_init (initial);
+  if (sym == NULL)
+    _dl_signal_error (0, libc_map->l_name, NULL, "\
+ld.so/libc.so mismatch detected (upgrade in progress?)");
+  return DL_SYMBOL_ADDRESS (libc_map, sym);
 }
diff --git a/elf/dl-open.c b/elf/dl-open.c
index a23e65926b..dcc24130fe 100644
--- a/elf/dl-open.c
+++ b/elf/dl-open.c
@@ -760,8 +760,8 @@ dl_open_worker_begin (void *a)
   if (!args->libc_already_loaded)
     {
       /* dlopen cannot be used to load an initial libc by design.  */
-      struct link_map *libc_map = GL(dl_ns)[args->nsid].libc_map;
-      _dl_call_libc_early_init (libc_map, false);
+      if (GL(dl_ns)[args->nsid].libc_map != NULL)
+	GL(dl_ns)[args->nsid].libc_map_early_init (false);
     }
 
   args->worker_continue = true;
diff --git a/elf/dl-version.c b/elf/dl-version.c
index cda0889209..d9ec44eed6 100644
--- a/elf/dl-version.c
+++ b/elf/dl-version.c
@@ -23,6 +23,8 @@
 #include <string.h>
 #include <ldsodefs.h>
 #include <_itoa.h>
+#include <gnu/lib-names.h>
+#include <libc-early-init.h>
 
 #include <assert.h>
 
@@ -359,6 +361,22 @@ _dl_check_map_versions (struct link_map *map, int verbose, int trace_mode)
 	}
     }
 
+  /* Detect a libc.so loaded into this namespace.  The
+     __libc_early_init lookup below means that we have to do this
+     after parsing the version data.  */
+  if (GL(dl_ns)[map->l_ns].libc_map == NULL
+      && map->l_info[DT_SONAME] != NULL
+      && strcmp (((const char *) D_PTR (map, l_info[DT_STRTAB])
+		  + map->l_info[DT_SONAME]->d_un.d_val), LIBC_SO) == 0)
+    {
+      /* Look up this symbol early to trigger a mismatch error before
+	 relocation (which may call IFUNC resolvers, and those can
+	 have an internal ABI dependency).  */
+      GL(dl_ns)[map->l_ns].libc_map_early_init
+	= _dl_lookup_libc_early_init (map);
+      GL(dl_ns)[map->l_ns].libc_map = map;
+    }
+
   /* When there is a DT_VERNEED entry with libc.so on DT_NEEDED, issue
      an error if there is a DT_RELR entry without GLIBC_ABI_DT_RELR
      dependency.  */
diff --git a/elf/libc-early-init.h b/elf/libc-early-init.h
index a8edfadfb0..ac8c204bc7 100644
--- a/elf/libc-early-init.h
+++ b/elf/libc-early-init.h
@@ -19,12 +19,9 @@
 #ifndef _LIBC_EARLY_INIT_H
 #define _LIBC_EARLY_INIT_H
 
-struct link_map;
+#include <libc_early_init_name.h>
 
-/* If LIBC_MAP is not NULL, look up the __libc_early_init symbol in it
-   and call this function, with INITIAL as the argument.  */
-void _dl_call_libc_early_init (struct link_map *libc_map, _Bool initial)
-  attribute_hidden;
+struct link_map;
 
 /* In the shared case, this function is defined in libc.so and invoked
    from ld.so (or on the fist static dlopen) after complete relocation
@@ -33,6 +30,18 @@ void _dl_call_libc_early_init (struct link_map *libc_map, _Bool initial)
    startup code.  If INITIAL is true, the libc being initialized is
    the libc for the main program.  INITIAL is false for libcs loaded
    for audit modules, dlmopen, and static dlopen.  */
-void __libc_early_init (_Bool initial);
+void __libc_early_init (_Bool initial)
+#ifdef SHARED
+/* Redirect to the actual implementation name.  */
+  __asm__ (LIBC_EARLY_INIT_NAME_STRING)
+#endif
+  ;
+
+/* Attempts to find the appropriately named __libc_early_init function
+   in LIBC_MAP.  On lookup failure, an exception is signaled,
+   indicating an ld.so/libc.so mismatch.  */
+__typeof (__libc_early_init) *_dl_lookup_libc_early_init (struct link_map *
+                                                          libc_map)
+  attribute_hidden;
 
 #endif /* _LIBC_EARLY_INIT_H */
diff --git a/elf/rtld.c b/elf/rtld.c
index cbbaf4a331..910075c37f 100644
--- a/elf/rtld.c
+++ b/elf/rtld.c
@@ -1707,15 +1707,6 @@ dl_main (const ElfW(Phdr) *phdr,
       /* Extract the contents of the dynamic section for easy access.  */
       elf_get_dynamic_info (main_map, false, false);
 
-      /* If the main map is libc.so, update the base namespace to
-	 refer to this map.  If libc.so is loaded later, this happens
-	 in _dl_map_object_from_fd.  */
-      if (main_map->l_info[DT_SONAME] != NULL
-	  && (strcmp (((const char *) D_PTR (main_map, l_info[DT_STRTAB])
-		      + main_map->l_info[DT_SONAME]->d_un.d_val), LIBC_SO)
-	      == 0))
-	GL(dl_ns)[LM_ID_BASE].libc_map = main_map;
-
       /* Set up our cache of pointers into the hash table.  */
       _dl_setup_hash (main_map);
     }
@@ -2386,7 +2377,8 @@ dl_main (const ElfW(Phdr) *phdr,
   /* Relocation is complete.  Perform early libc initialization.  This
      is the initial libc, even if audit modules have been loaded with
      other libcs.  */
-  _dl_call_libc_early_init (GL(dl_ns)[LM_ID_BASE].libc_map, true);
+  if (GL(dl_ns)[LM_ID_BASE].libc_map != NULL)
+    GL(dl_ns)[LM_ID_BASE].libc_map_early_init (true);
 
   /* Do any necessary cleanups for the startup OS interface code.
      We do these now so that no calls are made after rtld re-relocation
diff --git a/manual/install.texi b/manual/install.texi
index c775005581..6d43599a47 100644
--- a/manual/install.texi
+++ b/manual/install.texi
@@ -144,6 +144,15 @@ dynamic linker diagnostics to run on CPUs which are not compatible with
 the rest of @theglibc{}, for example, due to compiler flags which target
 a later instruction set architecture (ISA).
 
+@item --with-extra-version-id=@var{string}
+Use @var{string} as part of the fingerprint that is used by the dynamic
+linker to detect an incompatible version of @file{libc.so}.  For
+example, @var{string} could be the full package version and release
+string used by a distribution build of @theglibc{}.  This way,
+concurrent process creation during a package update will fail with an
+error message, @emph{ld.so/libc.so mismatch detected (upgrade in
+progress?)}, rather than crashing mysteriously.
+
 @item --with-timeoutfactor=@var{NUM}
 Specify an integer @var{NUM} to scale the timeout of test programs.
 This factor can be changed at run time using @env{TIMEOUTFACTOR}
diff --git a/scripts/libc_early_init_name.py b/scripts/libc_early_init_name.py
new file mode 100644
index 0000000000..a56c2008f3
--- /dev/null
+++ b/scripts/libc_early_init_name.py
@@ -0,0 +1,89 @@
+#!/usr/bin/python3
+# Compute the hash-based name of the __libc_early_init function.
+# Copyright (C) 2022 Free Software Foundation, Inc.
+# This file is part of the GNU C Library.
+#
+# The GNU C Library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+#
+# The GNU C Library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with the GNU C Library; if not, see
+# <https://www.gnu.org/licenses/>.
+
+"""Compute the name of the __libc_early_init function, which is used
+as a protocol version marker between ld.so and libc.so.
+
+The name contains a hash suffix, and the hash changes if certain key
+files in the source tree change.  Distributions can also configure
+with --with-extra-version-id, to make the computed hash dependent on
+the package version.
+
+"""
+
+import argparse
+import hashlib
+import os
+import string
+import sys
+
+def gnu_hash(s):
+    """Computes the GNU hash of the string."""
+    h = 5381
+    for ch in s:
+        if type(ch) is not int:
+            ch = ord(ch)
+        h = (h * 33 + ch) & 0xffffffff
+    return h
+
+# Parse the command line.
+parser = argparse.ArgumentParser(description=__doc__)
+parser.add_argument('--output', metavar='PATH',
+                    help='path to header file this tool generates')
+parser.add_argument('--extra-version-id', metavar='ID',
+                    help='extra string to influence hash computation')
+parser.add_argument('inputs', metavar='PATH', nargs='*',
+                    help='files whose contents influences the generated hash')
+opts = parser.parse_args()
+
+# Obtain the blobs that affect the generated hash.
+blobs = [(opts.extra_version_id or '').encode('UTF-8')]
+for path in opts.inputs:
+    with open(path, 'rb') as inp:
+        blobs.append(inp.read())
+
+# Hash the file boundaries.
+md = hashlib.sha256()
+md.update(repr([len(blob) for blob in blobs]).encode('UTF-8'))
+
+# And then hash the file contents.  Do not hash the paths, to avoid
+# impacting reproducibility.
+for blob in blobs:
+    md.update(blob)
+
+# These are the bits used to compute the suffix.
+derived_bits = int.from_bytes(md.digest(), byteorder='big', signed=False)
+
+# These digits are used in the suffix (should result in base-62 encoding).
+# They must be valid in C identifiers.
+digits = string.digits + string.ascii_letters
+
+# Generate eight digits as a suffix.  They should provide enough
+# uniqueness (47.6 bits).
+name = '__libc_early_init_'
+for n in range(8):
+    name += digits[derived_bits % len(digits)]
+    derived_bits //= len(digits)
+
+# Write the output file.
+with open(opts.output, 'w') if opts.output else sys.stdout as out:
+    out.write('#define LIBC_EARLY_INIT_NAME {}\n'.format(name))
+    out.write('#define LIBC_EARLY_INIT_NAME_STRING "{}"\n'.format(name))
+    out.write('#define LIBC_EARLY_INIT_GNU_HASH {}\n'.format(
+        gnu_hash(name)))
diff --git a/sysdeps/generic/ldsodefs.h b/sysdeps/generic/ldsodefs.h
index 050a3032de..275dbc95ce 100644
--- a/sysdeps/generic/ldsodefs.h
+++ b/sysdeps/generic/ldsodefs.h
@@ -333,6 +333,10 @@ struct rtld_global
        its link map.  */
     struct link_map *libc_map;
 
+    /* __libc_early_init function in libc_map.  Initialized at the
+       same time as libc_map.  */
+    void (*libc_map_early_init) (_Bool);
+
     /* Search table for unique objects.  */
     struct unique_sym_table
     {

^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2022-08-24 15:38 UTC | newest]

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-24 15:38 [glibc] Detect ld.so and libc.so version inconsistency during startup Florian Weimer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).