From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id 578B93858D32 for ; Tue, 23 Aug 2022 15:13:04 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 578B93858D32 Received: from mail-qt1-f200.google.com (mail-qt1-f200.google.com [209.85.160.200]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-606-jhbQoE4rPy2KcPuDUeCQuQ-1; Tue, 23 Aug 2022 11:13:02 -0400 X-MC-Unique: jhbQoE4rPy2KcPuDUeCQuQ-1 Received: by mail-qt1-f200.google.com with SMTP id fv24-20020a05622a4a1800b003445e593889so10721579qtb.2 for ; Tue, 23 Aug 2022 08:13:02 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:in-reply-to:organization:from:references :to:content-language:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc; bh=2B6sM3cOm9El7Lw7ImcBXDqCBS0nnIzlLFK+DHjfdoE=; b=4B92sG/cPKib9PNM1elkFGpNmk1zQU5iHyDcV0QD+IGXHQQSBrWJaZyQcNWhudhdor WFC9j4M5SfXUGzJ303TnDy4+2wjesfcGdxV68R0ACLKzCSS57ivr7sKXtkLCpwikbEn8 oyTDualyRJADeMPB2Drw16WIGdlc/ujCPLlF/JdJ5QltoEB+JJRvvUygAXv9uf0J5jtT SoY6rEnA/rSLMQ2XeOyeOlluv/CDjfBnJ4aWwxparkQz9h+0gK4rINTgX2L/KpFFZZm+ 8vl8BMok+rAdp5G5QmUM0+Ai2VfguF/qiYjNB5tmIDD+9Iu1uDkI67MTqZUulevEVII0 F7Dw== X-Gm-Message-State: ACgBeo1jCEH43+NEWvpfLJdmvUVD/B5LTZw2iWrpI2D7LnKBQDQyHpBp VRI+v7Rk96CaEXXFis+wQqDsOR2F4rq2EmRb91eEH9Jw51rSTbpsk3Ly26uvglBTX9jP5P4P/YN /ZoSvG/Xpw/SfnM8g14nn X-Received: by 2002:ae9:df01:0:b0:6bb:4e95:6a59 with SMTP id t1-20020ae9df01000000b006bb4e956a59mr16432894qkf.339.1661267582043; Tue, 23 Aug 2022 08:13:02 -0700 (PDT) X-Google-Smtp-Source: AA6agR7OWxcExkfHOsPT8mBKo9fenj0QbX+gcSkGYycslxYlslibG8Bf9zkMCFtajFckQwPJOvpeUQ== X-Received: by 2002:ae9:df01:0:b0:6bb:4e95:6a59 with SMTP id t1-20020ae9df01000000b006bb4e956a59mr16432852qkf.339.1661267581471; Tue, 23 Aug 2022 08:13:01 -0700 (PDT) Received: from [192.168.0.241] (192-0-145-146.cpe.teksavvy.com. [192.0.145.146]) by smtp.gmail.com with ESMTPSA id w11-20020a05620a424b00b006b929a56a2bsm12862993qko.3.2022.08.23.08.13.00 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 23 Aug 2022 08:13:01 -0700 (PDT) Message-ID: <19fff4f1-ebf6-69f7-dcbd-df6e01908179@redhat.com> Date: Tue, 23 Aug 2022 11:13:00 -0400 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.11.0 Subject: Re: [PATCH 2/2] Detect ld.so and libc.so version inconsistency during startup To: Florian Weimer , libc-alpha@sourceware.org References: <62b42eeabb8699d5622e3c76561333395700e8be.1660903960.git.fweimer@redhat.com> From: Carlos O'Donell Organization: Red Hat In-Reply-To: <62b42eeabb8699d5622e3c76561333395700e8be.1660903960.git.fweimer@redhat.com> X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Language: en-US Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-11.3 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, KAM_STOCKGEN, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE, URIBL_BLACK autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 23 Aug 2022 15:13:10 -0000 On 8/19/22 06:16, Florian Weimer via Libc-alpha wrote: > The files NEWS, include/link.h, and sysdeps/generic/ldsodefs.h > contribute to the version fingerprint used for detection. The > fingerprint can be further refined using the --with-extra-version-id > configure argument. It makes sense to use link.h, ldsodefs.h because they define the interface between ld.so and libc. NEWS and the additional --with-extra-version-id provide further differentiation, which allows per-distro NEVRA bumping to ensure this works reliably during upgrade. This solution will fail process starts during the upgrade window where there is a mismatch and the system will need to restart those processes. Needs a v2 to fix typos. Needs a v2 to fix the implementation. This solution doesn't quite work and I haven't debugged it. Works: [carlos@athas glibc-review]$ ./elf/ld.so --library-path /lib64 /bin/true Fatal glibc error: ld.so/libc.so mismatch detected Fails: [carlos@athas glibc-review]$ ./elf/ld.so --library-path /lib64 /bin/bash Segmentation fault (core dumped) I would expect the same mismatch detected message. This is possibly as difficult a problem as the early ISA detection diagnostic since it can involve resolving early ifuncs and running that code early means running mismatched ld.so/libc ifuncs. LD_DEBUG=all shows: 886519: symbol=gettimeofday; lookup in file=/bin/bash [0] 886519: symbol=gettimeofday; lookup in file=/lib64/libtinfo.so.6 [0] 886519: symbol=gettimeofday; lookup in file=/lib64/libc.so.6 [0] Segmentation fault (core dumped) I'm not sure what is going wrong here. Early vDSO setup? Early ifunc? > --- > INSTALL | 9 ++++ > Makerules | 14 ++++++ > NEWS | 7 ++- > config.make.in | 1 + > configure | 11 +++++ > configure.ac | 5 ++ > csu/libc-start.c | 1 + > elf/Versions | 4 +- > elf/dl-call-libc-early-init.c | 9 ++-- > elf/libc-early-init.h | 9 +++- > manual/install.texi | 9 ++++ > scripts/libc_early_init_name.py | 85 +++++++++++++++++++++++++++++++++ > 12 files changed, 156 insertions(+), 8 deletions(-) > create mode 100644 scripts/libc_early_init_name.py > > diff --git a/INSTALL b/INSTALL > index 659f75a97f..7ad8f96e63 100644 > --- a/INSTALL > +++ b/INSTALL > @@ -120,6 +120,15 @@ if 'CFLAGS' is specified it must enable optimization. For example: > compiler flags which target a later instruction set architecture > (ISA). > > +'--with-extra-version-id=STRING' > + Use STRING as part of the fingerprint that is used by the dynamic > + linker to detect an incompatible version of 'libc.so'. For > + example, STRING could be the full package version and release > + string used by a distribution build of the GNU C Library. This > + way, concurrent process creation during a package update will faill s/fail/fail/g > + with an error message, _Fatal glibc error: ld.so/libc.so mismatch > + detected_, rather than crashing mysteriously. OK. > + > '--with-timeoutfactor=NUM' > Specify an integer NUM to scale the timeout of test programs. This > factor can be changed at run time using 'TIMEOUTFACTOR' environment > diff --git a/Makerules b/Makerules > index d1e139d03c..756c1f181c 100644 > --- a/Makerules > +++ b/Makerules > @@ -112,6 +112,20 @@ before-compile := $(common-objpfx)first-versions.h \ > $(common-objpfx)ldbl-compat-choose.h $(before-compile) > $(common-objpfx)first-versions.h: $(common-objpfx)versions.stmp > $(common-objpfx)ldbl-compat-choose.h: $(common-objpfx)versions.stmp > + > +# libc_early_init_name.h provides the actual name of the > +# __libc_early_init function. It is used as a protocol version marker > +# between ld.so and libc.so OK. Good comment. > +before-compile := $(common-objpfx)libc_early_init_name.h $(before-compile) > +libc_early_init_name-deps = \ > + $(..)NEWS $(..)sysdeps/generic/ldsodefs.h $(..)include/link.h > +$(common-objpfx)libc_early_init_name.h: $(..)scripts/libc_early_init_name.py \ > + $(common-objpfx)config.make $(libc_early_init_name-deps) > + $(PYTHON) $(..)scripts/libc_early_init_name.py \ > + --output=$@T \ > + --extra-version-id="$(extra-version-id)" \ > + $(libc_early_init_name-deps) > + $(move-if-change) $@T $@ OK. > endif # avoid-generated > endif # $(build-shared) = yes > > diff --git a/NEWS b/NEWS > index f9bef48a8f..b8a9376e1e 100644 > --- a/NEWS > +++ b/NEWS > @@ -9,7 +9,12 @@ Version 2.37 > > Major new features: > > - [Add new features here] > +* The dynamic loader now prints an error message, "Fatal glibc error: > + ld.so/libc.so mismatch detected" if it detects that the version of > + libc.so it loaded comes from a different build of glibc. The new > + configure option --with-extra-version-id can be used to specify an > + arbitrary string that affects the comptuation of the version > + fingerprint. OK. > > Deprecated and removed features, and other changes affecting compatibility: > > diff --git a/config.make.in b/config.make.in > index d7c416cbea..ecaffbfd4b 100644 > --- a/config.make.in > +++ b/config.make.in > @@ -98,6 +98,7 @@ build-hardcoded-path-in-tests= @hardcoded_path_in_tests@ > build-pt-chown = @build_pt_chown@ > have-tunables = @have_tunables@ > pthread-in-libc = @pthread_in_libc@ > +extra-version-id = @extra_version_id@ OK. > > # Build tools. > CC = @CC@ > diff --git a/configure b/configure > index ff2c406b3b..c576f9f133 100755 > --- a/configure > +++ b/configure > @@ -760,6 +760,7 @@ with_headers > with_default_link > with_nonshared_cflags > with_rtld_early_cflags > +with_extra_version_id > with_timeoutfactor > enable_sanity_checks > enable_shared > @@ -1481,6 +1482,9 @@ Optional Packages: > build nonshared libraries with additional CFLAGS > --with-rtld-early-cflags=CFLAGS > build early initialization with additional CFLAGS > + --extra-version-id=STRING > + specify an extra version string to use in internal > + ABI checks OK. > --with-timeoutfactor=NUM > specify an integer to scale the timeout > --with-cpu=CPU select code for CPU variant > @@ -3397,6 +3401,13 @@ fi > > > > +# Check whether --with-extra-version-id was given. > +if test "${with_extra_version_id+set}" = set; then : > + withval=$with_extra_version_id; extra_version_id="$withval" > +fi > + > + > + > # Check whether --with-timeoutfactor was given. > if test "${with_timeoutfactor+set}" = set; then : > withval=$with_timeoutfactor; timeoutfactor=$withval > diff --git a/configure.ac b/configure.ac > index eb5bc6a131..68baeee4d7 100644 > --- a/configure.ac > +++ b/configure.ac > @@ -169,6 +169,11 @@ AC_ARG_WITH([rtld-early-cflags], > [rtld_early_cflags=]) > AC_SUBST(rtld_early_cflags) > > +AC_ARG_WITH([extra-version-id], > + AS_HELP_STRING([--extra-version-id=STRING], > + [specify an extra version string to use in internal ABI checks]), > + [extra_version_id="$withval"]) > + OK. > AC_ARG_WITH([timeoutfactor], > AS_HELP_STRING([--with-timeoutfactor=NUM], > [specify an integer to scale the timeout]), > diff --git a/csu/libc-start.c b/csu/libc-start.c > index 543560f36c..522f14bbaf 100644 > --- a/csu/libc-start.c > +++ b/csu/libc-start.c > @@ -37,6 +37,7 @@ > #include > #include > > +#include OK. > #include > > extern void __libc_init_first (int argc, char **argv, char **envp); > diff --git a/elf/Versions b/elf/Versions > index a9ff278de7..6260c0fe03 100644 > --- a/elf/Versions > +++ b/elf/Versions > @@ -29,8 +29,8 @@ libc { > __placeholder_only_for_empty_version_map; > } > GLIBC_PRIVATE { > - # functions used in other libraries > - __libc_early_init; > + # A pattern is needed here because the suffix is dynamically generated. > + __libc_early_init_*; OK. I see this in the final build: 158: 0000000000148ed0 185 FUNC GLOBAL DEFAULT 10 __libc_early_init_iw1yORkW@@GLIBC_PRIVATE > > # Internal error handling support. Interposes the functions in ld.so. > _dl_signal_exception; _dl_catch_exception; > diff --git a/elf/dl-call-libc-early-init.c b/elf/dl-call-libc-early-init.c > index ee9860e3ab..1d9436ad50 100644 > --- a/elf/dl-call-libc-early-init.c > +++ b/elf/dl-call-libc-early-init.c > @@ -16,7 +16,6 @@ > License along with the GNU C Library; if not, see > . */ > > -#include > #include > #include > #include > @@ -30,11 +29,13 @@ _dl_call_libc_early_init (struct link_map *libc_map, _Bool initial) > return; > > const ElfW(Sym) *sym > - = _dl_lookup_direct (libc_map, "__libc_early_init", > - 0x069682ac, /* dl_new_hash output. */ > + = _dl_lookup_direct (libc_map, LIBC_EARLY_INIT_NAME_STRING, > + LIBC_EARLY_INIT_GNU_HASH, > "GLIBC_PRIVATE", OK. > 0x0963cf85); /* _dl_elf_hash output. */ > - assert (sym != NULL); > + if (sym == NULL) > + _dl_fatal_printf ( > +"Fatal glibc error: ld.so/libc.so mismatch detected\n"); OK. Look for the new version-specific symbol in libc. > __typeof (__libc_early_init) *early_init > = DL_SYMBOL_ADDRESS (libc_map, sym); > early_init (initial); > diff --git a/elf/libc-early-init.h b/elf/libc-early-init.h > index a8edfadfb0..6774c3e7c0 100644 > --- a/elf/libc-early-init.h > +++ b/elf/libc-early-init.h > @@ -19,6 +19,8 @@ > #ifndef _LIBC_EARLY_INIT_H > #define _LIBC_EARLY_INIT_H > > +#include > + > struct link_map; > > /* If LIBC_MAP is not NULL, look up the __libc_early_init symbol in it > @@ -33,6 +35,11 @@ void _dl_call_libc_early_init (struct link_map *libc_map, _Bool initial) > startup code. If INITIAL is true, the libc being initialized is > the libc for the main program. INITIAL is false for libcs loaded > for audit modules, dlmopen, and static dlopen. */ > -void __libc_early_init (_Bool initial); > +void __libc_early_init (_Bool initial) > +#ifdef SHARED > +/* Redirect to the actual implementation name. */ > + __asm__ (LIBC_EARLY_INIT_NAME_STRING) OK. > +#endif > + ; > > #endif /* _LIBC_EARLY_INIT_H */ > diff --git a/manual/install.texi b/manual/install.texi > index c775005581..8b225ab3bb 100644 > --- a/manual/install.texi > +++ b/manual/install.texi > @@ -144,6 +144,15 @@ dynamic linker diagnostics to run on CPUs which are not compatible with > the rest of @theglibc{}, for example, due to compiler flags which target > a later instruction set architecture (ISA). > > +@item --with-extra-version-id=@var{string} > +Use @var{string} as part of the fingerprint that is used by the dynamic > +linker to detect an incompatible version of @file{libc.so}. For > +example, @var{string} could be the full package version and release > +string used by a distribution build of @theglibc{}. This way, > +concurrent process creation during a package update will faill with an s/faill/fail/g > +error message, @emph{Fatal glibc error: ld.so/libc.so mismatch detected}, > +rather than crashing mysteriously. > + > @item --with-timeoutfactor=@var{NUM} > Specify an integer @var{NUM} to scale the timeout of test programs. > This factor can be changed at run time using @env{TIMEOUTFACTOR} > diff --git a/scripts/libc_early_init_name.py b/scripts/libc_early_init_name.py > new file mode 100644 > index 0000000000..465cbe9dae > --- /dev/null > +++ b/scripts/libc_early_init_name.py > @@ -0,0 +1,85 @@ > +#!/usr/bin/python3 > +# Compute the hash-based name of the __libc_early_init function. > +# Copyright (C) 2022 Free Software Foundation, Inc. > +# This file is part of the GNU C Library. > +# > +# The GNU C Library is free software; you can redistribute it and/or > +# modify it under the terms of the GNU Lesser General Public > +# License as published by the Free Software Foundation; either > +# version 2.1 of the License, or (at your option) any later version. > +# > +# The GNU C Library is distributed in the hope that it will be useful, > +# but WITHOUT ANY WARRANTY; without even the implied warranty of > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > +# Lesser General Public License for more details. > +# > +# You should have received a copy of the GNU Lesser General Public > +# License along with the GNU C Library; if not, see > +# . > + > +"""Compute the name of the __libc_early_init function, which is used > +as a protocol version marker between ld.so and libc.so. > + > +The name contains a hash suffix, and the hash changes if certain key > +files in the source tree change. Distributions can also configure > +with --with-extra-version-id, to make the computed hash dependent on > +the package version. OK. > + > +""" > + > +import argparse > +import hashlib > +import os > +import string > +import sys > + > +# Make available glibc Python modules. > +sys.path.append(os.path.dirname(os.path.realpath(__file__))) > + > +import glibcelf > + > +# Parse the command line. > +parser = argparse.ArgumentParser(description=__doc__) > +parser.add_argument('--output', metavar='PATH', > + help='path to header file this tool generates') > +parser.add_argument('--extra-version-id', metavar='ID', > + help='extra string to influence hash computation') > +parser.add_argument('inputs', metavar='PATH', nargs='*', > + help='files whose contents influences the generated hash') > +opts = parser.parse_args() > + > +# Obtain the blobs that affect the generated hash. > +blobs = [(opts.extra_version_id or '').encode('UTF-8')] > +for path in opts.inputs: > + with open(path, 'rb') as inp: > + blobs.append(inp.read()) > + > +# Hash the file boundaries. > +md = hashlib.sha256() > +md.update(repr([len(blob) for blob in blobs]).encode('UTF-8')) > + > +# And then hash the file contents. Do not hash the paths, to avoid > +# impacting reproducibility. > +for blob in blobs: > + md.update(blob) > + > +# These are the bits used to compute the suffix. > +derived_bits = int.from_bytes(md.digest(), byteorder='big', signed=False) > + > +# These digits are used in the suffix (should result in base-62 encoding). > +# They must be valid in C identifiers. > +digits = string.digits + string.ascii_letters > + > +# Generate eight digits as a suffix. They should provide enough > +# uniqueness (47.6 bits). > +name = '__libc_early_init_' > +for n in range(8): > + name += digits[derived_bits % len(digits)] > + derived_bits //= len(digits) > + > +# Write the output file. > +with open(opts.output, 'w') if opts.output else sys.stdout as out: > + out.write('#define LIBC_EARLY_INIT_NAME {}\n'.format(name)) > + out.write('#define LIBC_EARLY_INIT_NAME_STRING "{}"\n'.format(name)) > + out.write('#define LIBC_EARLY_INIT_GNU_HASH {}\n'.format( > + glibcelf.gnu_hash(name))) -- Cheers, Carlos.