* [PATCH v2 1/2] manual: Document ld.so --list-diagnostics output @ 2023-08-23 7:00 Florian Weimer 2023-08-23 7:04 ` [PATCH v2 2/2] elf: Check that --list-diagnostics output has the expected syntax Florian Weimer 2023-08-23 13:47 ` [PATCH v2 1/2] manual: Document ld.so --list-diagnostics output Adhemerval Zanella Netto 0 siblings, 2 replies; 4+ messages in thread From: Florian Weimer @ 2023-08-23 7:00 UTC (permalink / raw) To: libc-alpha --- v2: Drop Python code. @cindex/@item fix as suggested by Arsen. manual/dynlink.texi | 207 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 207 insertions(+) diff --git a/manual/dynlink.texi b/manual/dynlink.texi index 45bf5a5b55..df41c56bfc 100644 --- a/manual/dynlink.texi +++ b/manual/dynlink.texi @@ -13,9 +13,216 @@ as plugins) later at run time. Dynamic linkers are sometimes called @dfn{dynamic loaders}. @menu +* Dynamic Linker Invocation:: Explicit invocation of the dynamic linker. * Dynamic Linker Introspection:: Interfaces for querying mapping information. @end menu +@node Dynamic Linker Invocation + +@cindex program interpreter +When a dynamically linked program starts, the operating system +automatically loads the dynamic linker along with the program. +@Theglibc{} also supports invoking the dynamic linker explicitly to +launch a program. This command uses the implied dynamic linker +(also sometimes called the @dfn{program interpreter}): + +@smallexample +sh -c 'echo "Hello, world!"' +@end smallexample + +This command specifies the dynamic linker explicitly: + +@smallexample +ld.so /bin/sh -c 'echo "Hello, world!"' +@end smallexample + +Note that @command{ld.so} does not search the @env{PATH} environment +variable, so the full file name of the executable needs to be specified. + +The @command{ld.so} program supports various options. Options start +@samp{--} and need to come before the program that is being launched. +Some of the supported options are listed below. 
+ +@table @code +@item --list-diagnostics +Print system diagnostic information in a machine-readable format. +@xref{Dynamic Linker Diagnostics}. +@end table + +@menu +* Dynamic Linker Diagnostics:: Obtaining system diagnostic information. +@end menu + +@node Dynamic Linker Diagnostics +@section Dynamic Linker Diagnostics +@cindex diagnostics (dynamic linker) + +The @samp{ld.so --list-diagnostics} produces machine-readable +diagnostics output. This output contains system data that affects the +behavior of @theglibc{}, and potentially application behavior as well. + +The exact set of diagnostic items can change between releases of +@theglibc{}. The output format itself is not expected to change +radically. + +The following table shows some example lines that can be written by the +diagnostics command. + +@table @code +@item dl_pagesize=0x1000 +The system page size is 4096 bytes. + +@item env[0x14]="LANG=en_US.UTF-8" +This item indicates that the 21st environment variable at process +startup contains a setting for @code{LANG}. + +@item env_filtered[0x22]="DISPLAY" +The 35th environment variable is @code{DISPLAY}. Its value is not +included in the output for privacy reasons because it is not recognized +as harmless by the diagnostics code. + +@item path.prefix="/usr" +This means that @theglibc{} was configured with @code{--prefix=/usr}. + +@item path.system_dirs[0x0]="/lib64/" +@itemx path.system_dirs[0x1]="/usr/lib64/" +The built-in dynamic linker search path contains two directories, +@code{/lib64} and @code{/usr/lib64}. +@end table + +@subsection Dynamic Linker Diagnostics Output Format + +As seen above, diagnostic lines assign values (integers or strings) to a +sequence of labeled subscripts, separated by @samp{.}. Some subscripts +have integer indices associated with them. The subscript indices are +not necessarily contiguous or small, so an associative array should be +used to store them. Currently, all integers fit into the 64-bit +unsigned integer range. 
Every access path to a value has a fixed type +(string or integer) independent of subscript index values. Likewise, +whether a subscript is indexed does not depend on previous indices (but +may depend on previous subscript labels). + +A syntax description in ABNF (RFC 5234) follows. Note that +@code{%x30-39} denotes the range of decimal digits. Diagnostic output +lines are expected to match the @code{line} production. + +@c ABNF-START +@smallexample +HEXDIG = %x30-39 / %x61-6f ; lowercase a-f only +ALPHA = %x41-5a / %x61-7a / %x7f ; letters and underscore +ALPHA-NUMERIC = ALPHA / %x30-39 / "_" +DQUOTE = %x22 ; " + +; Numbers are always hexadecimal and use a 0x prefix. +hex-value-prefix = %x30 %x78 +hex-value = hex-value-prefix 1*HEXDIG + +; Strings use octal escape sequences and \\, \". +string-char = %x20-21 / %x23-5c / %x5d-7e ; printable but not "\ +string-quoted-octal = %x30-33 2*2%x30-37 +string-quoted = "\" ("\" / DQUOTE / string-quoted-octal) +string-value = DQUOTE *(string-char / string-quoted) DQUOTE + +value = hex-value / string-value + +label = ALPHA *ALPHA-NUMERIC +index = "[" hex-value "]" +subscript = label [index] + +line = subscript *("." subscript) "=" value +@end smallexample + +@subsection Dynamic Linker Diagnostics Values + +As mentioned above, the set of diagnostics may change between +@theglibc{} releases. Nevertheless, the following table documents a few +common diagnostic items. All numbers are in hexadecimal, with a +@samp{0x} prefix. + +@table @code +@item dl_dst_lib=@var{string} +The @code{$LIB} dynamic string token expands to @var{string}. + +@cindex HWCAP (diagnostics) +@item dl_hwcap=@var{integer} +@itemx dl_hwcap2=@var{integer} +The HWCAP and HWCAP2 values, as returned for @code{getauxval}, and as +used in other places depending on the architecture. + +@cindex page size (diagnostics) +@item dl_pagesize=@var{integer} +The system page size is @var{integer} bytes. 
+ +@item dl_platform=@var{string} +The @code{$PLATFORM} dynamic string token expands to @var{string}. + +@item dso.libc=@var{string} +This is the soname of the shared @code{libc} object that is part of +@theglibc{}. On most architectures, this is @code{libc.so.6}. + +@item env[@var{index}]=@var{string} +@itemx env_filtered[@var{index}]=@var{string} +An environment variable from the process environment. The integer +@var{index} is the array index in the environment array. Variables +under @code{env} include the variable value after the @samp{=} (assuming +that it was present), variables under @code{env_filtered} do not. + +@item path.prefix=@var{string} +This indicates that @theglibc{} was configured using +@samp{--prefix=@var{string}}. + +@item path.sysconfdir=@var{string} +@Theglibc{} was configured (perhaps implicitly) with +@samp{--sysconfdir=@var{string}} (typically @code{/etc}). + +@item path.system_dirs[@var{index}]=@var{string} +These items list the elements of the built-in array that describes the +default library search path. The value @var{string} is a directory file +name with a trailing @samp{/}. + +@item path.rtld=@var{string} +This string indicates the application binary interface (ABI) file name +of the run-time dynamic linker. + +@item version.release="stable" +@itemx version.release="development" +The value @code{"stable"} indicates that this build of @theglibc{} is +from a release branch. Releases labeled as @code{"development"} are +unreleased development versions. + +@cindex version (diagnostics) +@item version.version="@var{major}.@var{minor}" +@itemx version.version="@var{major}.@var{minor}.9000" +@Theglibc{} version. Development releases end in @samp{.9000}. + +@cindex auxiliary vector (diagnostics) +@item auxv[@var{index}].a_type=@var{type} +@itemx auxv[@var{index}].a_val=@var{integer} +@itemx auxv[@var{index}].a_val_string=@var{string} +An entry in the auxiliary vector (specific to Linux). 
The values +@var{type} (an integer) and @var{integer} correspond to the members of +@code{struct auxv}. If the value is a string, @code{a_val_string} is +used instead of @code{a_val}, so that values have consistent types. + +The @code{AT_HWCAP} and @code{AT_HWCAP2} values in this output do not +reflect adjustment by @theglibc{}. + +@item uname.sysname=@var{string} +@itemx uname.nodename=@var{string} +@itemx uname.release=@var{string} +@itemx uname.version=@var{string} +@itemx uname.machine=@var{string} +@itemx uname.domain=@var{string} +These Linux-specific items show the values of @code{struct utsname}, as +reported by the @code{uname} function. @xref{Platform Type}. + +@cindex CPUID (diagnostics) +@item x86.cpu_features.@dots{} +These items are specific to the i386 and x86-64 architectures. They +reflect supported CPU features and information on cache geometry, mostly +collected using the @code{CPUID} instruction. +@end table + @node Dynamic Linker Introspection @section Dynamic Linker Introspection base-commit: 65a5112ede9ba3e37e165cf6c9c432f46b903936 -- 2.41.0 ^ permalink raw reply [flat|nested] 4+ messages in thread
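
The manual text in this patch recommends an associative array for the parsed items, because subscript indices are not necessarily contiguous or small. A minimal Python sketch of that idea follows, fed with the example lines from the manual. The names `SUBSCRIPT_RE` and `parse_simple` are hypothetical and not part of glibc, and string values are kept with their escape sequences undecoded to keep the example short.

```python
import re

# Hypothetical helper names; this is an illustration, not glibc code.
# A subscript is a label with an optional [0x...] index.
SUBSCRIPT_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)(?:\[(0x[0-9a-f]+)\])?')

def parse_simple(line):
    """Split one diagnostics line into a (path, value) pair.

    path is a tuple of (label, index) pairs; index is None when the
    subscript is not indexed.
    """
    lhs, _, rhs = line.partition('=')   # labels never contain '='
    path = []
    for part in lhs.split('.'):
        match = SUBSCRIPT_RE.fullmatch(part)
        if match is None:
            raise ValueError('bad subscript: {!r}'.format(part))
        label, index = match.groups()
        path.append((label, None if index is None else int(index, 16)))
    if rhs.startswith('"'):
        value = rhs[1:-1]               # simplification: escapes not decoded
    else:
        value = int(rhs, 16)            # numbers always carry a 0x prefix
    return tuple(path), value

example_lines = [
    'dl_pagesize=0x1000',
    'env[0x14]="LANG=en_US.UTF-8"',
    'env_filtered[0x22]="DISPLAY"',
    'path.prefix="/usr"',
    'path.system_dirs[0x0]="/lib64/"',
    'path.system_dirs[0x1]="/usr/lib64/"',
]

# A dictionary copes with sparse indices such as 0x14 and 0x22.
diagnostics = dict(parse_simple(line) for line in example_lines)
print(diagnostics[(('dl_pagesize', None),)])
```

Keying the dictionary by the full path keeps lookups simple; a nested structure would also work if per-label iteration is needed.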
* [PATCH v2 2/2] elf: Check that --list-diagnostics output has the expected syntax 2023-08-23 7:00 [PATCH v2 1/2] manual: Document ld.so --list-diagnostics output Florian Weimer @ 2023-08-23 7:04 ` Florian Weimer 2023-08-23 13:53 ` Adhemerval Zanella Netto 2023-08-23 13:47 ` [PATCH v2 1/2] manual: Document ld.so --list-diagnostics output Adhemerval Zanella Netto 1 sibling, 1 reply; 4+ messages in thread From: Florian Weimer @ 2023-08-23 7:04 UTC (permalink / raw) To: libc-alpha Parts of elf/tst-rtld-list-diagnostics.py have been copied from scripts/tst-ld-trace.py. The abnf module is entirely optional and used to verify the ABNF grammar as included in the manual. --- v2: Clarify the optional nature of the abnf module. Fixes from Adhemerval. INSTALL | 6 + elf/Makefile | 9 + elf/tst-rtld-list-diagnostics.py | 303 +++++++++++++++++++++++++++++++ manual/install.texi | 7 + 4 files changed, 325 insertions(+) create mode 100644 elf/tst-rtld-list-diagnostics.py diff --git a/INSTALL b/INSTALL index 268acadd75..e5152a4ae7 100644 --- a/INSTALL +++ b/INSTALL @@ -585,6 +585,12 @@ build the GNU C Library: in your system. As of release time PExpect 4.8.0 is the newest verified to work to test the pretty printers. + • The Python ‘abnf’ module. + + This module is optional used to verify some ABNF grammars in the + manual. Version 2.2.0 has been confirmed to work as expected. A + missing ‘abnf’ does not reduce test coverage of the library itself. 
+ • GDB 7.8 or later with support for Python 2.7/3.4 or later GDB itself needs to be configured with Python support in order to diff --git a/elf/Makefile b/elf/Makefile index c00e2ccfc5..9176cbf1e3 100644 --- a/elf/Makefile +++ b/elf/Makefile @@ -1123,6 +1123,7 @@ tests-special += \ $(objpfx)argv0test.out \ $(objpfx)tst-pathopt.out \ $(objpfx)tst-rtld-help.out \ + $(objpfx)tst-rtld-list-diagnostics.out \ $(objpfx)tst-rtld-load-self.out \ $(objpfx)tst-rtld-preload.out \ $(objpfx)tst-sprof-basic.out \ @@ -2799,6 +2800,14 @@ $(objpfx)tst-ro-dynamic-mod.so: $(objpfx)tst-ro-dynamic-mod.os \ -Wl,--script=tst-ro-dynamic-mod.map \ $(objpfx)tst-ro-dynamic-mod.os +$(objpfx)tst-rtld-list-diagnostics.out: tst-rtld-list-diagnostics.py \ + $(..)manual/dynlink.texi $(objpfx)$(rtld-installed-name) + $(PYTHON) tst-rtld-list-diagnostics.py \ + --manual=$(..)manual/dynlink.texi \ + "$(test-wrapper-env) $(objpfx)$(rtld-installed-name) --list-diagnostics" \ + > $@; \ + $(evaluate-test) + $(objpfx)tst-rtld-run-static.out: $(objpfx)/ldconfig $(objpfx)tst-dl_find_object.out: \ diff --git a/elf/tst-rtld-list-diagnostics.py b/elf/tst-rtld-list-diagnostics.py new file mode 100644 index 0000000000..e9ba9e1798 --- /dev/null +++ b/elf/tst-rtld-list-diagnostics.py @@ -0,0 +1,303 @@ +#!/usr/bin/python3 +# Test that the ld.so --list-diagnostics output has the expected syntax. +# Copyright (C) 2022-2023 Free Software Foundation, Inc. +# Copyright The GNU Toolchain Authors. +# This file is part of the GNU C Library. +# +# The GNU C Library is free software; you can redistribute it and/or +# modify it under the terms of the GNU Lesser General Public +# License as published by the Free Software Foundation; either +# version 2.1 of the License, or (at your option) any later version. +# +# The GNU C Library is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the GNU +# Lesser General Public License for more details. +# +# You should have received a copy of the GNU Lesser General Public +# License along with the GNU C Library; if not, see +# <https://www.gnu.org/licenses/>. + +import argparse +import collections +import subprocess +import sys + +try: + subprocess.run +except: + class _CompletedProcess: + def __init__(self, args, returncode, stdout=None, stderr=None): + self.args = args + self.returncode = returncode + self.stdout = stdout + self.stderr = stderr + + def _run(*popenargs, input=None, timeout=None, check=False, **kwargs): + assert(timeout is None) + with subprocess.Popen(*popenargs, **kwargs) as process: + try: + stdout, stderr = process.communicate(input) + except: + process.kill() + process.wait() + raise + returncode = process.poll() + if check and returncode: + raise subprocess.CalledProcessError(returncode, popenargs) + return _CompletedProcess(popenargs, returncode, stdout, stderr) + + subprocess.run = _run + +# Number of errors encountered. Zero means no errors (test passes). +errors = 0 + +def parse_line(line): + """Parse a line of --list-diagnostics output. + + This function returns a pair (SUBSCRIPTS, VALUE). VALUE is either + a byte string or an integer. SUBSCRIPT is a tuple of (LABEL, + INDEX) pairs, where LABEL is a field identifier (a string), and + INDEX is an integer or None, to indicate that this field is not + indexed. + + """ + + # Extract the list of subscripts before the value. + idx = 0 + subscripts = [] + while line[idx] != '=': + start_idx = idx + + # Extract the label. + while line[idx] not in '[.=': + idx += 1 + label = line[start_idx:idx] + + if line[idx] == '[': + # Subscript with a 0x index. + assert label + close_bracket = line.index(']', idx) + index = line[idx + 1:close_bracket] + assert index.startswith('0x') + index = int(index, 0) + subscripts.append((label, index)) + idx = close_bracket + 1 + else: # '.' or '='. 
+ if label: + subscripts.append((label, None)) + if line[idx] == '.': + idx += 1 + + # The value is either a string or a 0x number. + value = line[idx + 1:] + if value[0] == '"': + # Decode the escaped string into a byte string. + assert value[-1] == '"' + idx = 1 + result = [] + while True: + ch = value[idx] + if ch == '\\': + if value[idx + 1] in '"\\': + result.append(ord(value[idx + 1])) + idx += 2 + else: + result.append(int(value[idx + 1:idx + 4], 8)) + idx += 4 + elif ch == '"': + assert idx == len(value) - 1 + break + else: + result.append(ord(value[idx])) + idx += 1 + value = bytes(result) + else: + # Convert the value into an integer. + assert value.startswith('0x') + value = int(value, 0) + return (tuple(subscripts), value) + +assert parse_line('a.b[0x1]=0x2') == ((('a', None), ('b', 1)), 2) +assert parse_line(r'b[0x3]="four\040\"\\"') == ((('b', 3),), b'four \"\\') + +# ABNF for a line of --list-diagnostics output. +diagnostics_abnf = r""" +HEXDIG = %x30-39 / %x61-6f ; lowercase a-f only +ALPHA = %x41-5a / %x61-7a / %x7f ; letters and underscore +ALPHA-NUMERIC = ALPHA / %x30-39 / "_" +DQUOTE = %x22 ; " + +; Numbers are always hexadecimal and use a 0x prefix. +hex-value-prefix = %x30 %x78 +hex-value = hex-value-prefix 1*HEXDIG + +; Strings use octal escape sequences and \\, \". +string-char = %x20-21 / %x23-5c / %x5d-7e ; printable but not "\ +string-quoted-octal = %x30-33 2*2%x30-37 +string-quoted = "\" ("\" / DQUOTE / string-quoted-octal) +string-value = DQUOTE *(string-char / string-quoted) DQUOTE + +value = hex-value / string-value + +label = ALPHA *ALPHA-NUMERIC +index = "[" hex-value "]" +subscript = label [index] + +line = subscript *("." subscript) "=" value +""" + +def check_consistency_with_manual(manual_path): + """Verify that the code fragments in the manual match this script. + + The code fragments are duplicated to clarify the dual license. 
+ """ + + global errors + + def extract_lines(path, start_line, end_line, skip_lines=()): + result = [] + with open(path) as inp: + capturing = False + for line in inp: + if line.strip() == start_line: + capturing = True + elif not capturing or line.strip() in skip_lines: + continue + elif line.strip() == end_line: + capturing = False + else: + result.append(line) + if not result: + raise ValueError('{!r} not found in {!r}'.format(start_line, path)) + if capturing: + raise ValueError('{!r} not found in {!r}'.format(end_line, path)) + return result + + def check(name, manual, script): + global errors + + if manual == script: + return + print('error: {} fragment in manual is different'.format(name)) + import difflib + sys.stdout.writelines(difflib.unified_diff( + manual, script, fromfile='manual', tofile='script')) + errors += 1 + + manual_abnf = extract_lines(manual_path, + '@c ABNF-START', '@end smallexample', + skip_lines=('@smallexample',)) + check('ABNF', diagnostics_abnf.splitlines(keepends=True)[1:], manual_abnf) + +# If the abnf module can be imported, run an additional check that the +# 'line' production from the ABNF grammar matches --list-diagnostics +# output lines. +try: + import abnf +except ImportError: + abnf = None + print('info: skipping ABNF validation because the abnf module is missing') + +if abnf is not None: + class Grammar(abnf.Rule): + pass + + Grammar.load_grammar(diagnostics_abnf) + + def parse_abnf(line): + global errors + + # Just verify that the line parses. 
+        try:
+            Grammar('line').parse_all(line)
+        except abnf.ParseError:
+            print('error: ABNF parse error:', repr(line))
+            errors += 1
+else:
+    def parse_abnf(line):
+        pass
+
+
+def parse_diagnostics(cmd):
+    global errors
+    diag_out = subprocess.run(cmd, stdout=subprocess.PIPE, check=True,
+                              universal_newlines=True).stdout
+    if diag_out[-1] != '\n':
+        print('error: ld.so output does not end in newline')
+        errors += 1
+
+    PathType = collections.namedtuple('PathType',
+                                      'has_index value_type original_line')
+    # Mapping tuples of labels to PathType values.
+    path_types = {}
+
+    # Mapping full subscript tuples to the line that assigned them.
+    seen_subscripts = {}
+
+    for line in diag_out.splitlines():
+        parse_abnf(line)
+        subscripts, value = parse_line(line)
+
+        # Check for duplicates.  The dictionary is keyed by subscripts,
+        # so the previous line has to be looked up by subscripts as well.
+        if subscripts in seen_subscripts:
+            print('error: duplicate value assignment:', repr(line))
+            print(' previous line:', repr(seen_subscripts[subscripts]))
+            errors += 1
+        else:
+            seen_subscripts[subscripts] = line
+
+        # Compare types against the previously seen labels.
+        labels = tuple([label for label, index in subscripts])
+        has_index = tuple([index is not None for label, index in subscripts])
+        value_type = type(value)
+        if labels in path_types:
+            previous_type = path_types[labels]
+            if has_index != previous_type.has_index:
+                print('error: line has mismatch of indexing:', repr(line))
+                print(' index types:', has_index)
+                print(' previous: ', previous_type.has_index)
+                print(' previous line:', repr(previous_type.original_line))
+                errors += 1
+            if value_type != previous_type.value_type:
+                print('error: line has mismatch of value type:', repr(line))
+                print(' value type:', value_type.__name__)
+                print(' previous: ', previous_type.value_type.__name__)
+                print(' previous line:', repr(previous_type.original_line))
+                errors += 1
+        else:
+            path_types[labels] = PathType(has_index, value_type, line)
+
+        # Check that this line does not add indexing to a previous value.
+        # path_types is keyed by label tuples, so compare label prefixes.
+        for idx in range(1, len(labels)):
+            if labels[:idx] in path_types:
+                print('error: line assigns to atomic value:', repr(line))
+                print(' previous line:',
+                      repr(path_types[labels[:idx]].original_line))
+                errors += 1
+
+    if errors:
+        sys.exit(1)
+
+def get_parser():
+    parser = argparse.ArgumentParser(description=__doc__)
+    parser.add_argument('--manual',
+                        help='path to .texi file for consistency checks')
+    parser.add_argument('command',
+                        help='command to run')
+    return parser
+
+
+def main(argv):
+    parser = get_parser()
+    opts = parser.parse_args(argv)
+
+    if opts.manual:
+        check_consistency_with_manual(opts.manual)
+
+    # Remove the initial 'env' command.
+    parse_diagnostics(opts.command.split()[1:])
+
+    if errors:
+        sys.exit(1)
+
+if __name__ == '__main__':
+    main(sys.argv[1:])
diff --git a/manual/install.texi b/manual/install.texi
index e8f36d5726..4c4e76fedf 100644
--- a/manual/install.texi
+++ b/manual/install.texi
@@ -632,6 +632,13 @@ GDB, and should be compatible with the Python version in your system.
 As of release time PExpect 4.8.0 is the newest verified to work to test
 the pretty printers.
 
+@item
+The Python @code{abnf} module.
+
+This module is optional used to verify some ABNF grammars in the manual.
+Version 2.2.0 has been confirmed to work as expected. A missing
+@code{abnf} does not reduce test coverage of the library itself.
+
 @item
 GDB 7.8 or later with support for Python 2.7/3.4 or later
 
-- 
2.41.0

^ permalink raw reply [flat|nested] 4+ messages in thread
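
The comments in the ABNF grammar describe the intended character classes: lowercase a-f only, letters and underscore, and printable characters other than the double quote and backslash. The sketch below is not part of the patch. It expands the hexadecimal ranges exactly as written in the grammar so they can be compared against those comments; `expand` is a hypothetical helper name.

```python
def expand(*ranges):
    """Return the characters covered by inclusive (low, high) code ranges."""
    return ''.join(chr(c) for low, high in ranges
                   for c in range(low, high + 1))

# Ranges copied from the grammar in the patch.
hexdig = expand((0x30, 0x39), (0x61, 0x6f))
alpha = expand((0x41, 0x5a), (0x61, 0x7a), (0x7f, 0x7f))
string_char = expand((0x20, 0x21), (0x23, 0x5c), (0x5d, 0x7e))

for name, chars in (('HEXDIG', hexdig), ('ALPHA', alpha),
                    ('string-char', string_char)):
    print('{:12} {!r}'.format(name, chars))
```

As written, the HEXDIG range extends to `o` (0x6f), ALPHA contains DEL (0x7f) but not the underscore (0x5f), and string-char includes the backslash (0x5c). If the comments state the intent, `%x61-66`, `%x5f` and `%x23-5b` would match them. That is a reading of the comments only; nothing in this thread confirms it. Since ALPHA-NUMERIC adds `"_"` explicitly, only a leading underscore in a label is affected by the ALPHA range.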
* Re: [PATCH v2 2/2] elf: Check that --list-diagnostics output has the expected syntax
  2023-08-23  7:04 ` [PATCH v2 2/2] elf: Check that --list-diagnostics output has the expected syntax Florian Weimer
@ 2023-08-23 13:53   ` Adhemerval Zanella Netto
  0 siblings, 0 replies; 4+ messages in thread
From: Adhemerval Zanella Netto @ 2023-08-23 13:53 UTC (permalink / raw)
  To: libc-alpha, Florian Weimer

On 23/08/23 04:04, Florian Weimer via Libc-alpha wrote:
> Parts of elf/tst-rtld-list-diagnostics.py have been copied from
> scripts/tst-ld-trace.py.
> 
> The abnf module is entirely optional and used to verify the
> ABNF grammar as included in the manual.

LGTM, some minor notes below.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

> ---
> v2: Clarify the optional nature of the abnf module. Fixes
>     from Adhemerval.
> 
>  INSTALL                          |   6 +
>  elf/Makefile                     |   9 +
>  elf/tst-rtld-list-diagnostics.py | 303 +++++++++++++++++++++++++++++++
>  manual/install.texi              |   7 +
>  4 files changed, 325 insertions(+)
>  create mode 100644 elf/tst-rtld-list-diagnostics.py
> 
> diff --git a/INSTALL b/INSTALL
> index 268acadd75..e5152a4ae7 100644
> --- a/INSTALL
> +++ b/INSTALL
> @@ -585,6 +585,12 @@ build the GNU C Library:
>       in your system. As of release time PExpect 4.8.0 is the newest
>       verified to work to test the pretty printers.
> 
> +   • The Python ‘abnf’ module.
> +
> +     This module is optional used to verify some ABNF grammars in the

[...] is optional *and* used [...].

> +     manual. Version 2.2.0 has been confirmed to work as expected. A
> +     missing ‘abnf’ does not reduce test coverage of the library itself.

I think it should be 'reduce the test'.

> + > • GDB 7.8 or later with support for Python 2.7/3.4 or later > > GDB itself needs to be configured with Python support in order to > diff --git a/elf/Makefile b/elf/Makefile > index c00e2ccfc5..9176cbf1e3 100644 > --- a/elf/Makefile > +++ b/elf/Makefile > @@ -1123,6 +1123,7 @@ tests-special += \ > $(objpfx)argv0test.out \ > $(objpfx)tst-pathopt.out \ > $(objpfx)tst-rtld-help.out \ > + $(objpfx)tst-rtld-list-diagnostics.out \ > $(objpfx)tst-rtld-load-self.out \ > $(objpfx)tst-rtld-preload.out \ > $(objpfx)tst-sprof-basic.out \ > @@ -2799,6 +2800,14 @@ $(objpfx)tst-ro-dynamic-mod.so: $(objpfx)tst-ro-dynamic-mod.os \ > -Wl,--script=tst-ro-dynamic-mod.map \ > $(objpfx)tst-ro-dynamic-mod.os > > +$(objpfx)tst-rtld-list-diagnostics.out: tst-rtld-list-diagnostics.py \ > + $(..)manual/dynlink.texi $(objpfx)$(rtld-installed-name) > + $(PYTHON) tst-rtld-list-diagnostics.py \ > + --manual=$(..)manual/dynlink.texi \ > + "$(test-wrapper-env) $(objpfx)$(rtld-installed-name) --list-diagnostics" \ > + > $@; \ > + $(evaluate-test) > + > $(objpfx)tst-rtld-run-static.out: $(objpfx)/ldconfig > > $(objpfx)tst-dl_find_object.out: \ > diff --git a/elf/tst-rtld-list-diagnostics.py b/elf/tst-rtld-list-diagnostics.py > new file mode 100644 > index 0000000000..e9ba9e1798 > --- /dev/null > +++ b/elf/tst-rtld-list-diagnostics.py > @@ -0,0 +1,303 @@ > +#!/usr/bin/python3 > +# Test that the ld.so --list-diagnostics output has the expected syntax. > +# Copyright (C) 2022-2023 Free Software Foundation, Inc. > +# Copyright The GNU Toolchain Authors. > +# This file is part of the GNU C Library. > +# > +# The GNU C Library is free software; you can redistribute it and/or > +# modify it under the terms of the GNU Lesser General Public > +# License as published by the Free Software Foundation; either > +# version 2.1 of the License, or (at your option) any later version. 
> +# > +# The GNU C Library is distributed in the hope that it will be useful, > +# but WITHOUT ANY WARRANTY; without even the implied warranty of > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > +# Lesser General Public License for more details. > +# > +# You should have received a copy of the GNU Lesser General Public > +# License along with the GNU C Library; if not, see > +# <https://www.gnu.org/licenses/>. > + > +import argparse > +import collections > +import subprocess > +import sys > + > +try: > + subprocess.run > +except: > + class _CompletedProcess: > + def __init__(self, args, returncode, stdout=None, stderr=None): > + self.args = args > + self.returncode = returncode > + self.stdout = stdout > + self.stderr = stderr > + > + def _run(*popenargs, input=None, timeout=None, check=False, **kwargs): > + assert(timeout is None) > + with subprocess.Popen(*popenargs, **kwargs) as process: > + try: > + stdout, stderr = process.communicate(input) > + except: > + process.kill() > + process.wait() > + raise > + returncode = process.poll() > + if check and returncode: > + raise subprocess.CalledProcessError(returncode, popenargs) > + return _CompletedProcess(popenargs, returncode, stdout, stderr) > + > + subprocess.run = _run As side note, I think we should move this snippet to a different module since it is now replicated in two other files (build-many-glibcs.py and tst-ld-trace.py). > + > +# Number of errors encountered. Zero means no errors (test passes). > +errors = 0 > + > +def parse_line(line): > + """Parse a line of --list-diagnostics output. > + > + This function returns a pair (SUBSCRIPTS, VALUE). VALUE is either > + a byte string or an integer. SUBSCRIPT is a tuple of (LABEL, > + INDEX) pairs, where LABEL is a field identifier (a string), and > + INDEX is an integer or None, to indicate that this field is not > + indexed. > + > + """ > + > + # Extract the list of subscripts before the value. 
> + idx = 0 > + subscripts = [] > + while line[idx] != '=': > + start_idx = idx > + > + # Extract the label. > + while line[idx] not in '[.=': > + idx += 1 > + label = line[start_idx:idx] > + > + if line[idx] == '[': > + # Subscript with a 0x index. > + assert label > + close_bracket = line.index(']', idx) > + index = line[idx + 1:close_bracket] > + assert index.startswith('0x') > + index = int(index, 0) > + subscripts.append((label, index)) > + idx = close_bracket + 1 > + else: # '.' or '='. > + if label: > + subscripts.append((label, None)) > + if line[idx] == '.': > + idx += 1 > + > + # The value is either a string or a 0x number. > + value = line[idx + 1:] > + if value[0] == '"': > + # Decode the escaped string into a byte string. > + assert value[-1] == '"' > + idx = 1 > + result = [] > + while True: > + ch = value[idx] > + if ch == '\\': > + if value[idx + 1] in '"\\': > + result.append(ord(value[idx + 1])) > + idx += 2 > + else: > + result.append(int(value[idx + 1:idx + 4], 8)) > + idx += 4 > + elif ch == '"': > + assert idx == len(value) - 1 > + break > + else: > + result.append(ord(value[idx])) > + idx += 1 > + value = bytes(result) > + else: > + # Convert the value into an integer. > + assert value.startswith('0x') > + value = int(value, 0) > + return (tuple(subscripts), value) > + > +assert parse_line('a.b[0x1]=0x2') == ((('a', None), ('b', 1)), 2) > +assert parse_line(r'b[0x3]="four\040\"\\"') == ((('b', 3),), b'four \"\\') > + > +# ABNF for a line of --list-diagnostics output. > +diagnostics_abnf = r""" > +HEXDIG = %x30-39 / %x61-6f ; lowercase a-f only > +ALPHA = %x41-5a / %x61-7a / %x7f ; letters and underscore > +ALPHA-NUMERIC = ALPHA / %x30-39 / "_" > +DQUOTE = %x22 ; " > + > +; Numbers are always hexadecimal and use a 0x prefix. > +hex-value-prefix = %x30 %x78 > +hex-value = hex-value-prefix 1*HEXDIG > + > +; Strings use octal escape sequences and \\, \". 
> +string-char = %x20-21 / %x23-5c / %x5d-7e ; printable but not "\ > +string-quoted-octal = %x30-33 2*2%x30-37 > +string-quoted = "\" ("\" / DQUOTE / string-quoted-octal) > +string-value = DQUOTE *(string-char / string-quoted) DQUOTE > + > +value = hex-value / string-value > + > +label = ALPHA *ALPHA-NUMERIC > +index = "[" hex-value "]" > +subscript = label [index] > + > +line = subscript *("." subscript) "=" value > +""" > + > +def check_consistency_with_manual(manual_path): > + """Verify that the code fragments in the manual match this script. > + > + The code fragments are duplicated to clarify the dual license. > + """ > + > + global errors > + > + def extract_lines(path, start_line, end_line, skip_lines=()): > + result = [] > + with open(path) as inp: > + capturing = False > + for line in inp: > + if line.strip() == start_line: > + capturing = True > + elif not capturing or line.strip() in skip_lines: > + continue > + elif line.strip() == end_line: > + capturing = False > + else: > + result.append(line) > + if not result: > + raise ValueError('{!r} not found in {!r}'.format(start_line, path)) > + if capturing: > + raise ValueError('{!r} not found in {!r}'.format(end_line, path)) > + return result > + > + def check(name, manual, script): > + global errors > + > + if manual == script: > + return > + print('error: {} fragment in manual is different'.format(name)) > + import difflib > + sys.stdout.writelines(difflib.unified_diff( > + manual, script, fromfile='manual', tofile='script')) > + errors += 1 > + > + manual_abnf = extract_lines(manual_path, > + '@c ABNF-START', '@end smallexample', > + skip_lines=('@smallexample',)) > + check('ABNF', diagnostics_abnf.splitlines(keepends=True)[1:], manual_abnf) > + > +# If the abnf module can be imported, run an additional check that the > +# 'line' production from the ABNF grammar matches --list-diagnostics > +# output lines. 
> +try: > + import abnf > +except ImportError: > + abnf = None > + print('info: skipping ABNF validation because the abnf module is missing') > + > +if abnf is not None: > + class Grammar(abnf.Rule): > + pass > + > + Grammar.load_grammar(diagnostics_abnf) > + > + def parse_abnf(line): > + global errors > + > + # Just verify that the line parses. > + try: > + Grammar('line').parse_all(line) > + except abnf.ParseError: > + print('error: ABNF parse error:', repr(line)) > + errors += 1 > +else: > + def parse_abnf(line): > + pass > + > + > +def parse_diagnostics(cmd): > + global errors > + diag_out = subprocess.run(cmd, stdout=subprocess.PIPE, check=True, > + universal_newlines=True).stdout > + if diag_out[-1] != '\n': > + print('error: ld.so output does not end in newline') > + errors += 1 > + > + PathType = collections.namedtuple('PathType', > + 'has_index value_type original_line') > + # Mapping tuples of labels to PathType values. > + path_types = {} > + > + seen_subscripts = {} > + > + for line in diag_out.splitlines(): > + parse_abnf(line) > + subscripts, value = parse_line(line) > + > + # Check for duplicates. > + if subscripts in seen_subscripts: > + print('error: duplicate value assignment:', repr(line)) > + print(' previous line:,', repr(seen_subscripts[line])) > + errors += 1 > + else: > + seen_subscripts[subscripts] = line > + > + # Compare types against the previously seen labels. 
> + labels = tuple([label for label, index in subscripts]) > + has_index = tuple([index is not None for label, index in subscripts]) > + value_type = type(value) > + if labels in path_types: > + previous_type = path_types[labels] > + if has_index != previous_type.has_index: > + print('error: line has mismatch of indexing:', repr(line)) > + print(' index types:', has_index) > + print(' previous: ', previous_type.has_index) > + print(' previous line:', repr(previous_type.original_line)) > + errors += 1 > + if value_type != previous_type.value_type: > + print('error: line has mismatch of value type:', repr(line)) > + print(' value type:', value_type.__name__) > + print(' previous: ', previous_type.value_type.__name__) > + print(' previous line:', repr(previous_type.original_line)) > + errors += 1 > + else: > + path_types[labels] = PathType(has_index, value_type, line) > + > + # Check that this line does not add indexing to a previous value. > + for idx in range(1, len(subscripts) - 1): > + if subscripts[:idx] in path_types: > + print('error: line assigns to atomic value:', repr(line)) > + print(' previous line:', repr(previous_type.original_line)) > + errors += 1 > + > + if errors: > + sys.exit(1) > + > +def get_parser(): > + parser = argparse.ArgumentParser(description=__doc__) > + parser.add_argument('--manual', > + help='path to .texi file for consistency checks') > + parser.add_argument('command', > + help='comand to run') > + return parser > + > + > +def main(argv): > + parser = get_parser() > + opts = parser.parse_args(argv) > + > + if opts.manual: > + check_consistency_with_manual(opts.manual) > + > + # Remove the initial 'env' command. 
> +    parse_diagnostics(opts.command.split()[1:])
> +
> +    if errors:
> +        sys.exit(1)
> +
> +if __name__ == '__main__':
> +    main(sys.argv[1:])
> diff --git a/manual/install.texi b/manual/install.texi
> index e8f36d5726..4c4e76fedf 100644
> --- a/manual/install.texi
> +++ b/manual/install.texi
> @@ -632,6 +632,13 @@ GDB, and should be compatible with the Python version in your system.
>  As of release time PExpect 4.8.0 is the newest verified to work to test
>  the pretty printers.
>
> +@item
> +The Python @code{abnf} module.
> +
> +This module is optional used to verify some ABNF grammars in the manual.
> +Version 2.2.0 has been confirmed to work as expected. A missing
> +@code{abnf} does not reduce test coverage of the library itself.

Same issues as the INSTALL file.

> +
>  @item
>  GDB 7.8 or later with support for Python 2.7/3.4 or later
>
* Re: [PATCH v2 1/2] manual: Document ld.so --list-diagnostics output
  2023-08-23 7:00 [PATCH v2 1/2] manual: Document ld.so --list-diagnostics output Florian Weimer
  2023-08-23 7:04 ` [PATCH v2 2/2] elf: Check that --list-diagnostics output has the expected syntax Florian Weimer
@ 2023-08-23 13:47 ` Adhemerval Zanella Netto
  1 sibling, 0 replies; 4+ messages in thread
From: Adhemerval Zanella Netto @ 2023-08-23 13:47 UTC
To: Florian Weimer, libc-alpha

LGTM, thanks.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

On 23/08/23 04:00, Florian Weimer via Libc-alpha wrote:
> ---
> v2: Drop Python code. @cindex/@item fix as suggested by Arsen.
>
>  manual/dynlink.texi | 207 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 207 insertions(+)
>
> diff --git a/manual/dynlink.texi b/manual/dynlink.texi
> index 45bf5a5b55..df41c56bfc 100644
> --- a/manual/dynlink.texi
> +++ b/manual/dynlink.texi
> @@ -13,9 +13,216 @@ as plugins) later at run time.
>  Dynamic linkers are sometimes called @dfn{dynamic loaders}.
>
>  @menu
> +* Dynamic Linker Invocation:: Explicit invocation of the dynamic linker.
>  * Dynamic Linker Introspection:: Interfaces for querying mapping information.
>  @end menu
>
> +@node Dynamic Linker Invocation
> +
> +@cindex program interpreter
> +When a dynamically linked program starts, the operating system
> +automatically loads the dynamic linker along with the program.
> +@Theglibc{} also supports invoking the dynamic linker explicitly to
> +launch a program.
> +This command uses the implied dynamic linker
> +(also sometimes called the @dfn{program interpreter}):
> +
> +@smallexample
> +sh -c 'echo "Hello, world!"'
> +@end smallexample
> +
> +This command specifies the dynamic linker explicitly:
> +
> +@smallexample
> +ld.so /bin/sh -c 'echo "Hello, world!"'
> +@end smallexample
> +
> +Note that @command{ld.so} does not search the @env{PATH} environment
> +variable, so the full file name of the executable needs to be specified.
> +
> +The @command{ld.so} program supports various options. Options start
> +with @samp{--} and need to come before the program that is being launched.
> +Some of the supported options are listed below.
> +
> +@table @code
> +@item --list-diagnostics
> +Print system diagnostic information in a machine-readable format.
> +@xref{Dynamic Linker Diagnostics}.
> +@end table
> +
> +@menu
> +* Dynamic Linker Diagnostics:: Obtaining system diagnostic information.
> +@end menu
> +
> +@node Dynamic Linker Diagnostics
> +@section Dynamic Linker Diagnostics
> +@cindex diagnostics (dynamic linker)
> +
> +The @samp{ld.so --list-diagnostics} command produces machine-readable
> +diagnostics output. This output contains system data that affects the
> +behavior of @theglibc{}, and potentially application behavior as well.
> +
> +The exact set of diagnostic items can change between releases of
> +@theglibc{}. The output format itself is not expected to change
> +radically.
> +
> +The following table shows some example lines that can be written by the
> +diagnostics command.
> +
> +@table @code
> +@item dl_pagesize=0x1000
> +The system page size is 4096 bytes.
> +
> +@item env[0x14]="LANG=en_US.UTF-8"
> +This item indicates that the 21st environment variable at process
> +startup contains a setting for @code{LANG}.
> +
> +@item env_filtered[0x22]="DISPLAY"
> +The 35th environment variable is @code{DISPLAY}.
> +Its value is not
> +included in the output for privacy reasons because it is not recognized
> +as harmless by the diagnostics code.
> +
> +@item path.prefix="/usr"
> +This means that @theglibc{} was configured with @code{--prefix=/usr}.
> +
> +@item path.system_dirs[0x0]="/lib64/"
> +@itemx path.system_dirs[0x1]="/usr/lib64/"
> +The built-in dynamic linker search path contains two directories,
> +@code{/lib64} and @code{/usr/lib64}.
> +@end table
> +
> +@subsection Dynamic Linker Diagnostics Output Format
> +
> +As seen above, diagnostic lines assign values (integers or strings) to a
> +sequence of labeled subscripts, separated by @samp{.}. Some subscripts
> +have integer indices associated with them. The subscript indices are
> +not necessarily contiguous or small, so an associative array should be
> +used to store them. Currently, all integers fit into the 64-bit
> +unsigned integer range. Every access path to a value has a fixed type
> +(string or integer) independent of subscript index values. Likewise,
> +whether a subscript is indexed does not depend on previous indices (but
> +may depend on previous subscript labels).
> +
> +A syntax description in ABNF (RFC 5234) follows. Note that
> +@code{%x30-39} denotes the range of decimal digits. Diagnostic output
> +lines are expected to match the @code{line} production.
> +
> +@c ABNF-START
> +@smallexample
> +HEXDIG = %x30-39 / %x61-66 ; lowercase a-f only
> +ALPHA = %x41-5a / %x61-7a / %x5f ; letters and underscore
> +ALPHA-NUMERIC = ALPHA / %x30-39 / "_"
> +DQUOTE = %x22 ; "
> +
> +; Numbers are always hexadecimal and use a 0x prefix.
> +hex-value-prefix = %x30 %x78
> +hex-value = hex-value-prefix 1*HEXDIG
> +
> +; Strings use octal escape sequences and \\, \".
> +string-char = %x20-21 / %x23-5b / %x5d-7e ; printable but not "\
> +string-quoted-octal = %x30-33 2*2%x30-37
> +string-quoted = "\" ("\" / DQUOTE / string-quoted-octal)
> +string-value = DQUOTE *(string-char / string-quoted) DQUOTE
> +
> +value = hex-value / string-value
> +
> +label = ALPHA *ALPHA-NUMERIC
> +index = "[" hex-value "]"
> +subscript = label [index]
> +
> +line = subscript *("." subscript) "=" value
> +@end smallexample
> +
> +@subsection Dynamic Linker Diagnostics Values
> +
> +As mentioned above, the set of diagnostics may change between
> +@theglibc{} releases. Nevertheless, the following table documents a few
> +common diagnostic items. All numbers are in hexadecimal, with a
> +@samp{0x} prefix.
> +
> +@table @code
> +@item dl_dst_lib=@var{string}
> +The @code{$LIB} dynamic string token expands to @var{string}.
> +
> +@cindex HWCAP (diagnostics)
> +@item dl_hwcap=@var{integer}
> +@itemx dl_hwcap2=@var{integer}
> +The HWCAP and HWCAP2 values, as returned by @code{getauxval}, and as
> +used in other places depending on the architecture.
> +
> +@cindex page size (diagnostics)
> +@item dl_pagesize=@var{integer}
> +The system page size is @var{integer} bytes.
> +
> +@item dl_platform=@var{string}
> +The @code{$PLATFORM} dynamic string token expands to @var{string}.
> +
> +@item dso.libc=@var{string}
> +This is the soname of the shared @code{libc} object that is part of
> +@theglibc{}. On most architectures, this is @code{libc.so.6}.
> +
> +@item env[@var{index}]=@var{string}
> +@itemx env_filtered[@var{index}]=@var{string}
> +An environment variable from the process environment. The integer
> +@var{index} is the array index in the environment array. Variables
> +under @code{env} include the variable value after the @samp{=} (assuming
> +that it was present); variables under @code{env_filtered} do not.
> +
> +@item path.prefix=@var{string}
> +This indicates that @theglibc{} was configured using
> +@samp{--prefix=@var{string}}.
> +
> +@item path.sysconfdir=@var{string}
> +@Theglibc{} was configured (perhaps implicitly) with
> +@samp{--sysconfdir=@var{string}} (typically @code{/etc}).
> +
> +@item path.system_dirs[@var{index}]=@var{string}
> +These items list the elements of the built-in array that describes the
> +default library search path. The value @var{string} is a directory file
> +name with a trailing @samp{/}.
> +
> +@item path.rtld=@var{string}
> +This string indicates the application binary interface (ABI) file name
> +of the run-time dynamic linker.
> +
> +@item version.release="stable"
> +@itemx version.release="development"
> +The value @code{"stable"} indicates that this build of @theglibc{} is
> +from a release branch. Builds labeled as @code{"development"} are
> +unreleased development versions.
> +
> +@cindex version (diagnostics)
> +@item version.version="@var{major}.@var{minor}"
> +@itemx version.version="@var{major}.@var{minor}.9000"
> +@Theglibc{} version. Development versions end in @samp{.9000}.
> +
> +@cindex auxiliary vector (diagnostics)
> +@item auxv[@var{index}].a_type=@var{type}
> +@itemx auxv[@var{index}].a_val=@var{integer}
> +@itemx auxv[@var{index}].a_val_string=@var{string}
> +An entry in the auxiliary vector (specific to Linux). The values
> +@var{type} (an integer) and @var{integer} correspond to the members of
> +@code{struct auxv}. If the value is a string, @code{a_val_string} is
> +used instead of @code{a_val}, so that values have consistent types.
> +
> +The @code{AT_HWCAP} and @code{AT_HWCAP2} values in this output do not
> +reflect adjustment by @theglibc{}.
> +
> +@item uname.sysname=@var{string}
> +@itemx uname.nodename=@var{string}
> +@itemx uname.release=@var{string}
> +@itemx uname.version=@var{string}
> +@itemx uname.machine=@var{string}
> +@itemx uname.domain=@var{string}
> +These Linux-specific items show the values of @code{struct utsname}, as
> +reported by the @code{uname} function. @xref{Platform Type}.
> +
> +@cindex CPUID (diagnostics)
> +@item x86.cpu_features.@dots{}
> +These items are specific to the i386 and x86-64 architectures. They
> +reflect supported CPU features and information on cache geometry, mostly
> +collected using the @code{CPUID} instruction.
> +@end table
> +
>  @node Dynamic Linker Introspection
>  @section Dynamic Linker Introspection
>
>
> base-commit: 65a5112ede9ba3e37e165cf6c9c432f46b903936
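
For illustration only (not part of the patch): a minimal Python sketch of how a consumer might parse one line of the diagnostics output described in the quoted "Output Format" subsection. It follows the comments in the ABNF grammar: hexadecimal digits are 0-9 and lowercase a-f, labels consist of letters, digits and underscore, and strings escape only backslash, double quote and three-digit octal sequences. The names parse_diagnostics_line and decode_string are hypothetical helpers introduced here; they are not the parse_line function used by the test script in patch 2/2.

```python
import re

# Regular expressions mirroring the grammar productions (illustrative only).
_LABEL = r'[A-Za-z_][A-Za-z0-9_]*'
_HEX = r'0x[0-9a-f]+'
_SUBSCRIPT_RE = re.compile(r'(%s)(?:\[(%s)\])?' % (_LABEL, _HEX))
_ESCAPE_RE = re.compile(r'\\([0-3][0-7]{2}|["\\])')
_STRING_RE = re.compile(
    r'"((?:[\x20-\x21\x23-\x5b\x5d-\x7e]|\\(?:[0-3][0-7]{2}|["\\]))*)"')


def decode_string(body):
    """Undo the backslash, quote and octal escapes; return bytes."""
    def replace(match):
        esc = match.group(1)
        if esc in ('"', '\\'):
            return esc
        return chr(int(esc, 8))
    # Octal escapes cover 0..255, so latin-1 maps each char to one byte.
    return _ESCAPE_RE.sub(replace, body).encode('latin-1')


def parse_diagnostics_line(line):
    """Return (subscripts, value) for one diagnostics line.

    subscripts is a tuple of (label, index) pairs, where index is an
    int or None.  value is an int or a bytes object.
    """
    # Labels and indices never contain '=', so split at the first one.
    path, sep, raw_value = line.partition('=')
    if not sep:
        raise ValueError('missing "=" in %r' % line)

    subscripts = []
    for part in path.split('.'):
        match = _SUBSCRIPT_RE.fullmatch(part)
        if match is None:
            raise ValueError('bad subscript %r in %r' % (part, line))
        label, index = match.groups()
        subscripts.append(
            (label, None if index is None else int(index, 16)))

    if re.fullmatch(_HEX, raw_value):
        value = int(raw_value, 16)
    else:
        match = _STRING_RE.fullmatch(raw_value)
        if match is None:
            raise ValueError('bad value %r in %r' % (raw_value, line))
        value = decode_string(match.group(1))
    return tuple(subscripts), value


if __name__ == '__main__':
    for example in ('dl_pagesize=0x1000',
                    'env[0x14]="LANG=en_US.UTF-8"',
                    'path.system_dirs[0x1]="/usr/lib64/"'):
        print(parse_diagnostics_line(example))
```

String values are returned as bytes because the octal escapes can encode arbitrary byte values. Since indices are described as not necessarily contiguous or small, a caller would probably store the results in a dictionary keyed by the subscripts tuple rather than in a list.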
end of thread, other threads:[~2023-08-23 13:53 UTC | newest]

Thread overview: 4+ messages
2023-08-23 7:00 [PATCH v2 1/2] manual: Document ld.so --list-diagnostics output Florian Weimer
2023-08-23 7:04 ` [PATCH v2 2/2] elf: Check that --list-diagnostics output has the expected syntax Florian Weimer
2023-08-23 13:53   ` Adhemerval Zanella Netto
2023-08-23 13:47 ` [PATCH v2 1/2] manual: Document ld.so --list-diagnostics output Adhemerval Zanella Netto