* [PATCH v2 1/2] manual: Document ld.so --list-diagnostics output
@ 2023-08-23 7:00 Florian Weimer
2023-08-23 7:04 ` [PATCH v2 2/2] elf: Check that --list-diagnostics output has the expected syntax Florian Weimer
2023-08-23 13:47 ` [PATCH v2 1/2] manual: Document ld.so --list-diagnostics output Adhemerval Zanella Netto
0 siblings, 2 replies; 4+ messages in thread
From: Florian Weimer @ 2023-08-23 7:00 UTC
To: libc-alpha
---
v2: Drop Python code. @cindex/@item fix as suggested by Arsen.
manual/dynlink.texi | 207 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 207 insertions(+)
diff --git a/manual/dynlink.texi b/manual/dynlink.texi
index 45bf5a5b55..df41c56bfc 100644
--- a/manual/dynlink.texi
+++ b/manual/dynlink.texi
@@ -13,9 +13,216 @@ as plugins) later at run time.
Dynamic linkers are sometimes called @dfn{dynamic loaders}.
@menu
+* Dynamic Linker Invocation:: Explicit invocation of the dynamic linker.
* Dynamic Linker Introspection:: Interfaces for querying mapping information.
@end menu
+@node Dynamic Linker Invocation
+
+@cindex program interpreter
+When a dynamically linked program starts, the operating system
+automatically loads the dynamic linker along with the program.
+@Theglibc{} also supports invoking the dynamic linker explicitly to
+launch a program. This command uses the implied dynamic linker
+(also sometimes called the @dfn{program interpreter}):
+
+@smallexample
+sh -c 'echo "Hello, world!"'
+@end smallexample
+
+This command specifies the dynamic linker explicitly:
+
+@smallexample
+ld.so /bin/sh -c 'echo "Hello, world!"'
+@end smallexample
+
+Note that @command{ld.so} does not search the @env{PATH} environment
+variable, so the full file name of the executable needs to be specified.
+
+The @command{ld.so} program supports various options. Options start with
+@samp{--} and need to come before the program that is being launched.
+Some of the supported options are listed below.
+
+@table @code
+@item --list-diagnostics
+Print system diagnostic information in a machine-readable format.
+@xref{Dynamic Linker Diagnostics}.
+@end table
+
+@menu
+* Dynamic Linker Diagnostics:: Obtaining system diagnostic information.
+@end menu
+
+@node Dynamic Linker Diagnostics
+@section Dynamic Linker Diagnostics
+@cindex diagnostics (dynamic linker)
+
+The @samp{ld.so --list-diagnostics} command produces machine-readable
+diagnostics output. This output contains system data that affects the
+behavior of @theglibc{}, and potentially application behavior as well.
+
+The exact set of diagnostic items can change between releases of
+@theglibc{}. The output format itself is not expected to change
+radically.
+
+The following table shows some example lines that can be written by the
+diagnostics command.
+
+@table @code
+@item dl_pagesize=0x1000
+The system page size is 4096 bytes.
+
+@item env[0x14]="LANG=en_US.UTF-8"
+This item indicates that the 21st environment variable at process
+startup contains a setting for @code{LANG}.
+
+@item env_filtered[0x22]="DISPLAY"
+The 35th environment variable is @code{DISPLAY}. Its value is not
+included in the output for privacy reasons because it is not recognized
+as harmless by the diagnostics code.
+
+@item path.prefix="/usr"
+This means that @theglibc{} was configured with @code{--prefix=/usr}.
+
+@item path.system_dirs[0x0]="/lib64/"
+@itemx path.system_dirs[0x1]="/usr/lib64/"
+The built-in dynamic linker search path contains two directories,
+@code{/lib64} and @code{/usr/lib64}.
+@end table
+
+@subsection Dynamic Linker Diagnostics Output Format
+
+As seen above, diagnostic lines assign values (integers or strings) to a
+sequence of labeled subscripts, separated by @samp{.}. Some subscripts
+have integer indices associated with them. The subscript indices are
+not necessarily contiguous or small, so an associative array should be
+used to store them. Currently, all integers fit into the 64-bit
+unsigned integer range. Every access path to a value has a fixed type
+(string or integer) independent of subscript index values. Likewise,
+whether a subscript is indexed does not depend on previous indices (but
+may depend on previous subscript labels).
+
+A syntax description in ABNF (RFC 5234) follows. Note that
+@code{%x30-39} denotes the range of decimal digits. Diagnostic output
+lines are expected to match the @code{line} production.
+
+@c ABNF-START
+@smallexample
+HEXDIG = %x30-39 / %x61-66 ; lowercase a-f only
+ALPHA = %x41-5a / %x61-7a / %x5f ; letters and underscore
+ALPHA-NUMERIC = ALPHA / %x30-39 / "_"
+DQUOTE = %x22 ; "
+
+; Numbers are always hexadecimal and use a 0x prefix.
+hex-value-prefix = %x30 %x78
+hex-value = hex-value-prefix 1*HEXDIG
+
+; Strings use octal escape sequences and \\, \".
+string-char = %x20-21 / %x23-5b / %x5d-7e ; printable but not "\
+string-quoted-octal = %x30-33 2*2%x30-37
+string-quoted = "\" ("\" / DQUOTE / string-quoted-octal)
+string-value = DQUOTE *(string-char / string-quoted) DQUOTE
+
+value = hex-value / string-value
+
+label = ALPHA *ALPHA-NUMERIC
+index = "[" hex-value "]"
+subscript = label [index]
+
+line = subscript *("." subscript) "=" value
+@end smallexample
+
+@subsection Dynamic Linker Diagnostics Values
+
+As mentioned above, the set of diagnostics may change between
+@theglibc{} releases. Nevertheless, the following table documents a few
+common diagnostic items. All numbers are in hexadecimal, with a
+@samp{0x} prefix.
+
+@table @code
+@item dl_dst_lib=@var{string}
+The @code{$LIB} dynamic string token expands to @var{string}.
+
+@cindex HWCAP (diagnostics)
+@item dl_hwcap=@var{integer}
+@itemx dl_hwcap2=@var{integer}
+The HWCAP and HWCAP2 values, as returned for @code{getauxval}, and as
+used in other places depending on the architecture.
+
+@cindex page size (diagnostics)
+@item dl_pagesize=@var{integer}
+The system page size is @var{integer} bytes.
+
+@item dl_platform=@var{string}
+The @code{$PLATFORM} dynamic string token expands to @var{string}.
+
+@item dso.libc=@var{string}
+This is the soname of the shared @code{libc} object that is part of
+@theglibc{}. On most architectures, this is @code{libc.so.6}.
+
+@item env[@var{index}]=@var{string}
+@itemx env_filtered[@var{index}]=@var{string}
+An environment variable from the process environment. The integer
+@var{index} is the array index in the environment array. Variables
+under @code{env} include the variable value after the @samp{=} (assuming
+that it was present), variables under @code{env_filtered} do not.
+
+@item path.prefix=@var{string}
+This indicates that @theglibc{} was configured using
+@samp{--prefix=@var{string}}.
+
+@item path.sysconfdir=@var{string}
+@Theglibc{} was configured (perhaps implicitly) with
+@samp{--sysconfdir=@var{string}} (typically @code{/etc}).
+
+@item path.system_dirs[@var{index}]=@var{string}
+These items list the elements of the built-in array that describes the
+default library search path. The value @var{string} is a directory file
+name with a trailing @samp{/}.
+
+@item path.rtld=@var{string}
+This string indicates the application binary interface (ABI) file name
+of the run-time dynamic linker.
+
+@item version.release="stable"
+@itemx version.release="development"
+The value @code{"stable"} indicates that this build of @theglibc{} is
+from a release branch. Releases labeled as @code{"development"} are
+unreleased development versions.
+
+@cindex version (diagnostics)
+@item version.version="@var{major}.@var{minor}"
+@itemx version.version="@var{major}.@var{minor}.9000"
+@Theglibc{} version. Development releases end in @samp{.9000}.
+
+@cindex auxiliary vector (diagnostics)
+@item auxv[@var{index}].a_type=@var{type}
+@itemx auxv[@var{index}].a_val=@var{integer}
+@itemx auxv[@var{index}].a_val_string=@var{string}
+An entry in the auxiliary vector (specific to Linux). The values
+@var{type} (an integer) and @var{integer} correspond to the members of
+@code{struct auxv}. If the value is a string, @code{a_val_string} is
+used instead of @code{a_val}, so that values have consistent types.
+
+The @code{AT_HWCAP} and @code{AT_HWCAP2} values in this output do not
+reflect adjustment by @theglibc{}.
+
+@item uname.sysname=@var{string}
+@itemx uname.nodename=@var{string}
+@itemx uname.release=@var{string}
+@itemx uname.version=@var{string}
+@itemx uname.machine=@var{string}
+@itemx uname.domain=@var{string}
+These Linux-specific items show the values of @code{struct utsname}, as
+reported by the @code{uname} function. @xref{Platform Type}.
+
+@cindex CPUID (diagnostics)
+@item x86.cpu_features.@dots{}
+These items are specific to the i386 and x86-64 architectures. They
+reflect supported CPU features and information on cache geometry, mostly
+collected using the @code{CPUID} instruction.
+@end table
+
@node Dynamic Linker Introspection
@section Dynamic Linker Introspection
base-commit: 65a5112ede9ba3e37e165cf6c9c432f46b903936
--
2.41.0
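Editorial sketch (not part of the patch): the ABNF "line" production is
regular, so a consumer that does not want a full ABNF parser could
approximate it with a regular expression. The sketch below assumes that
HEXDIG is meant to cover 0-9 and lowercase a-f only, that labels consist
of ASCII letters, digits and underscores (as the grammar comments say),
and that a backslash inside a string introduces only \\, \" or a
three-digit octal escape. The names LINE_RE and matches_line are made up
for this illustration.

```python
import re

# Hypothetical helper names; the patterns follow the ABNF comments.
HEX_VALUE = r'0x[0-9a-f]+'
LABEL = r'[A-Za-z_][A-Za-z0-9_]*'
SUBSCRIPT = LABEL + r'(?:\[' + HEX_VALUE + r'\])?'
# Printable ASCII except '"' and '\', or \\, \", or \ooo (octal 000-377).
STRING_VALUE = (r'"(?:[\x20\x21\x23-\x5b\x5d-\x7e]'
                r'|\\(?:[\\"]|[0-3][0-7]{2}))*"')
LINE_RE = re.compile(
    SUBSCRIPT + r'(?:\.' + SUBSCRIPT + r')*='
    + r'(?:' + HEX_VALUE + r'|' + STRING_VALUE + r')')


def matches_line(line):
    """Return True if LINE matches this sketch of the 'line' production."""
    return LINE_RE.fullmatch(line) is not None


if __name__ == '__main__':
    for sample in ('dl_pagesize=0x1000',
                   'env[0x14]="LANG=en_US.UTF-8"',
                   'path.system_dirs[0x0]="/lib64/"',
                   'dl_pagesize=4096'):
        print(sample, matches_line(sample))
```

The regular expression only checks syntax. It does not decode escapes or
verify the type-consistency rules described in the output format section.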
* [PATCH v2 2/2] elf: Check that --list-diagnostics output has the expected syntax
From: Florian Weimer @ 2023-08-23 7:04 UTC
To: libc-alpha
Parts of elf/tst-rtld-list-diagnostics.py have been copied from
scripts/tst-ld-trace.py.
The abnf module is entirely optional and used to verify the
ABNF grammar as included in the manual.
---
v2: Clarify the optional nature of the abnf module. Fixes
from Adhemerval.
INSTALL | 6 +
elf/Makefile | 9 +
elf/tst-rtld-list-diagnostics.py | 303 +++++++++++++++++++++++++++++++
manual/install.texi | 7 +
4 files changed, 325 insertions(+)
create mode 100644 elf/tst-rtld-list-diagnostics.py
diff --git a/INSTALL b/INSTALL
index 268acadd75..e5152a4ae7 100644
--- a/INSTALL
+++ b/INSTALL
@@ -585,6 +585,12 @@ build the GNU C Library:
in your system. As of release time PExpect 4.8.0 is the newest
verified to work to test the pretty printers.
+ • The Python ‘abnf’ module.
+
+ This module is optional and used to verify some ABNF grammars in the
+ manual. Version 2.2.0 has been confirmed to work as expected. A
+ missing ‘abnf’ does not reduce test coverage of the library itself.
+
• GDB 7.8 or later with support for Python 2.7/3.4 or later
GDB itself needs to be configured with Python support in order to
diff --git a/elf/Makefile b/elf/Makefile
index c00e2ccfc5..9176cbf1e3 100644
--- a/elf/Makefile
+++ b/elf/Makefile
@@ -1123,6 +1123,7 @@ tests-special += \
$(objpfx)argv0test.out \
$(objpfx)tst-pathopt.out \
$(objpfx)tst-rtld-help.out \
+ $(objpfx)tst-rtld-list-diagnostics.out \
$(objpfx)tst-rtld-load-self.out \
$(objpfx)tst-rtld-preload.out \
$(objpfx)tst-sprof-basic.out \
@@ -2799,6 +2800,14 @@ $(objpfx)tst-ro-dynamic-mod.so: $(objpfx)tst-ro-dynamic-mod.os \
-Wl,--script=tst-ro-dynamic-mod.map \
$(objpfx)tst-ro-dynamic-mod.os
+$(objpfx)tst-rtld-list-diagnostics.out: tst-rtld-list-diagnostics.py \
+  $(..)manual/dynlink.texi $(objpfx)$(rtld-installed-name)
+	$(PYTHON) tst-rtld-list-diagnostics.py \
+	    --manual=$(..)manual/dynlink.texi \
+	    "$(test-wrapper-env) $(objpfx)$(rtld-installed-name) --list-diagnostics" \
+	    > $@; \
+	$(evaluate-test)
+
$(objpfx)tst-rtld-run-static.out: $(objpfx)/ldconfig
$(objpfx)tst-dl_find_object.out: \
diff --git a/elf/tst-rtld-list-diagnostics.py b/elf/tst-rtld-list-diagnostics.py
new file mode 100644
index 0000000000..e9ba9e1798
--- /dev/null
+++ b/elf/tst-rtld-list-diagnostics.py
@@ -0,0 +1,303 @@
+#!/usr/bin/python3
+# Test that the ld.so --list-diagnostics output has the expected syntax.
+# Copyright (C) 2022-2023 Free Software Foundation, Inc.
+# Copyright The GNU Toolchain Authors.
+# This file is part of the GNU C Library.
+#
+# The GNU C Library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+#
+# The GNU C Library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with the GNU C Library; if not, see
+# <https://www.gnu.org/licenses/>.
+
+import argparse
+import collections
+import subprocess
+import sys
+
+try:
+    subprocess.run
+except:
+    class _CompletedProcess:
+        def __init__(self, args, returncode, stdout=None, stderr=None):
+            self.args = args
+            self.returncode = returncode
+            self.stdout = stdout
+            self.stderr = stderr
+
+    def _run(*popenargs, input=None, timeout=None, check=False, **kwargs):
+        assert(timeout is None)
+        with subprocess.Popen(*popenargs, **kwargs) as process:
+            try:
+                stdout, stderr = process.communicate(input)
+            except:
+                process.kill()
+                process.wait()
+                raise
+            returncode = process.poll()
+            if check and returncode:
+                raise subprocess.CalledProcessError(returncode, popenargs)
+        return _CompletedProcess(popenargs, returncode, stdout, stderr)
+
+    subprocess.run = _run
+
+# Number of errors encountered. Zero means no errors (test passes).
+errors = 0
+
+def parse_line(line):
+    """Parse a line of --list-diagnostics output.
+
+    This function returns a pair (SUBSCRIPTS, VALUE). VALUE is either
+    a byte string or an integer. SUBSCRIPTS is a tuple of (LABEL,
+    INDEX) pairs, where LABEL is a field identifier (a string), and
+    INDEX is an integer or None, to indicate that this field is not
+    indexed.
+
+    """
+
+    # Extract the list of subscripts before the value.
+    idx = 0
+    subscripts = []
+    while line[idx] != '=':
+        start_idx = idx
+
+        # Extract the label.
+        while line[idx] not in '[.=':
+            idx += 1
+        label = line[start_idx:idx]
+
+        if line[idx] == '[':
+            # Subscript with a 0x index.
+            assert label
+            close_bracket = line.index(']', idx)
+            index = line[idx + 1:close_bracket]
+            assert index.startswith('0x')
+            index = int(index, 0)
+            subscripts.append((label, index))
+            idx = close_bracket + 1
+        else: # '.' or '='.
+            if label:
+                subscripts.append((label, None))
+            if line[idx] == '.':
+                idx += 1
+
+    # The value is either a string or a 0x number.
+    value = line[idx + 1:]
+    if value[0] == '"':
+        # Decode the escaped string into a byte string.
+        assert value[-1] == '"'
+        idx = 1
+        result = []
+        while True:
+            ch = value[idx]
+            if ch == '\\':
+                if value[idx + 1] in '"\\':
+                    result.append(ord(value[idx + 1]))
+                    idx += 2
+                else:
+                    result.append(int(value[idx + 1:idx + 4], 8))
+                    idx += 4
+            elif ch == '"':
+                assert idx == len(value) - 1
+                break
+            else:
+                result.append(ord(value[idx]))
+                idx += 1
+        value = bytes(result)
+    else:
+        # Convert the value into an integer.
+        assert value.startswith('0x')
+        value = int(value, 0)
+    return (tuple(subscripts), value)
+
+assert parse_line('a.b[0x1]=0x2') == ((('a', None), ('b', 1)), 2)
+assert parse_line(r'b[0x3]="four\040\"\\"') == ((('b', 3),), b'four \"\\')
+
+# ABNF for a line of --list-diagnostics output.
+diagnostics_abnf = r"""
+HEXDIG = %x30-39 / %x61-66 ; lowercase a-f only
+ALPHA = %x41-5a / %x61-7a / %x5f ; letters and underscore
+ALPHA-NUMERIC = ALPHA / %x30-39 / "_"
+DQUOTE = %x22 ; "
+
+; Numbers are always hexadecimal and use a 0x prefix.
+hex-value-prefix = %x30 %x78
+hex-value = hex-value-prefix 1*HEXDIG
+
+; Strings use octal escape sequences and \\, \".
+string-char = %x20-21 / %x23-5b / %x5d-7e ; printable but not "\
+string-quoted-octal = %x30-33 2*2%x30-37
+string-quoted = "\" ("\" / DQUOTE / string-quoted-octal)
+string-value = DQUOTE *(string-char / string-quoted) DQUOTE
+
+value = hex-value / string-value
+
+label = ALPHA *ALPHA-NUMERIC
+index = "[" hex-value "]"
+subscript = label [index]
+
+line = subscript *("." subscript) "=" value
+"""
+
+def check_consistency_with_manual(manual_path):
+    """Verify that the code fragments in the manual match this script.
+
+    The code fragments are duplicated to clarify the dual license.
+    """
+
+    global errors
+
+    def extract_lines(path, start_line, end_line, skip_lines=()):
+        result = []
+        with open(path) as inp:
+            capturing = False
+            for line in inp:
+                if line.strip() == start_line:
+                    capturing = True
+                elif not capturing or line.strip() in skip_lines:
+                    continue
+                elif line.strip() == end_line:
+                    capturing = False
+                else:
+                    result.append(line)
+        if not result:
+            raise ValueError('{!r} not found in {!r}'.format(start_line, path))
+        if capturing:
+            raise ValueError('{!r} not found in {!r}'.format(end_line, path))
+        return result
+
+    def check(name, manual, script):
+        global errors
+
+        if manual == script:
+            return
+        print('error: {} fragment in manual is different'.format(name))
+        import difflib
+        sys.stdout.writelines(difflib.unified_diff(
+            manual, script, fromfile='manual', tofile='script'))
+        errors += 1
+
+    manual_abnf = extract_lines(manual_path,
+                                '@c ABNF-START', '@end smallexample',
+                                skip_lines=('@smallexample',))
+    check('ABNF', manual_abnf, diagnostics_abnf.splitlines(keepends=True)[1:])
+
+# If the abnf module can be imported, run an additional check that the
+# 'line' production from the ABNF grammar matches --list-diagnostics
+# output lines.
+try:
+    import abnf
+except ImportError:
+    abnf = None
+    print('info: skipping ABNF validation because the abnf module is missing')
+
+if abnf is not None:
+    class Grammar(abnf.Rule):
+        pass
+
+    Grammar.load_grammar(diagnostics_abnf)
+
+    def parse_abnf(line):
+        global errors
+
+        # Just verify that the line parses.
+        try:
+            Grammar('line').parse_all(line)
+        except abnf.ParseError:
+            print('error: ABNF parse error:', repr(line))
+            errors += 1
+else:
+    def parse_abnf(line):
+        pass
+
+
+def parse_diagnostics(cmd):
+    global errors
+    diag_out = subprocess.run(cmd, stdout=subprocess.PIPE, check=True,
+                              universal_newlines=True).stdout
+    if diag_out[-1] != '\n':
+        print('error: ld.so output does not end in newline')
+        errors += 1
+
+    PathType = collections.namedtuple('PathType',
+                                      'has_index value_type original_line')
+    # Mapping tuples of labels to PathType values.
+    path_types = {}
+
+    seen_subscripts = {}
+
+    for line in diag_out.splitlines():
+        parse_abnf(line)
+        subscripts, value = parse_line(line)
+
+        # Check for duplicates.
+        if subscripts in seen_subscripts:
+            print('error: duplicate value assignment:', repr(line))
+            print(' previous line:', repr(seen_subscripts[subscripts]))
+            errors += 1
+        else:
+            seen_subscripts[subscripts] = line
+
+        # Compare types against the previously seen labels.
+        labels = tuple([label for label, index in subscripts])
+        has_index = tuple([index is not None for label, index in subscripts])
+        value_type = type(value)
+        if labels in path_types:
+            previous_type = path_types[labels]
+            if has_index != previous_type.has_index:
+                print('error: line has mismatch of indexing:', repr(line))
+                print(' index types:', has_index)
+                print(' previous: ', previous_type.has_index)
+                print(' previous line:', repr(previous_type.original_line))
+                errors += 1
+            if value_type != previous_type.value_type:
+                print('error: line has mismatch of value type:', repr(line))
+                print(' value type:', value_type.__name__)
+                print(' previous: ', previous_type.value_type.__name__)
+                print(' previous line:', repr(previous_type.original_line))
+                errors += 1
+        else:
+            path_types[labels] = PathType(has_index, value_type, line)
+
+        # Check that this line does not add indexing to a previous value.
+        for key in [labels[:idx] for idx in range(1, len(labels))]:
+            if key in path_types:
+                print('error: line assigns to atomic value:', repr(line))
+                print(' previous line:', repr(path_types[key].original_line))
+                errors += 1
+
+    if errors:
+        sys.exit(1)
+
+def get_parser():
+    parser = argparse.ArgumentParser(description=__doc__)
+    parser.add_argument('--manual',
+                        help='path to .texi file for consistency checks')
+    parser.add_argument('command',
+                        help='command to run')
+    return parser
+
+
+def main(argv):
+    parser = get_parser()
+    opts = parser.parse_args(argv)
+
+    if opts.manual:
+        check_consistency_with_manual(opts.manual)
+
+    # Remove the initial 'env' command.
+    parse_diagnostics(opts.command.split()[1:])
+
+    if errors:
+        sys.exit(1)
+
+if __name__ == '__main__':
+    main(sys.argv[1:])
diff --git a/manual/install.texi b/manual/install.texi
index e8f36d5726..4c4e76fedf 100644
--- a/manual/install.texi
+++ b/manual/install.texi
@@ -632,6 +632,13 @@ GDB, and should be compatible with the Python version in your system.
As of release time PExpect 4.8.0 is the newest verified to work to test
the pretty printers.
+@item
+The Python @code{abnf} module.
+
+This module is optional and used to verify some ABNF grammars in the manual.
+Version 2.2.0 has been confirmed to work as expected. A missing
+@code{abnf} does not reduce test coverage of the library itself.
+
@item
GDB 7.8 or later with support for Python 2.7/3.4 or later
--
2.41.0
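Editorial sketch (not part of the patch): the manual text recommends an
associative array because subscript indices need not be contiguous or
small. Assuming (SUBSCRIPTS, VALUE) pairs in the shape returned by
parse_line() in the test script, a consumer could fold them into nested
dictionaries as shown below. The function name build_tree is
hypothetical, and the sketch relies on the documented property that an
access path which holds a value is never extended by further subscripts.

```python
def build_tree(entries):
    """Fold (SUBSCRIPTS, VALUE) pairs into nested dictionaries.

    Each subscript contributes its label as a key.  An indexed
    subscript contributes its integer index as a further key below
    the label, so sparse indices cost nothing.
    """
    root = {}
    for subscripts, value in entries:
        # Flatten the (label, index) pairs into a list of keys.
        keys = []
        for label, index in subscripts:
            keys.append(label)
            if index is not None:
                keys.append(index)
        # Walk or create the intermediate dictionaries.
        node = root
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = value
    return root


if __name__ == '__main__':
    entries = [
        ((('dl_pagesize', None),), 0x1000),
        ((('path', None), ('system_dirs', 0)), b'/lib64/'),
        ((('path', None), ('system_dirs', 1)), b'/usr/lib64/'),
        ((('auxv', 0), ('a_type', None)), 0x21),
    ]
    tree = build_tree(entries)
    print(tree['dl_pagesize'])
    print(tree['path']['system_dirs'][1])
```

Plain dictionaries keep the lookup code short. A consumer that needs
ordered iteration over indices can sort the integer keys on demand.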
* Re: [PATCH v2 1/2] manual: Document ld.so --list-diagnostics output
From: Adhemerval Zanella Netto @ 2023-08-23 13:47 UTC
To: Florian Weimer, libc-alpha
LGTM, thanks.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
On 23/08/23 04:00, Florian Weimer via Libc-alpha wrote:
> ---
> v2: Drop Python code. @cindex/@item fix as suggested by Arsen.
>
> manual/dynlink.texi | 207 ++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 207 insertions(+)
>
> diff --git a/manual/dynlink.texi b/manual/dynlink.texi
> index 45bf5a5b55..df41c56bfc 100644
> --- a/manual/dynlink.texi
> +++ b/manual/dynlink.texi
> @@ -13,9 +13,216 @@ as plugins) later at run time.
> Dynamic linkers are sometimes called @dfn{dynamic loaders}.
>
> @menu
> +* Dynamic Linker Invocation:: Explicit invocation of the dynamic linker.
> * Dynamic Linker Introspection:: Interfaces for querying mapping information.
> @end menu
>
> +@node Dynamic Linker Invocation
> +
> +@cindex program interpreter
> +When a dynamically linked program starts, the operating system
> +automatically loads the dynamic linker along with the program.
> +@Theglibc{} also supports invoking the dynamic linker explicitly to
> +launch a program. This command uses the implied dynamic linker
> +(also sometimes called the @dfn{program interpreter}):
> +
> +@smallexample
> +sh -c 'echo "Hello, world!"'
> +@end smallexample
> +
> +This command specifies the dynamic linker explicitly:
> +
> +@smallexample
> +ld.so /bin/sh -c 'echo "Hello, world!"'
> +@end smallexample
> +
> +Note that @command{ld.so} does not search the @env{PATH} environment
> +variable, so the full file name of the executable needs to be specified.
> +
> +The @command{ld.so} program supports various options. Options start
> +@samp{--} and need to come before the program that is being launched.
> +Some of the supported options are listed below.
> +
> +@table @code
> +@item --list-diagnostics
> +Print system diagnostic information in a machine-readable format.
> +@xref{Dynamic Linker Diagnostics}.
> +@end table
> +
> +@menu
> +* Dynamic Linker Diagnostics:: Obtaining system diagnostic information.
> +@end menu
> +
> +@node Dynamic Linker Diagnostics
> +@section Dynamic Linker Diagnostics
> +@cindex diagnostics (dynamic linker)
> +
> +The @samp{ld.so --list-diagnostics} produces machine-readable
> +diagnostics output. This output contains system data that affects the
> +behavior of @theglibc{}, and potentially application behavior as well.
> +
> +The exact set of diagnostic items can change between releases of
> +@theglibc{}. The output format itself is not expected to change
> +radically.
> +
> +The following table shows some example lines that can be written by the
> +diagnostics command.
> +
> +@table @code
> +@item dl_pagesize=0x1000
> +The system page size is 4096 bytes.
> +
> +@item env[0x14]="LANG=en_US.UTF-8"
> +This item indicates that the 21st environment variable at process
> +startup contains a setting for @code{LANG}.
> +
> +@item env_filtered[0x22]="DISPLAY"
> +The 35th environment variable is @code{DISPLAY}. Its value is not
> +included in the output for privacy reasons because it is not recognized
> +as harmless by the diagnostics code.
> +
> +@item path.prefix="/usr"
> +This means that @theglibc{} was configured with @code{--prefix=/usr}.
> +
> +@item path.system_dirs[0x0]="/lib64/"
> +@itemx path.system_dirs[0x1]="/usr/lib64/"
> +The built-in dynamic linker search path contains two directories,
> +@code{/lib64} and @code{/usr/lib64}.
> +@end table
> +
> +@subsection Dynamic Linker Diagnostics Output Format
> +
> +As seen above, diagnostic lines assign values (integers or strings) to a
> +sequence of labeled subscripts, separated by @samp{.}. Some subscripts
> +have integer indices associated with them. The subscript indices are
> +not necessarily contiguous or small, so an associative array should be
> +used to store them. Currently, all integers fit into the 64-bit
> +unsigned integer range. Every access path to a value has a fixed type
> +(string or integer) independent of subscript index values. Likewise,
> +whether a subscript is indexed does not depend on previous indices (but
> +may depend on previous subscript labels).
> +
> +A syntax description in ABNF (RFC 5234) follows. Note that
> +@code{%x30-39} denotes the range of decimal digits. Diagnostic output
> +lines are expected to match the @code{line} production.
> +
> +@c ABNF-START
> +@smallexample
> +HEXDIG = %x30-39 / %x61-6f ; lowercase a-f only
> +ALPHA = %x41-5a / %x61-7a / %x7f ; letters and underscore
> +ALPHA-NUMERIC = ALPHA / %x30-39 / "_"
> +DQUOTE = %x22 ; "
> +
> +; Numbers are always hexadecimal and use a 0x prefix.
> +hex-value-prefix = %x30 %x78
> +hex-value = hex-value-prefix 1*HEXDIG
> +
> +; Strings use octal escape sequences and \\, \".
> +string-char = %x20-21 / %x23-5c / %x5d-7e ; printable but not "\
> +string-quoted-octal = %x30-33 2*2%x30-37
> +string-quoted = "\" ("\" / DQUOTE / string-quoted-octal)
> +string-value = DQUOTE *(string-char / string-quoted) DQUOTE
> +
> +value = hex-value / string-value
> +
> +label = ALPHA *ALPHA-NUMERIC
> +index = "[" hex-value "]"
> +subscript = label [index]
> +
> +line = subscript *("." subscript) "=" value
> +@end smallexample
> +
> +@subsection Dynamic Linker Diagnostics Values
> +
> +As mentioned above, the set of diagnostics may change between
> +@theglibc{} releases. Nevertheless, the following table documents a few
> +common diagnostic items. All numbers are in hexadecimal, with a
> +@samp{0x} prefix.
> +
> +@table @code
> +@item dl_dst_lib=@var{string}
> +The @code{$LIB} dynamic string token expands to @var{string}.
> +
> +@cindex HWCAP (diagnostics)
> +@item dl_hwcap=@var{integer}
> +@itemx dl_hwcap2=@var{integer}
> +The HWCAP and HWCAP2 values, as returned for @code{getauxval}, and as
> +used in other places depending on the architecture.
> +
> +@cindex page size (diagnostics)
> +@item dl_pagesize=@var{integer}
> +The system page size is @var{integer} bytes.
> +
> +@item dl_platform=@var{string}
> +The @code{$PLATFORM} dynamic string token expands to @var{string}.
> +
> +@item dso.libc=@var{string}
> +This is the soname of the shared @code{libc} object that is part of
> +@theglibc{}. On most architectures, this is @code{libc.so.6}.
> +
> +@item env[@var{index}]=@var{string}
> +@itemx env_filtered[@var{index}]=@var{string}
> +An environment variable from the process environment. The integer
> +@var{index} is the array index in the environment array. Variables
> +under @code{env} include the variable value after the @samp{=} (assuming
> +that it was present), variables under @code{env_filtered} do not.
> +
> +@item path.prefix=@var{string}
> +This indicates that @theglibc{} was configured using
> +@samp{--prefix=@var{string}}.
> +
> +@item path.sysconfdir=@var{string}
> +@Theglibc{} was configured (perhaps implicitly) with
> +@samp{--sysconfdir=@var{string}} (typically @code{/etc}).
> +
> +@item path.system_dirs[@var{index}]=@var{string}
> +These items list the elements of the built-in array that describes the
> +default library search path. The value @var{string} is a directory file
> +name with a trailing @samp{/}.
> +
> +@item path.rtld=@var{string}
> +This string indicates the application binary interface (ABI) file name
> +of the run-time dynamic linker.
> +
> +@item version.release="stable"
> +@itemx version.release="development"
> +The value @code{"stable"} indicates that this build of @theglibc{} is
> +from a release branch. Releases labeled as @code{"development"} are
> +unreleased development versions.
> +
> +@cindex version (diagnostics)
> +@item version.version="@var{major}.@var{minor}"
> +@itemx version.version="@var{major}.@var{minor}.9000"
> +@Theglibc{} version. Development releases end in @samp{.9000}.
> +
> +@cindex auxiliary vector (diagnostics)
> +@item auxv[@var{index}].a_type=@var{type}
> +@itemx auxv[@var{index}].a_val=@var{integer}
> +@itemx auxv[@var{index}].a_val_string=@var{string}
> +An entry in the auxiliary vector (specific to Linux). The values
> +@var{type} (an integer) and @var{integer} correspond to the members of
> +@code{struct auxv}. If the value is a string, @code{a_val_string} is
> +used instead of @code{a_val}, so that values have consistent types.
> +
> +The @code{AT_HWCAP} and @code{AT_HWCAP2} values in this output do not
> +reflect adjustment by @theglibc{}.
> +
> +@item uname.sysname=@var{string}
> +@itemx uname.nodename=@var{string}
> +@itemx uname.release=@var{string}
> +@itemx uname.version=@var{string}
> +@itemx uname.machine=@var{string}
> +@itemx uname.domain=@var{string}
> +These Linux-specific items show the values of @code{struct utsname}, as
> +reported by the @code{uname} function. @xref{Platform Type}.
> +
> +@cindex CPUID (diagnostics)
> +@item x86.cpu_features.@dots{}
> +These items are specific to the i386 and x86-64 architectures. They
> +reflect supported CPU features and information on cache geometry, mostly
> +collected using the @code{CPUID} instruction.
> +@end table
> +
> @node Dynamic Linker Introspection
> @section Dynamic Linker Introspection
>
>
> base-commit: 65a5112ede9ba3e37e165cf6c9c432f46b903936
* Re: [PATCH v2 2/2] elf: Check that --list-diagnostics output has the expected syntax
From: Adhemerval Zanella Netto @ 2023-08-23 13:53 UTC
To: libc-alpha, Florian Weimer
On 23/08/23 04:04, Florian Weimer via Libc-alpha wrote:
> Parts of elf/tst-rtld-list-diagnostics.py have been copied from
> scripts/tst-ld-trace.py.
>
> The abnf module is entirely optional and used to verify the
> ABNF grammar as included in the manual.
LGTM, some minor notes below.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
> ---
> v2: Clarify the optional nature of the abnf module. Fixes
> from Adhemerval.
>
> INSTALL | 6 +
> elf/Makefile | 9 +
> elf/tst-rtld-list-diagnostics.py | 303 +++++++++++++++++++++++++++++++
> manual/install.texi | 7 +
> 4 files changed, 325 insertions(+)
> create mode 100644 elf/tst-rtld-list-diagnostics.py
>
> diff --git a/INSTALL b/INSTALL
> index 268acadd75..e5152a4ae7 100644
> --- a/INSTALL
> +++ b/INSTALL
> @@ -585,6 +585,12 @@ build the GNU C Library:
> in your system. As of release time PExpect 4.8.0 is the newest
> verified to work to test the pretty printers.
>
> + • The Python ‘abnf’ module.
> +
> + This module is optional used to verify some ABNF grammars in the
[...] is optional *and* used [...].
> + manual. Version 2.2.0 has been confirmed to work as expected. A
> + missing ‘abnf’ does not reduce test coverage of the library itself.
I think it should be 'reduce the test'.
> +
> • GDB 7.8 or later with support for Python 2.7/3.4 or later
>
> GDB itself needs to be configured with Python support in order to
> diff --git a/elf/Makefile b/elf/Makefile
> index c00e2ccfc5..9176cbf1e3 100644
> --- a/elf/Makefile
> +++ b/elf/Makefile
> @@ -1123,6 +1123,7 @@ tests-special += \
> $(objpfx)argv0test.out \
> $(objpfx)tst-pathopt.out \
> $(objpfx)tst-rtld-help.out \
> + $(objpfx)tst-rtld-list-diagnostics.out \
> $(objpfx)tst-rtld-load-self.out \
> $(objpfx)tst-rtld-preload.out \
> $(objpfx)tst-sprof-basic.out \
> @@ -2799,6 +2800,14 @@ $(objpfx)tst-ro-dynamic-mod.so: $(objpfx)tst-ro-dynamic-mod.os \
> -Wl,--script=tst-ro-dynamic-mod.map \
> $(objpfx)tst-ro-dynamic-mod.os
>
> +$(objpfx)tst-rtld-list-diagnostics.out: tst-rtld-list-diagnostics.py \
> +  $(..)manual/dynlink.texi $(objpfx)$(rtld-installed-name)
> +	$(PYTHON) tst-rtld-list-diagnostics.py \
> +	    --manual=$(..)manual/dynlink.texi \
> +	    "$(test-wrapper-env) $(objpfx)$(rtld-installed-name) --list-diagnostics" \
> +	    > $@; \
> +	$(evaluate-test)
> +
> $(objpfx)tst-rtld-run-static.out: $(objpfx)/ldconfig
>
> $(objpfx)tst-dl_find_object.out: \
> diff --git a/elf/tst-rtld-list-diagnostics.py b/elf/tst-rtld-list-diagnostics.py
> new file mode 100644
> index 0000000000..e9ba9e1798
> --- /dev/null
> +++ b/elf/tst-rtld-list-diagnostics.py
> @@ -0,0 +1,303 @@
> +#!/usr/bin/python3
> +# Test that the ld.so --list-diagnostics output has the expected syntax.
> +# Copyright (C) 2022-2023 Free Software Foundation, Inc.
> +# Copyright The GNU Toolchain Authors.
> +# This file is part of the GNU C Library.
> +#
> +# The GNU C Library is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU Lesser General Public
> +# License as published by the Free Software Foundation; either
> +# version 2.1 of the License, or (at your option) any later version.
> +#
> +# The GNU C Library is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> +# Lesser General Public License for more details.
> +#
> +# You should have received a copy of the GNU Lesser General Public
> +# License along with the GNU C Library; if not, see
> +# <https://www.gnu.org/licenses/>.
> +
> +import argparse
> +import collections
> +import subprocess
> +import sys
> +
> +try:
> +    subprocess.run
> +except:
> +    class _CompletedProcess:
> +        def __init__(self, args, returncode, stdout=None, stderr=None):
> +            self.args = args
> +            self.returncode = returncode
> +            self.stdout = stdout
> +            self.stderr = stderr
> +
> +    def _run(*popenargs, input=None, timeout=None, check=False, **kwargs):
> +        assert(timeout is None)
> +        with subprocess.Popen(*popenargs, **kwargs) as process:
> +            try:
> +                stdout, stderr = process.communicate(input)
> +            except:
> +                process.kill()
> +                process.wait()
> +                raise
> +            returncode = process.poll()
> +            if check and returncode:
> +                raise subprocess.CalledProcessError(returncode, popenargs)
> +        return _CompletedProcess(popenargs, returncode, stdout, stderr)
> +
> +    subprocess.run = _run
As a side note, I think we should move this snippet to a different module since
it is now replicated in two other files (build-many-glibcs.py and tst-ld-trace.py).
> +
> +# Number of errors encountered. Zero means no errors (test passes).
> +errors = 0
> +
> +def parse_line(line):
> +    """Parse a line of --list-diagnostics output.
> +
> +    This function returns a pair (SUBSCRIPTS, VALUE). VALUE is either
> +    a byte string or an integer. SUBSCRIPT is a tuple of (LABEL,
> +    INDEX) pairs, where LABEL is a field identifier (a string), and
> +    INDEX is an integer or None, to indicate that this field is not
> +    indexed.
> +
> +    """
> +
> +    # Extract the list of subscripts before the value.
> +    idx = 0
> +    subscripts = []
> +    while line[idx] != '=':
> +        start_idx = idx
> +
> +        # Extract the label.
> +        while line[idx] not in '[.=':
> +            idx += 1
> +        label = line[start_idx:idx]
> +
> +        if line[idx] == '[':
> +            # Subscript with a 0x index.
> +            assert label
> +            close_bracket = line.index(']', idx)
> +            index = line[idx + 1:close_bracket]
> +            assert index.startswith('0x')
> +            index = int(index, 0)
> +            subscripts.append((label, index))
> +            idx = close_bracket + 1
> +        else: # '.' or '='.
> +            if label:
> +                subscripts.append((label, None))
> +            if line[idx] == '.':
> +                idx += 1
> +
> +    # The value is either a string or a 0x number.
> +    value = line[idx + 1:]
> +    if value[0] == '"':
> +        # Decode the escaped string into a byte string.
> +        assert value[-1] == '"'
> +        idx = 1
> +        result = []
> +        while True:
> +            ch = value[idx]
> +            if ch == '\\':
> +                if value[idx + 1] in '"\\':
> +                    result.append(ord(value[idx + 1]))
> +                    idx += 2
> +                else:
> +                    result.append(int(value[idx + 1:idx + 4], 8))
> +                    idx += 4
> +            elif ch == '"':
> +                assert idx == len(value) - 1
> +                break
> +            else:
> +                result.append(ord(value[idx]))
> +                idx += 1
> +        value = bytes(result)
> +    else:
> +        # Convert the value into an integer.
> +        assert value.startswith('0x')
> +        value = int(value, 0)
> +    return (tuple(subscripts), value)
> +
> +assert parse_line('a.b[0x1]=0x2') == ((('a', None), ('b', 1)), 2)
> +assert parse_line(r'b[0x3]="four\040\"\\"') == ((('b', 3),), b'four \"\\')
> +
> +# ABNF for a line of --list-diagnostics output.
> +diagnostics_abnf = r"""
> +HEXDIG = %x30-39 / %x61-6f ; lowercase a-f only
> +ALPHA = %x41-5a / %x61-7a / %x7f ; letters and underscore
> +ALPHA-NUMERIC = ALPHA / %x30-39 / "_"
> +DQUOTE = %x22 ; "
> +
> +; Numbers are always hexadecimal and use a 0x prefix.
> +hex-value-prefix = %x30 %x78
> +hex-value = hex-value-prefix 1*HEXDIG
> +
> +; Strings use octal escape sequences and \\, \".
> +string-char = %x20-21 / %x23-5c / %x5d-7e ; printable but not "\
> +string-quoted-octal = %x30-33 2*2%x30-37
> +string-quoted = "\" ("\" / DQUOTE / string-quoted-octal)
> +string-value = DQUOTE *(string-char / string-quoted) DQUOTE
> +
> +value = hex-value / string-value
> +
> +label = ALPHA *ALPHA-NUMERIC
> +index = "[" hex-value "]"
> +subscript = label [index]
> +
> +line = subscript *("." subscript) "=" value
> +"""
> +
> +def check_consistency_with_manual(manual_path):
> +    """Verify that the code fragments in the manual match this script.
> +
> +    The code fragments are duplicated to clarify the dual license.
> +    """
> +
> +    global errors
> +
> +    def extract_lines(path, start_line, end_line, skip_lines=()):
> +        result = []
> +        with open(path) as inp:
> +            capturing = False
> +            for line in inp:
> +                if line.strip() == start_line:
> +                    capturing = True
> +                elif not capturing or line.strip() in skip_lines:
> +                    continue
> +                elif line.strip() == end_line:
> +                    capturing = False
> +                else:
> +                    result.append(line)
> +        if not result:
> +            raise ValueError('{!r} not found in {!r}'.format(start_line, path))
> +        if capturing:
> +            raise ValueError('{!r} not found in {!r}'.format(end_line, path))
> +        return result
> +
> +    def check(name, manual, script):
> +        global errors
> +
> +        if manual == script:
> +            return
> +        print('error: {} fragment in manual is different'.format(name))
> +        import difflib
> +        sys.stdout.writelines(difflib.unified_diff(
> +            manual, script, fromfile='manual', tofile='script'))
> +        errors += 1
> +
> +    manual_abnf = extract_lines(manual_path,
> +                                '@c ABNF-START', '@end smallexample',
> +                                skip_lines=('@smallexample',))
> +    check('ABNF', diagnostics_abnf.splitlines(keepends=True)[1:], manual_abnf)
> +
> +# If the abnf module can be imported, run an additional check that the
> +# 'line' production from the ABNF grammar matches --list-diagnostics
> +# output lines.
> +try:
> +    import abnf
> +except ImportError:
> +    abnf = None
> +    print('info: skipping ABNF validation because the abnf module is missing')
> +
> +if abnf is not None:
> +    class Grammar(abnf.Rule):
> +        pass
> +
> +    Grammar.load_grammar(diagnostics_abnf)
> +
> +    def parse_abnf(line):
> +        global errors
> +
> +        # Just verify that the line parses.
> +        try:
> +            Grammar('line').parse_all(line)
> +        except abnf.ParseError:
> +            print('error: ABNF parse error:', repr(line))
> +            errors += 1
> +else:
> +    def parse_abnf(line):
> +        pass
> +
> +
> +def parse_diagnostics(cmd):
> +    global errors
> +    diag_out = subprocess.run(cmd, stdout=subprocess.PIPE, check=True,
> +                              universal_newlines=True).stdout
> +    if diag_out[-1] != '\n':
> +        print('error: ld.so output does not end in newline')
> +        errors += 1
> +
> +    PathType = collections.namedtuple('PathType',
> +                                      'has_index value_type original_line')
> +    # Mapping tuples of labels to PathType values.
> +    path_types = {}
> +
> +    seen_subscripts = {}
> +
> +    for line in diag_out.splitlines():
> +        parse_abnf(line)
> +        subscripts, value = parse_line(line)
> +
> +        # Check for duplicates.
> +        if subscripts in seen_subscripts:
> +            print('error: duplicate value assignment:', repr(line))
> +            print(' previous line:', repr(seen_subscripts[subscripts]))
> +            errors += 1
> +        else:
> +            seen_subscripts[subscripts] = line
> +
> +        # Compare types against the previously seen labels.
> +        labels = tuple([label for label, index in subscripts])
> +        has_index = tuple([index is not None for label, index in subscripts])
> +        value_type = type(value)
> +        if labels in path_types:
> +            previous_type = path_types[labels]
> +            if has_index != previous_type.has_index:
> +                print('error: line has mismatch of indexing:', repr(line))
> +                print(' index types:', has_index)
> +                print(' previous: ', previous_type.has_index)
> +                print(' previous line:', repr(previous_type.original_line))
> +                errors += 1
> +            if value_type != previous_type.value_type:
> +                print('error: line has mismatch of value type:', repr(line))
> +                print(' value type:', value_type.__name__)
> +                print(' previous: ', previous_type.value_type.__name__)
> +                print(' previous line:', repr(previous_type.original_line))
> +                errors += 1
> +        else:
> +            path_types[labels] = PathType(has_index, value_type, line)
> +
> +        # Check that this line does not add indexing to a previous value.
> +        for idx in range(1, len(labels)):
> +            if labels[:idx] in path_types:
> +                print('error: line assigns to atomic value:', repr(line))
> +                print(' previous line:', repr(path_types[labels[:idx]].original_line))
> +                errors += 1
> +
> +    if errors:
> +        sys.exit(1)
> +
> +def get_parser():
> +    parser = argparse.ArgumentParser(description=__doc__)
> +    parser.add_argument('--manual',
> +                        help='path to .texi file for consistency checks')
> +    parser.add_argument('command',
> +                        help='command to run')
> +    return parser
> +
> +
> +def main(argv):
> +    parser = get_parser()
> +    opts = parser.parse_args(argv)
> +
> +    if opts.manual:
> +        check_consistency_with_manual(opts.manual)
> +
> +    # Remove the initial 'env' command.
> +    parse_diagnostics(opts.command.split()[1:])
> +
> +    if errors:
> +        sys.exit(1)
> +
> +if __name__ == '__main__':
> +    main(sys.argv[1:])
> diff --git a/manual/install.texi b/manual/install.texi
> index e8f36d5726..4c4e76fedf 100644
> --- a/manual/install.texi
> +++ b/manual/install.texi
> @@ -632,6 +632,13 @@ GDB, and should be compatible with the Python version in your system.
> As of release time PExpect 4.8.0 is the newest verified to work to test
> the pretty printers.
>
> +@item
> +The Python @code{abnf} module.
> +
> +This module is optional used to verify some ABNF grammars in the manual.
> +Version 2.2.0 has been confirmed to work as expected. A missing
> +@code{abnf} does not reduce test coverage of the library itself.
Same issues as the INSTALL file.
> +
> @item
> GDB 7.8 or later with support for Python 2.7/3.4 or later
>