From: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
To: libc-alpha@sourceware.org, Florian Weimer <fweimer@redhat.com>
Subject: Re: [PATCH 3/3] elf: Check that --list-diagnostics output has the expected syntax
Date: Mon, 21 Aug 2023 13:29:53 -0300 [thread overview]
Message-ID: <6ccec3fa-f630-faa0-bf89-bf1ae915fe67@linaro.org> (raw)
In-Reply-To: <30749c92b4926f909b21102765bba0676e009d04.1691172895.git.fweimer@redhat.com>
On 04/08/23 15:16, Florian Weimer via Libc-alpha wrote:
> Parts of elf/tst-rtld-list-diagnostics.py have been copied from
> scripts/tst-ld-trace.py.
>
> The abnf module is entirely optional and used to verify the
> ABNF grammar as included in the manual.
> ---
> INSTALL | 5 +
> elf/Makefile | 9 +
> elf/tst-rtld-list-diagnostics.py | 311 +++++++++++++++++++++++++++++++
> manual/install.texi | 6 +
> 4 files changed, 331 insertions(+)
> create mode 100644 elf/tst-rtld-list-diagnostics.py
>
> diff --git a/INSTALL b/INSTALL
> index 268acadd75..3f662bf427 100644
> --- a/INSTALL
> +++ b/INSTALL
> @@ -585,6 +585,11 @@ build the GNU C Library:
> in your system. As of release time PExpect 4.8.0 is the newest
> verified to work to test the pretty printers.
>
> + • The Python ‘abnf’ module.
> +
> + This module is used to verify some ABNF grammars in the manual.
> + Version 2.2.0 has been confirmed to work as expected.
> +
This part mixes requires and recommended tools, I think it should be clear
that the abnf module is complete optional and it not being present only
reduces the test coverage.
The rest look ok.
> • GDB 7.8 or later with support for Python 2.7/3.4 or later
>
> GDB itself needs to be configured with Python support in order to
> diff --git a/elf/Makefile b/elf/Makefile
> index c00e2ccfc5..9176cbf1e3 100644
> --- a/elf/Makefile
> +++ b/elf/Makefile
> @@ -1123,6 +1123,7 @@ tests-special += \
> $(objpfx)argv0test.out \
> $(objpfx)tst-pathopt.out \
> $(objpfx)tst-rtld-help.out \
> + $(objpfx)tst-rtld-list-diagnostics.out \
> $(objpfx)tst-rtld-load-self.out \
> $(objpfx)tst-rtld-preload.out \
> $(objpfx)tst-sprof-basic.out \
> @@ -2799,6 +2800,14 @@ $(objpfx)tst-ro-dynamic-mod.so: $(objpfx)tst-ro-dynamic-mod.os \
> -Wl,--script=tst-ro-dynamic-mod.map \
> $(objpfx)tst-ro-dynamic-mod.os
>
> +$(objpfx)tst-rtld-list-diagnostics.out: tst-rtld-list-diagnostics.py \
> + $(..)manual/dynlink.texi $(objpfx)$(rtld-installed-name)
> + $(PYTHON) tst-rtld-list-diagnostics.py \
> + --manual=$(..)manual/dynlink.texi \
> + "$(test-wrapper-env) $(objpfx)$(rtld-installed-name) --list-diagnostics" \
> + > $@; \
> + $(evaluate-test)
> +
> $(objpfx)tst-rtld-run-static.out: $(objpfx)/ldconfig
>
> $(objpfx)tst-dl_find_object.out: \
> diff --git a/elf/tst-rtld-list-diagnostics.py b/elf/tst-rtld-list-diagnostics.py
> new file mode 100644
> index 0000000000..f4ff06fd86
> --- /dev/null
> +++ b/elf/tst-rtld-list-diagnostics.py
> @@ -0,0 +1,311 @@
> +#!/usr/bin/python3
> +# Test that the ld.so --list-diagnostics output has the expected syntax.
> +# Copyright (C) 2022-2023 Free Software Foundation, Inc.
> +# Copyright The GNU Toolchain Authors.
> +# This file is part of the GNU C Library.
> +#
> +# The GNU C Library is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU Lesser General Public
> +# License as published by the Free Software Foundation; either
> +# version 2.1 of the License, or (at your option) any later version.
> +#
> +# The GNU C Library is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> +# Lesser General Public License for more details.
> +#
> +# You should have received a copy of the GNU Lesser General Public
> +# License along with the GNU C Library; if not, see
> +# <https://www.gnu.org/licenses/>.
> +
> +import argparse
> +import collections
> +import subprocess
> +import sys
> +
> +try:
> + subprocess.run
> +except:
> + class _CompletedProcess:
> + def __init__(self, args, returncode, stdout=None, stderr=None):
> + self.args = args
> + self.returncode = returncode
> + self.stdout = stdout
> + self.stderr = stderr
> +
> + def _run(*popenargs, input=None, timeout=None, check=False, **kwargs):
> + assert(timeout is None)
> + with subprocess.Popen(*popenargs, **kwargs) as process:
> + try:
> + stdout, stderr = process.communicate(input)
> + except:
> + process.kill()
> + process.wait()
> + raise
> + returncode = process.poll()
> + if check and returncode:
> + raise subprocess.CalledProcessError(returncode, popenargs)
> + return _CompletedProcess(popenargs, returncode, stdout, stderr)
> +
> + subprocess.run = _run
> +
> +# Number of errors encountered. Zero means no errors (test passes).
> +errors = 0
> +
> +# PYTHON-START
> +def parse_line(line):
> + """Parse a line of --list-diagnostics output.
> +
> + This function returns a pair (SUBSCRIPTS, VALUE). VALUE is either
> + a byte string or an integer. SUBSCRIPT is a tuple of (LABEL,
> + INDEX) pairs, where LABEL is a field identifier (a string), and
> + INDEX is an integer or None, to indicate that this field is not
> + indexed.
> +
> + """
> +
> + # Extract the list of subscripts before the value.
> + idx = 0
> + subscripts = []
> + while line[idx] != '=':
> + start_idx = idx
> +
> + # Extract the label.
> + while line[idx] not in '[.=':
> + idx += 1
> + label = line[start_idx:idx]
> +
> + if line[idx] == '[':
> + # Subscript with a 0x index.
> + assert label
> + close_bracket = line.index(']', idx)
> + index = line[idx + 1:close_bracket]
> + assert index.startswith('0x')
> + index = int(index, 0)
> + subscripts.append((label, index))
> + idx = close_bracket + 1
> + else: # '.' or '='.
> + if label:
> + subscripts.append((label, None))
> + if line[idx] == '.':
> + idx += 1
> +
> + # The value is either a string or a 0x number.
> + value = line[idx + 1:]
> + if value[0] == '"':
> + # Decode the escaped string into a byte string.
> + assert value[-1] == '"'
> + idx = 1
> + result = []
> + while True:
> + ch = value[idx]
> + if ch == '\\':
> + if value[idx + 1] in '"\\':
> + result.append(ord(value[idx + 1]))
> + idx += 2
> + else:
> + result.append(int(value[idx + 1:idx + 4], 8))
> + idx += 4
> + elif ch == '"':
> + assert idx == len(value) - 1
> + break
> + else:
> + result.append(ord(value[idx]))
> + idx += 1
> + value = bytes(result)
> + else:
> + # Convert the value into an integer.
> + assert value.startswith('0x')
> + value = int(value, 0)
> + return (tuple(subscripts), value)
> +# PYTHON-END
> +
> +assert parse_line('a.b[0x1]=0x2') == ((('a', None), ('b', 1)), 2)
> +assert parse_line(r'b[0x3]="four\040\"\\"') == ((('b', 3),), b'four \"\\')
> +
> +# ABNF for a line of --list-diagnostics output.
> +diagnostics_abnf = r"""
> +HEXDIG = %x30-39 / %x61-6f ; lowercase a-f only
> +ALPHA = %x41-5a / %x61-7a / %x7f ; letters and underscore
> +ALPHA-NUMERIC = ALPHA / %x30-39 / "_"
> +DQUOTE = %x22 ; "
> +
> +; Numbers are always hexadecimal and use a 0x prefix.
> +hex-value-prefix = %x30 %x78
> +hex-value = hex-value-prefix 1*HEXDIG
> +
> +; Strings use octal escape sequences and \\, \".
> +string-char = %x20-21 / %x23-5c / %x5d-7e ; printable but not "\
> +string-quoted-octal = %x30-33 2*2%x30-37
> +string-quoted = "\" ("\" / DQUOTE / string-quoted-octal)
> +string-value = DQUOTE *(string-char / string-quoted) DQUOTE
> +
> +value = hex-value / string-value
> +
> +label = ALPHA *ALPHA-NUMERIC
> +index = "[" hex-value "]"
> +subscript = label [index]
> +
> +line = subscript *("." subscript) "=" value
> +"""
> +
> +def check_consistency_with_manual(manual_path):
> + """Verify that the code fragments in the manual match this script.
> +
> + The code fragments are duplicated to clarify the dual license.
> + """
> +
> + global errors
> +
> + def extract_lines(path, start_line, end_line, skip_lines=()):
> + result = []
> + with open(path) as inp:
> + capturing = False
> + for line in inp:
> + if line.strip() == start_line:
> + capturing = True
> + elif not capturing or line.strip() in skip_lines:
> + continue
> + elif line.strip() == end_line:
> + capturing = False
> + else:
> + result.append(line)
> + if not result:
> + raise ValueError('{!r} not found in {!r}'.format(start_line, path))
> + if capturing:
> + raise ValueError('{!r} not found in {!r}'.format(end_line, path))
> + return result
> +
> + def check(name, manual, script):
> + global errors
> +
> + if manual == script:
> + return
> + print('error: {} fragment in manual is different'.format(name))
> + import difflib
> + sys.stdout.writelines(difflib.unified_diff(
> + manual, script, fromfile='manual', tofile='script'))
> + errors += 1
> +
> + manual_python = extract_lines(manual_path,
> + '@c PYTHON-START', '@end smallexample',
> + skip_lines=('@smallexample',))
> + script_python = extract_lines(__file__, '# PYTHON-START', '# PYTHON-END')
> + check('Python code', manual_python, script_python)
> +
> + manual_abnf = extract_lines(manual_path,
> + '@c ABNF-START', '@end smallexample',
> + skip_lines=('@smallexample',))
> + check('ABNF', diagnostics_abnf.splitlines(keepends=True)[1:], manual_abnf)
> +
> +# If the abnf module can be imported, run an additional check that the
> +# 'line' production from the ABNF grammar matches --list-diagnostics
> +# output lines.
> +try:
> + import abnf
> +except ImportError:
> + abnf = None
> + print('info: skipping ABNF validation because the abnf module is missing')
> +
> +if abnf is not None:
> + class Grammar(abnf.Rule):
> + pass
> +
> + Grammar.load_grammar(diagnostics_abnf)
> +
> + def parse_abnf(line):
> + global errors
> +
> + # Just verify that the line parses.
> + try:
> + Grammar('line').parse_all(line)
> + except abnf.ParseError:
> + print('error: ABNF parse error:', repr(line))
> + errors += 1
> +else:
> + def parse_abnf(line):
> + pass
> +
> +
> +def parse_diagnostics(cmd):
> + global errors
> + diag_out = subprocess.run(cmd, stdout=subprocess.PIPE, check=True,
> + universal_newlines=True).stdout
> + if diag_out[-1] != '\n':
> + print('error: ld.so output does not end in newline')
> + errors += 1
> +
> + PathType = collections.namedtuple('PathType',
> + 'has_index value_type original_line')
> + # Mapping tuples of labels to PathType values.
> + path_types = {}
> +
> + seen_subscripts = {}
> +
> + for line in diag_out.splitlines():
> + parse_abnf(line)
> + subscripts, value = parse_line(line)
> +
> + # Check for duplicates.
> + if subscripts in seen_subscripts:
> + print('error: duplicate value assignment:', repr(line))
> + print(' previous line:,', repr(seen_subscripts[line]))
> + errors += 1
> + else:
> + seen_subscripts[subscripts] = line
> +
> + # Compare types against the previously seen labels.
> + labels = tuple([label for label, index in subscripts])
> + has_index = tuple([index is not None for label, index in subscripts])
> + value_type = type(value)
> + if labels in path_types:
> + previous_type = path_types[labels]
> + if has_index != previous_type.has_index:
> + print('error: line has mismatch of indexing:', repr(line))
> + print(' index types:', has_index)
> + print(' previous: ', previous_type.has_index)
> + print(' previous line:', repr(previous_type.original_line))
> + errors += 1
> + if value_type != previous_type.value_type:
> + print('error: line has mismatch of value type:', repr(line))
> + print(' value type:', value_type.__name__)
> + print(' previous: ', previous_type.value_type.__name__)
> + print(' previous line:', repr(previous_type.original_line))
> + errors += 1
> + else:
> + path_types[labels] = PathType(has_index, value_type, line)
> +
> + # Check that this line does not add indexing to a previous value.
> + for idx in range(1, len(subscripts) - 1):
> + if subscripts[:idx] in path_types:
> + print('error: line assigns to atomic value:', repr(line))
> + print(' previous line:', repr(previous_type.original_line))
> + errors += 1
> +
> + if errors:
> + sys.exit(1)
> +
> +def get_parser():
> + parser = argparse.ArgumentParser(description=__doc__)
> + parser.add_argument('--manual',
> + help='path to .texi file for consistency checks')
> + parser.add_argument('command',
> + help='comand to run')
> + return parser
> +
> +
> +def main(argv):
> + parser = get_parser()
> + opts = parser.parse_args(argv)
> +
> + if opts.manual:
> + check_consistency_with_manual(opts.manual)
> +
> + # Remove the initial 'env' command.
> + parse_diagnostics(opts.command.split()[1:])
> +
> + if errors:
> + sys.exit(1)
> +
> +if __name__ == '__main__':
> + main(sys.argv[1:])
> diff --git a/manual/install.texi b/manual/install.texi
> index e8f36d5726..2107eb7268 100644
> --- a/manual/install.texi
> +++ b/manual/install.texi
> @@ -632,6 +632,12 @@ GDB, and should be compatible with the Python version in your system.
> As of release time PExpect 4.8.0 is the newest verified to work to test
> the pretty printers.
>
> +@item
> +The Python @code{abnf} module.
> +
> +This module is used to verify some ABNF grammars in the manual.
> +Version 2.2.0 has been confirmed to work as expected.
> +
> @item
> GDB 7.8 or later with support for Python 2.7/3.4 or later
>
prev parent reply other threads:[~2023-08-21 16:29 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-08-04 18:16 [PATCH 0/3] Document ld.so --list-diagnostics, add syntax tests Florian Weimer
2023-08-04 18:16 ` [PATCH 1/3] Linux: Avoid conflicting types in ld.so --list-diagnostics Florian Weimer
2023-08-21 16:02 ` Adhemerval Zanella Netto
2023-08-04 18:16 ` [PATCH 2/3] manual: Document ld.so --list-diagnostics output Florian Weimer
2023-08-04 23:33 ` Arsen Arsenović
2023-08-08 15:38 ` Florian Weimer
2023-08-21 16:24 ` Adhemerval Zanella Netto
2023-08-23 6:33 ` Florian Weimer
2023-08-04 18:16 ` [PATCH 3/3] elf: Check that --list-diagnostics output has the expected syntax Florian Weimer
2023-08-21 16:29 ` Adhemerval Zanella Netto [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=6ccec3fa-f630-faa0-bf89-bf1ae915fe67@linaro.org \
--to=adhemerval.zanella@linaro.org \
--cc=fweimer@redhat.com \
--cc=libc-alpha@sourceware.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).