From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-oo1-xc29.google.com (mail-oo1-xc29.google.com [IPv6:2607:f8b0:4864:20::c29]) by sourceware.org (Postfix) with ESMTPS id 7F52B3858C01 for ; Wed, 23 Aug 2023 13:53:49 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 7F52B3858C01 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org Received: by mail-oo1-xc29.google.com with SMTP id 006d021491bc7-565f2567422so3435575eaf.2 for ; Wed, 23 Aug 2023 06:53:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1692798827; x=1693403627; h=content-transfer-encoding:in-reply-to:organization:from:references :to:content-language:subject:user-agent:mime-version:date:message-id :from:to:cc:subject:date:message-id:reply-to; bh=Qqxeo1pkGej/cD1L9BmNLrbbvBFvRj1BJ3nQLvCxiWw=; b=rXW+Ff02IHQEXnD0shFHrvtAIXPuPWN9zqYI0wxEEOWfK2LVj4brHCMWnw1EWfGX3X vntThEiqjYUhZMkuLA48Q4TTZyzs5a7hMkdvC3NX59Rae8s0wSIRJduPJsGrq1DuIb9y r/tWJPv9UV8L8HrBsU6qaIOqWDdHXLLrNroMKEuGP2scpFCbIe5tCbaqCM29wEdWcTmR l9k8ViQ0XaHOF+LRo1+7UZCRvlcL9cLx0sMr12VBwtbPYEwiMrpCdLu8JMc7PzKsbs5g 5m4HGPK2wN36lsVjW17N2gWzZWeh34g1ugVFYQsq2kjf03tJxXPTH7Fmp3VmmhCknCia HMVQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1692798827; x=1693403627; h=content-transfer-encoding:in-reply-to:organization:from:references :to:content-language:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=Qqxeo1pkGej/cD1L9BmNLrbbvBFvRj1BJ3nQLvCxiWw=; b=Bj2IMw1SRiecSpr2zCaZ+wwh0q3re5/tMW5FvDWN+HEKDGsKvIESSufRzPlMJM53qY xm32l8kadjRtm5BLt2jjBvWGjK+olUSn98xDgV+HOQvYnAc/zCDwmOvzRHkX0+zJHzlW 3WUe5sqIXXDr5L8h174YFvzk9NBqxANVBQGYUHObD+M3G5ELd8R/GRuK1dioJ4ykAzqn +S5oUDr0a5oTGIVNbPJWYqn1uEUO7jd6D30Xcg+uoUmcSGPqdS+dCs0Xoje/8VCAhVjP gFIebZXEG4WrCNT99jhWL2BwOpDr8+QRS1J7jOwPnmblBVgKYJKMWbcDZptZGVmbDt4c Mm0Q== X-Gm-Message-State: AOJu0YyiJfNCDHJA85krsn0S/j6T/nONUlCHzMhmC9VkJROwJATL549n ZGbZyNSJgCv9p8fhhzEXF/WxD6eFsMriKKUfbWc4qg== X-Google-Smtp-Source: AGHT+IHzZrMIqLxXM+lDT7qbx3VsYyS+5aP24fRI7QOtUl/XdH+rWAXZ5siyxQxiMTS9pRe86pdYVQ== X-Received: by 2002:a4a:3956:0:b0:56e:4921:deb0 with SMTP id x22-20020a4a3956000000b0056e4921deb0mr11883865oog.8.1692798827420; Wed, 23 Aug 2023 06:53:47 -0700 (PDT) Received: from ?IPV6:2804:1b3:a7c2:c275:d8f:2562:4517:f8f5? ([2804:1b3:a7c2:c275:d8f:2562:4517:f8f5]) by smtp.gmail.com with ESMTPSA id m5-20020a4aab85000000b0054f85f67f31sm5746605oon.46.2023.08.23.06.53.45 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 23 Aug 2023 06:53:46 -0700 (PDT) Message-ID: <4cab9849-a766-2efc-9cff-22a1551cc90a@linaro.org> Date: Wed, 23 Aug 2023 10:53:44 -0300 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Thunderbird/102.14.0 Subject: Re: [PATCH v2 2/2] elf: Check that --list-diagnostics output has the expected syntax Content-Language: en-US To: libc-alpha@sourceware.org, Florian Weimer References: From: Adhemerval Zanella Netto Organization: Linaro In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-14.6 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,KAM_SHORT,NICE_REPLY_A,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 23/08/23 04:04, Florian Weimer via Libc-alpha wrote: > Parts of elf/tst-rtld-list-diagnostics.py have been copied from > scripts/tst-ld-trace.py. > > The abnf module is entirely optional and used to verify the > ABNF grammar as included in the manual. LGTM, some minor notes below. Reviewed-by: Adhemerval Zanella > --- > v2: Clarify the optional nature of the abnf module. Fixes > from Adhemerval. > > INSTALL | 6 + > elf/Makefile | 9 + > elf/tst-rtld-list-diagnostics.py | 303 +++++++++++++++++++++++++++++++ > manual/install.texi | 7 + > 4 files changed, 325 insertions(+) > create mode 100644 elf/tst-rtld-list-diagnostics.py > > diff --git a/INSTALL b/INSTALL > index 268acadd75..e5152a4ae7 100644 > --- a/INSTALL > +++ b/INSTALL > @@ -585,6 +585,12 @@ build the GNU C Library: > in your system. As of release time PExpect 4.8.0 is the newest > verified to work to test the pretty printers. > > + • The Python ‘abnf’ module. > + > + This module is optional used to verify some ABNF grammars in the [...] is optional *and* used [...]. > + manual. Version 2.2.0 has been confirmed to work as expected. A > + missing ‘abnf’ does not reduce test coverage of the library itself. I think it should be 'reduce the test'. > + > • GDB 7.8 or later with support for Python 2.7/3.4 or later > > GDB itself needs to be configured with Python support in order to > diff --git a/elf/Makefile b/elf/Makefile > index c00e2ccfc5..9176cbf1e3 100644 > --- a/elf/Makefile > +++ b/elf/Makefile > @@ -1123,6 +1123,7 @@ tests-special += \ > $(objpfx)argv0test.out \ > $(objpfx)tst-pathopt.out \ > $(objpfx)tst-rtld-help.out \ > + $(objpfx)tst-rtld-list-diagnostics.out \ > $(objpfx)tst-rtld-load-self.out \ > $(objpfx)tst-rtld-preload.out \ > $(objpfx)tst-sprof-basic.out \ > @@ -2799,6 +2800,14 @@ $(objpfx)tst-ro-dynamic-mod.so: $(objpfx)tst-ro-dynamic-mod.os \ > -Wl,--script=tst-ro-dynamic-mod.map \ > $(objpfx)tst-ro-dynamic-mod.os > > +$(objpfx)tst-rtld-list-diagnostics.out: tst-rtld-list-diagnostics.py \ > + $(..)manual/dynlink.texi $(objpfx)$(rtld-installed-name) > + $(PYTHON) tst-rtld-list-diagnostics.py \ > + --manual=$(..)manual/dynlink.texi \ > + "$(test-wrapper-env) $(objpfx)$(rtld-installed-name) --list-diagnostics" \ > + > $@; \ > + $(evaluate-test) > + > $(objpfx)tst-rtld-run-static.out: $(objpfx)/ldconfig > > $(objpfx)tst-dl_find_object.out: \ > diff --git a/elf/tst-rtld-list-diagnostics.py b/elf/tst-rtld-list-diagnostics.py > new file mode 100644 > index 0000000000..e9ba9e1798 > --- /dev/null > +++ b/elf/tst-rtld-list-diagnostics.py > @@ -0,0 +1,303 @@ > +#!/usr/bin/python3 > +# Test that the ld.so --list-diagnostics output has the expected syntax. > +# Copyright (C) 2022-2023 Free Software Foundation, Inc. > +# Copyright The GNU Toolchain Authors. > +# This file is part of the GNU C Library. > +# > +# The GNU C Library is free software; you can redistribute it and/or > +# modify it under the terms of the GNU Lesser General Public > +# License as published by the Free Software Foundation; either > +# version 2.1 of the License, or (at your option) any later version. > +# > +# The GNU C Library is distributed in the hope that it will be useful, > +# but WITHOUT ANY WARRANTY; without even the implied warranty of > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > +# Lesser General Public License for more details. > +# > +# You should have received a copy of the GNU Lesser General Public > +# License along with the GNU C Library; if not, see > +# . > + > +import argparse > +import collections > +import subprocess > +import sys > + > +try: > + subprocess.run > +except: > + class _CompletedProcess: > + def __init__(self, args, returncode, stdout=None, stderr=None): > + self.args = args > + self.returncode = returncode > + self.stdout = stdout > + self.stderr = stderr > + > + def _run(*popenargs, input=None, timeout=None, check=False, **kwargs): > + assert(timeout is None) > + with subprocess.Popen(*popenargs, **kwargs) as process: > + try: > + stdout, stderr = process.communicate(input) > + except: > + process.kill() > + process.wait() > + raise > + returncode = process.poll() > + if check and returncode: > + raise subprocess.CalledProcessError(returncode, popenargs) > + return _CompletedProcess(popenargs, returncode, stdout, stderr) > + > + subprocess.run = _run As side note, I think we should move this snippet to a different module since it is now replicated in two other files (build-many-glibcs.py and tst-ld-trace.py). > + > +# Number of errors encountered. Zero means no errors (test passes). > +errors = 0 > + > +def parse_line(line): > + """Parse a line of --list-diagnostics output. > + > + This function returns a pair (SUBSCRIPTS, VALUE). VALUE is either > + a byte string or an integer. SUBSCRIPT is a tuple of (LABEL, > + INDEX) pairs, where LABEL is a field identifier (a string), and > + INDEX is an integer or None, to indicate that this field is not > + indexed. > + > + """ > + > + # Extract the list of subscripts before the value. > + idx = 0 > + subscripts = [] > + while line[idx] != '=': > + start_idx = idx > + > + # Extract the label. > + while line[idx] not in '[.=': > + idx += 1 > + label = line[start_idx:idx] > + > + if line[idx] == '[': > + # Subscript with a 0x index. > + assert label > + close_bracket = line.index(']', idx) > + index = line[idx + 1:close_bracket] > + assert index.startswith('0x') > + index = int(index, 0) > + subscripts.append((label, index)) > + idx = close_bracket + 1 > + else: # '.' or '='. > + if label: > + subscripts.append((label, None)) > + if line[idx] == '.': > + idx += 1 > + > + # The value is either a string or a 0x number. > + value = line[idx + 1:] > + if value[0] == '"': > + # Decode the escaped string into a byte string. > + assert value[-1] == '"' > + idx = 1 > + result = [] > + while True: > + ch = value[idx] > + if ch == '\\': > + if value[idx + 1] in '"\\': > + result.append(ord(value[idx + 1])) > + idx += 2 > + else: > + result.append(int(value[idx + 1:idx + 4], 8)) > + idx += 4 > + elif ch == '"': > + assert idx == len(value) - 1 > + break > + else: > + result.append(ord(value[idx])) > + idx += 1 > + value = bytes(result) > + else: > + # Convert the value into an integer. > + assert value.startswith('0x') > + value = int(value, 0) > + return (tuple(subscripts), value) > + > +assert parse_line('a.b[0x1]=0x2') == ((('a', None), ('b', 1)), 2) > +assert parse_line(r'b[0x3]="four\040\"\\"') == ((('b', 3),), b'four \"\\') > + > +# ABNF for a line of --list-diagnostics output. > +diagnostics_abnf = r""" > +HEXDIG = %x30-39 / %x61-6f ; lowercase a-f only > +ALPHA = %x41-5a / %x61-7a / %x7f ; letters and underscore > +ALPHA-NUMERIC = ALPHA / %x30-39 / "_" > +DQUOTE = %x22 ; " > + > +; Numbers are always hexadecimal and use a 0x prefix. > +hex-value-prefix = %x30 %x78 > +hex-value = hex-value-prefix 1*HEXDIG > + > +; Strings use octal escape sequences and \\, \". > +string-char = %x20-21 / %x23-5c / %x5d-7e ; printable but not "\ > +string-quoted-octal = %x30-33 2*2%x30-37 > +string-quoted = "\" ("\" / DQUOTE / string-quoted-octal) > +string-value = DQUOTE *(string-char / string-quoted) DQUOTE > + > +value = hex-value / string-value > + > +label = ALPHA *ALPHA-NUMERIC > +index = "[" hex-value "]" > +subscript = label [index] > + > +line = subscript *("." subscript) "=" value > +""" > + > +def check_consistency_with_manual(manual_path): > + """Verify that the code fragments in the manual match this script. > + > + The code fragments are duplicated to clarify the dual license. > + """ > + > + global errors > + > + def extract_lines(path, start_line, end_line, skip_lines=()): > + result = [] > + with open(path) as inp: > + capturing = False > + for line in inp: > + if line.strip() == start_line: > + capturing = True > + elif not capturing or line.strip() in skip_lines: > + continue > + elif line.strip() == end_line: > + capturing = False > + else: > + result.append(line) > + if not result: > + raise ValueError('{!r} not found in {!r}'.format(start_line, path)) > + if capturing: > + raise ValueError('{!r} not found in {!r}'.format(end_line, path)) > + return result > + > + def check(name, manual, script): > + global errors > + > + if manual == script: > + return > + print('error: {} fragment in manual is different'.format(name)) > + import difflib > + sys.stdout.writelines(difflib.unified_diff( > + manual, script, fromfile='manual', tofile='script')) > + errors += 1 > + > + manual_abnf = extract_lines(manual_path, > + '@c ABNF-START', '@end smallexample', > + skip_lines=('@smallexample',)) > + check('ABNF', diagnostics_abnf.splitlines(keepends=True)[1:], manual_abnf) > + > +# If the abnf module can be imported, run an additional check that the > +# 'line' production from the ABNF grammar matches --list-diagnostics > +# output lines. > +try: > + import abnf > +except ImportError: > + abnf = None > + print('info: skipping ABNF validation because the abnf module is missing') > + > +if abnf is not None: > + class Grammar(abnf.Rule): > + pass > + > + Grammar.load_grammar(diagnostics_abnf) > + > + def parse_abnf(line): > + global errors > + > + # Just verify that the line parses. > + try: > + Grammar('line').parse_all(line) > + except abnf.ParseError: > + print('error: ABNF parse error:', repr(line)) > + errors += 1 > +else: > + def parse_abnf(line): > + pass > + > + > +def parse_diagnostics(cmd): > + global errors > + diag_out = subprocess.run(cmd, stdout=subprocess.PIPE, check=True, > + universal_newlines=True).stdout > + if diag_out[-1] != '\n': > + print('error: ld.so output does not end in newline') > + errors += 1 > + > + PathType = collections.namedtuple('PathType', > + 'has_index value_type original_line') > + # Mapping tuples of labels to PathType values. > + path_types = {} > + > + seen_subscripts = {} > + > + for line in diag_out.splitlines(): > + parse_abnf(line) > + subscripts, value = parse_line(line) > + > + # Check for duplicates. > + if subscripts in seen_subscripts: > + print('error: duplicate value assignment:', repr(line)) > + print(' previous line:,', repr(seen_subscripts[line])) > + errors += 1 > + else: > + seen_subscripts[subscripts] = line > + > + # Compare types against the previously seen labels. > + labels = tuple([label for label, index in subscripts]) > + has_index = tuple([index is not None for label, index in subscripts]) > + value_type = type(value) > + if labels in path_types: > + previous_type = path_types[labels] > + if has_index != previous_type.has_index: > + print('error: line has mismatch of indexing:', repr(line)) > + print(' index types:', has_index) > + print(' previous: ', previous_type.has_index) > + print(' previous line:', repr(previous_type.original_line)) > + errors += 1 > + if value_type != previous_type.value_type: > + print('error: line has mismatch of value type:', repr(line)) > + print(' value type:', value_type.__name__) > + print(' previous: ', previous_type.value_type.__name__) > + print(' previous line:', repr(previous_type.original_line)) > + errors += 1 > + else: > + path_types[labels] = PathType(has_index, value_type, line) > + > + # Check that this line does not add indexing to a previous value. > + for idx in range(1, len(subscripts) - 1): > + if subscripts[:idx] in path_types: > + print('error: line assigns to atomic value:', repr(line)) > + print(' previous line:', repr(previous_type.original_line)) > + errors += 1 > + > + if errors: > + sys.exit(1) > + > +def get_parser(): > + parser = argparse.ArgumentParser(description=__doc__) > + parser.add_argument('--manual', > + help='path to .texi file for consistency checks') > + parser.add_argument('command', > + help='comand to run') > + return parser > + > + > +def main(argv): > + parser = get_parser() > + opts = parser.parse_args(argv) > + > + if opts.manual: > + check_consistency_with_manual(opts.manual) > + > + # Remove the initial 'env' command. > + parse_diagnostics(opts.command.split()[1:]) > + > + if errors: > + sys.exit(1) > + > +if __name__ == '__main__': > + main(sys.argv[1:]) > diff --git a/manual/install.texi b/manual/install.texi > index e8f36d5726..4c4e76fedf 100644 > --- a/manual/install.texi > +++ b/manual/install.texi > @@ -632,6 +632,13 @@ GDB, and should be compatible with the Python version in your system. > As of release time PExpect 4.8.0 is the newest verified to work to test > the pretty printers. > > +@item > +The Python @code{abnf} module. > + > +This module is optional used to verify some ABNF grammars in the manual. > +Version 2.2.0 has been confirmed to work as expected. A missing > +@code{abnf} does not reduce test coverage of the library itself. Same issues as the INSTALL file. > + > @item > GDB 7.8 or later with support for Python 2.7/3.4 or later >