From mboxrd@z Thu Jan 1 00:00:00 1970
Mime-Version: 1.0
Message-Id: <3ad2b380-d3f5-438d-bd38-3f16470a159a@www.fastmail.com>
In-Reply-To: <20210818192639.2362335-2-keiths@redhat.com>
References: <20210818192639.2362335-1-keiths@redhat.com> <20210818192639.2362335-2-keiths@redhat.com>
Date: Thu, 19 Aug 2021 08:40:55 -0400
From: "Serhei Makarov"
To: "Keith Seitz" , Bunsen
Subject: Re: [PATCH 1/4] Rewrite gdb.parse_dejagnu_sum
Content-Type: text/plain
List-Id: Bunsen mailing list

Hello Keith,

I had a look, the 4 patches look good to commit. I will do additional testing with the SystemTap data next week.
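
For anyone checking import results by hand, the duplicate-handling scheme of normalize_name in the quoted patch can be tried in isolation. What follows is only a minimal sketch and not the patch's code: dedup_name, STATUS_RE and the shortened status list are illustrative assumptions, and the real function strips many more status suffixes.

```python
import re

# Illustrative subset of the constant status suffixes; the list in the
# quoted patch is considerably longer.
STATUS_RE = re.compile(r' \((PRMS.*|timeout|eof|GDB internal error)\)')

def dedup_name(seen, name):
    """Strip a known status suffix, then make the name unique within SEEN."""
    base = STATUS_RE.sub('', name)
    candidate = base
    i = 2
    # Append ' <<N>>' with N >= 2 until the name is no longer taken.
    while candidate in seen:
        candidate = f'{base} <<{i}>>'
        i += 1
    seen[candidate] = candidate
    return candidate

seen = {}
results = [dedup_name(seen, n) for n in
           ['continue to main',
            'continue to main (timeout)',
            'continue to main',
            'step']]
print(results)
```

The dictionary would be cleared whenever the .exp file changes, as the patch does with names.clear(), so the numbering restarts for each test file.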
All the best,
    Serhei

On Wed, Aug 18, 2021, at 3:26 PM, Keith Seitz via Bunsen wrote:
> This patch rewrites gdb.parse_dejagnu_sum, making it significantly
> simpler and more reliable. With the current version of this function,
> I have been consistently seeing 8,000+ "missing" tests -- tests
> that are recorded in gdb.sum but never make it into the Bunsen
> database.
>
> After chasing down a number of problems, I found it was much easier
> to simply rewrite this function. Consequently, my Bunsen imports of
> gdb.sum now account for every test.
> ---
>  scripts-main/gdb/parse_dejagnu.py | 206 ++++++++++++++++--------------
>  1 file changed, 110 insertions(+), 96 deletions(-)
>
> diff --git a/scripts-main/gdb/parse_dejagnu.py b/scripts-main/gdb/parse_dejagnu.py
> index b56ed3c..fc6a30e 100755
> --- a/scripts-main/gdb/parse_dejagnu.py
> +++ b/scripts-main/gdb/parse_dejagnu.py
> @@ -71,116 +71,126 @@ def get_outcome_subtest(line):
>      if m is None: return None
>      return m.group('outcome'), m.group('subtest')
>
> +# Normalize the test named NAME. NAME_DICT is used to track these.
> +#
> +# This is unfortunately quite complex.
> +
> +def normalize_name(name_dict, name):
> +
> +    assert(name is not None)
> +    assert(name != "")
> +
> +    # The buildbot does:
> +    #
> +    # test_name = re.sub (r'(\s+)? \(.*\)$', r'', orig_name)
> +    #
> +    # But this is overly aggressive, causing thousands of duplicate
> +    # names to be recorded.
> +    #
> +    # Instead, try to remove known constant statuses. Unfortunately, this is
> +    # quite slow, but it is the most reliable way to avoid 10,000 duplicate
> +    # names from invading the database.
> +    test_name = re.sub(r' \((PRMS.*|timeout|eof|GDB internal error'
> +                       r'|the program exited|the program is no longer running'
> +                       r'|got interactive prompt|got breakpoint menu'
> +                       r'|resync count exceeded|bad file format|file not found'
> +                       r'|incomplete note section|unexpected output'
> +                       r'|inferior_not_stopped|stopped at wrong place'
> +                       r'|unknown output after running|dwarf version unhandled'
> +                       r'|line numbers scrambled\?|out of virtual memory'
> +                       r'|missing filename|missing module'
> +                       r'|missing /usr/bin/prelink)\)', r'', name)
> +
> +    if test_name in name_dict:
> +        # If the test is already present in the file list, then
> +        # we include a unique identifier in the end of it, in the
> +        # form of '<<N>>' (where N is a number >= 2). This is
> +        # useful because the GDB testsuite is full of non-unique
> +        # test messages.
> +        i = 2
> +        while True:
> +            nname = test_name + ' <<' + str (i) + '>>'
> +            if nname not in name_dict:
> +                break
> +            i += 1
> +        test_name = nname
> +
> +    name_dict[test_name] = test_name
> +    return test_name
> +
>  def parse_dejagnu_sum(testrun, sumfile, all_cases=None,
>                        consolidate_pass=False, verbose=True):
> -#                      consolidate_pass=True, verbose=True):
>      if testrun is None: return None
>      f = openfile_or_xz(sumfile)
>
>      last_exp = None
> -    last_test_passed = False # at least one pass and no fails
> -    last_test_failed = False # at least one fail
> -    failed_subtests = [] # XXX Better known as 'unpassed'?
> -    passed_subtests = []
> -    failed_subtests_summary = 0
> -    passed_subtests_summary = 0
> +    counts = dict()
> +    names = dict()
> +
> +    # The global test_outcome_map doesn't contain all of our
> +    # outcomes. Add those now.
> +    test_outcome_map['PATH'] = 'PATH' # Tests with paths in their names
> +
> +    # Clear counts dictionary
> +    counts = dict.fromkeys(test_outcome_map, 0)
>
> +    # Iterate over lines in the sum file.
>      for cur in Cursor(sumfile, path=os.path.basename(sumfile), input_stream=f):
>          line = cur.line
>
> -        # XXX need to handle several .sum formats
> -        # buildbot format :: all lines are outcome lines, include the .exp
> -        # regular format :: outcome lines separated by "Running .exp ..."
> -        outcome, expname, subtest = None, None, None
> +        # There's always an exception. ERRORs are not output the same
> +        # way as other test results. They simply list a reason.
> +        # FIXME: ERRORs typically span a range of lines
>          info = get_expname_subtest(line)
> -        if info is not None:
> -            outcome, expname, subtest = info
> -        elif (line.startswith("Running") and ".exp ..." in line):
> -            outcome = None
> -            expname = get_running_exp(line)
> -        else:
> -            info = get_outcome_subtest(line)
> -            if info is not None:
> -                outcome, subtest = info
> -
> -        # XXX these situations mark an .exp boundary:
> -        finished_exp = False
> -        if expname != last_exp and expname is not None and last_exp is not None:
> -            finished_exp = True
> -        elif "Summary ===" in line:
> -            finished_exp = True
> -
> -        if finished_exp:
> -            running_cur.line_end = cur.line_end-1
> -            if consolidate_pass and last_test_passed:
> -                testrun.add_testcase(name=last_exp,
> -                                     outcome='PASS',
> -                                     origin_sum=running_cur)
> -            elif last_test_passed:
> -                # Report each passed subtest individually:
> -                for passed_subtest, outcome, cursor in passed_subtests:
> -                    testrun.add_testcase(name=last_exp,
> -                                         outcome=outcome,
> -                                         subtest=passed_subtest,
> -                                         origin_sum=cursor)
> -            # Report all failed and untested subtests:
> -            for failed_subtest, outcome, cursor in failed_subtests:
> -                testrun.add_testcase(name=last_exp,
> -                                     outcome=outcome,
> -                                     subtest=failed_subtest,
> -                                     origin_sum=cursor)
> -
> -        if expname is not None and expname != last_exp:
> -            last_exp = expname
> -            running_cur = Cursor(start=cur)
> -            last_test_passed = False
> -            last_test_failed = False
> -            failed_subtests = []
> -            passed_subtests = []
> +        if info is None:
> +            if line.startswith('ERROR:'):
> +                # In this case, the "subtest" is actually the reason
> +                # for the failure. LAST_EXP is not necessarily strictly
> +                # correct, but we would have to watch for additional
> +                # messages (Running TESTFILE ...) to make this work properly.
> +                # In practice, it's not typically a problem.
> +                info = ('ERROR', last_exp, line[len('ERROR: '):])
> +            elif line.endswith(".exp:\n"):
> +                # An unnamed test. It happens.
> +                line = line[:-1] + " " + "UNNAMED_TEST" + "\n"
> +                info = get_expname_subtest(line)
> +                if info is None:
> +                    # We tried. Nothing else we can do.
> +                    print("WARNING: unknown expname/subtest in outcome line --", line, file=sys.stderr)
> +                    continue
> +            else:
> +                continue
>
> -        if outcome is None:
> +        outcome, expname, subtest = info
> +
> +        # Warn and skip any outcome that is not in test_outcome_map!
> +        # It will cause an exception later.
> +        if outcome not in test_outcome_map:
> +            print(f'WARNING: unexpected test outcome ({outcome}) in line -- {line}')
>              continue
> -        # XXX The line contains a test outcome.
> -        synth_line = line
> -        if all_cases is not None and expname is None:
> -            # XXX force embed the expname into the line for later annotation code
> -            synth_line = str(outcome) + ": " + last_exp + ": " + str(subtest)
> -        all_cases.append(synth_line)
> -
> -        # TODO: Handle other dejagnu outcomes if they show up:
> -        if line.startswith("FAIL: ") \
> -           or line.startswith("KFAIL: ") \
> -           or line.startswith("XFAIL: ") \
> -           or line.startswith("ERROR: tcl error sourcing"):
> -            last_test_failed = True
> -            last_test_passed = False
> -            failed_subtests.append((line,
> -                                    check_mapping(line, test_outcome_map, start=True),
> -                                    cur)) # XXX single line
> -            failed_subtests_summary += 1
> -        if line.startswith("UNTESTED: ") \
> -           or line.startswith("UNSUPPORTED: ") \
> -           or line.startswith("UNRESOLVED: "):
> -            # don't update last_test_{passed,failed}
> -            failed_subtests.append((line,
> -                                    check_mapping(line, test_outcome_map, start=True),
> -                                    cur))
> -            # don't tally
> -        if line.startswith("PASS: ") \
> -           or line.startswith("XPASS: ") \
> -           or line.startswith("IPASS: "):
> -            if not last_test_failed: # no fails so far
> -                last_test_passed = True
> -            if not consolidate_pass:
> -                passed_subtests.append((line,
> -                                        check_mapping(line, test_outcome_map, start=True),
> -                                        cur))
> -            passed_subtests_summary += 1
> -    f.close()
> +        if last_exp != expname:
> +            last_exp = expname
> +            names.clear()
> +
> +        # Normalize the name to account for duplicates.
> +        subtest = normalize_name(names, subtest)
>
> -    testrun.pass_count = passed_subtests_summary
> -    testrun.fail_count = failed_subtests_summary
> +        if all_cases is not None:
> +            # ERRORs are not appended to outcome_lines!
> +            if outcome != "ERROR":
> +                all_cases.append(line)
>
> +        if consolidate_pass:
> +            pass # not implemented
> +        else:
> +            testrun.add_testcase(name=expname, outcome=outcome,
> +                                 subtest=subtest, origin_sum=cur)
> +        counts[outcome] += 1
> +    f.close()
> +
> +    testrun.pass_count = counts['PASS'] + counts['XPASS'] + counts['KPASS']
> +    testrun.fail_count = counts['FAIL'] + counts['XFAIL'] + counts['KFAIL'] \
> +        + counts['ERROR'] # UNTESTED, UNSUPPORTED, UNRESOLVED not tallied
>      return testrun
>
>  def annotate_dejagnu_log(testrun, logfile, outcome_lines=[],
> @@ -218,7 +228,11 @@ def annotate_dejagnu_log(testrun, logfile, outcome_lines=[],
>      # (1b) Build a map of outcome_lines:
>      testcase_line_start = {} # .exp name -> index of first outcome_line with this name
>      for j in range(len(outcome_lines)):
> -        outcome, expname, subtest = get_expname_subtest(outcome_lines[j])
> +        info = get_expname_subtest(outcome_lines[j])
> +        if info is None:
> +            print("WARNING: unknown expname/subtest in outcome line --", outcome_lines[j], file=sys.stderr)
> +            continue
> +        outcome, expname, subtest = info
>          if expname not in testcase_line_start:
>              testcase_line_start[expname] = j
>
> --
> 2.31.1
>
>

--
All the best,
    Serhei
    http://serhei.io