From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=c+/J=JD=redhat.com=wcohen@sourceware.org>
Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124])
	by sourceware.org (Postfix) with ESMTPS id 0E83F3858C53
	for <bunsen@sourceware.org>; Thu, 25 Jan 2024 17:32:00 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 0E83F3858C53
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com
ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 0E83F3858C53
Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=170.10.129.124
ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1706203922; cv=none;
	b=kamFY9+vAs1m5Us16/BIoWVx18zlk4VWx9vcsqwjqTa4mkD1ioeZBvpNONek4Ec+5qfPNygJfWFrZhZ8Z3hdfz+Tq51AMLeU1PbC+YOKhibQkFoDDg2j0Pk2fphOfQr36ATPov5hVLZsA1CTCd97AlZGSVxoAP3EcMX7YxvxdAg=
ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key;
	t=1706203922; c=relaxed/simple;
	bh=37otvjcIc0S+/YwQFne2rufWlqTJztDzPAOXlMag6l8=;
	h=DKIM-Signature:Message-ID:Date:MIME-Version:To:From:Subject; b=LnRK7IeJSpxRwWgoHhaFLtc3dioo2250Gg0cadTn8C4mwBjSCwBAeCYbMWNolE8Y1VOtlqSdL6gmEuAsuUz2Iu4PNnPHGMXhp1Lwq6VqdymbJVzhHIhfjj2SnkxA9qzt7CQSSJ+vk2REVRs/LkEltpCg1sqbICnPvQqA/X/nc7U=
ARC-Authentication-Results: i=1; server2.sourceware.org
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
	s=mimecast20190719; t=1706203919;
	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
	 to:to:cc:cc:mime-version:mime-version:content-type:content-type:
	 content-transfer-encoding:content-transfer-encoding;
	bh=Rjh1hV/npeuRYQOdt7PCQgrhLU/VcFVIye8M09PVyGk=;
	b=HKU7TubAuOjIedS45hUVzZdToPBGSgpwD8W7bNQ+V1P6Gny3RjABFj8JZc+GaFLk40TbPR
	g4Hr9FfgPsIhL2BDKqVVFHzVbxYaRh9H2CoFVqcHywFq7dcnN65shbvWACJSN+n2FdtE+U
	732ZTzL9EHrS9ydHSJGZPmFigQdoB8s=
Received: from mimecast-mx02.redhat.com (mx-ext.redhat.com [66.187.233.73])
 by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3,
 cipher=TLS_AES_256_GCM_SHA384) id us-mta-537-P6W8Q8CwOVOBENtwwS73Zw-1; Thu,
 25 Jan 2024 12:31:58 -0500
X-MC-Unique: P6W8Q8CwOVOBENtwwS73Zw-1
Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
	(No client certificate requested)
	by mimecast-mx02.redhat.com (Postfix) with ESMTPS id CF2713C2A1C8
	for <bunsen@sourceware.org>; Thu, 25 Jan 2024 17:31:57 +0000 (UTC)
Received: from [10.13.129.81] (dhcp129-81.rdu.redhat.com [10.13.129.81])
	by smtp.corp.redhat.com (Postfix) with ESMTP id BE97D2166B33;
	Thu, 25 Jan 2024 17:31:57 +0000 (UTC)
Message-ID: <1a40287e-fad9-483f-bfb6-d62e618cbdce@redhat.com>
Date: Thu, 25 Jan 2024 12:31:57 -0500
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Cc: wcohen@redhat.com
To: bunsen@sourceware.org
From: William Cohen <wcohen@redhat.com>
Subject: Analysis Suggestions for Bunsen
X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.6
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
Content-Language: en-US
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Spam-Status: No, score=-4.8 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H4,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <bunsen.sourceware.org>

Collecting data across a wide range of architectures and Linux
distributions can provide useful information about the root cause
of a failure.  Comparing the environments where a particular test works
with those where it does not gives an idea of which changes influence
the test result.  Below are some ideas for analyses, based on my
experience debugging SystemTap failures, that I (and others) might find
useful when searching through the Bunsen data.


Check correlation between kernel and test success

Over time there are changes in the kernel, and sometimes these changes
break SystemTap.  This might cause a few tests to fail or, in the worst
case, cause the smoke test to fail.  Checking the changes in the test
results when the kernel version changes within a distribution might
give some idea whether the kernel is to blame.


There might also be some opportunities for checking between
distributions as versions of Fedora may not update to the same kernel
at the same time.  Looking at the test results and seeing a test
transition from PASS to FAIL when two different versions of Fedora
update to the same kernel would provide some indication whether the
test failure was due to a kernel change.

I find myself checking the results between RHEL8/RHEL9/Fedora to see
if a particular test worked with an older kernel and then broke on
newer kernels.
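
A rough sketch of the kind of check I have in mind.  This is not
Bunsen's API: the flattened (distro, kernel, test, outcome) records,
the function names, and the sample kernel versions and test name are
all made up for illustration.

```python
from collections import defaultdict

def kernel_transitions(records):
    """records: (distro, kernel, test, outcome) tuples, oldest first.

    Return (test, distro, old_kernel, new_kernel) for every PASS -> FAIL
    change that coincides with a kernel change within one distro.
    """
    last = {}  # (distro, test) -> (kernel, outcome) of the previous run
    found = []
    for distro, kernel, test, outcome in records:
        prev = last.get((distro, test))
        if prev and prev[0] != kernel and prev[1] == "PASS" and outcome == "FAIL":
            found.append((test, distro, prev[0], kernel))
        last[(distro, test)] = (kernel, outcome)
    return found

def suspect_kernels(records):
    """Group the transitions by (test, new kernel).  A kernel that breaks
    the same test in several distros is a stronger suspect."""
    by_kernel = defaultdict(set)
    for test, distro, _old, new in kernel_transitions(records):
        by_kernel[(test, new)].add(distro)
    return {key: sorted(distros) for key, distros in by_kernel.items()}

# Hypothetical sample data: two Fedora versions picking up the same kernel.
sample = [
    ("fedora38", "6.5.6", "example.exp", "PASS"),
    ("fedora39", "6.5.6", "example.exp", "PASS"),
    ("fedora39", "6.6.2", "example.exp", "FAIL"),
    ("fedora38", "6.6.2", "example.exp", "FAIL"),
]
print(suspect_kernels(sample))
```

With the sample above, both Fedora versions show the test breaking on
the same new kernel, which is the pattern that would point at the
kernel rather than at the rest of the distribution.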


Check correlation between architecture and test success

There is architecture-specific code in SystemTap (and in the kernel).
Having the analysis compare the results for the same distribution
on different architectures would be helpful.  For example, PR31074 was
observed on an aarch64 machine, but the test in question worked fine on
x86_64, the architecture most commonly used for development.  Such a
comparison could point out issues in machine-specific code.
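
Something along these lines would do, as a sketch only.  The record
layout (distro, arch, test, outcome), the function name, and the test
names are hypothetical and not taken from Bunsen.

```python
from collections import defaultdict

def arch_specific_failures(results):
    """results: iterable of (distro, arch, test, outcome).

    Return {(distro, test): {"FAIL": [archs], "PASS": [archs]}} for tests
    that fail on some architectures and pass on others within one distro.
    """
    by_test = defaultdict(dict)
    for distro, arch, test, outcome in results:
        by_test[(distro, test)][arch] = outcome
    mixed = {}
    for key, per_arch in by_test.items():
        failing = sorted(a for a, o in per_arch.items() if o == "FAIL")
        passing = sorted(a for a, o in per_arch.items() if o == "PASS")
        if failing and passing:
            mixed[key] = {"FAIL": failing, "PASS": passing}
    return mixed

# Hypothetical sample: one test fails only on aarch64.
sample = [
    ("fedora39", "x86_64", "example.exp", "PASS"),
    ("fedora39", "aarch64", "example.exp", "FAIL"),
    ("fedora39", "x86_64", "other.exp", "PASS"),
    ("fedora39", "aarch64", "other.exp", "PASS"),
]
print(arch_specific_failures(sample))
```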


Identify tests that do not reliably pass (or fail)

Given the complex interactions between the different parts of the
system, a particular test may neither pass nor fail reliably.  It would
be useful to have an analysis that goes along the time axis for a
particular environment and reports how often each test transitions
between passing and failing.
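
One possible measure, sketched below, is the fraction of consecutive
runs whose outcome differs.  The function names and the 0.3 threshold
are my own assumptions, picked only for illustration.

```python
def flip_rate(outcomes):
    """Fraction of adjacent run pairs whose outcome differs.

    0.0 means the test never changed; 1.0 means it changed on every run.
    """
    if len(outcomes) < 2:
        return 0.0
    flips = sum(1 for a, b in zip(outcomes, outcomes[1:]) if a != b)
    return flips / (len(outcomes) - 1)

def unreliable_tests(history, threshold=0.3):
    """history: {test: [outcomes, oldest first]} for one environment.

    The threshold is arbitrary; a single clean regression in a long
    history stays well below it.
    """
    return sorted(t for t, runs in history.items() if flip_rate(runs) >= threshold)

# Hypothetical sample: one flaky test, one test with a single regression.
sample = {
    "flaky.exp": ["PASS", "FAIL", "PASS", "FAIL", "PASS"],
    "regressed.exp": ["PASS", "PASS", "PASS", "PASS", "FAIL"],
}
print(unreliable_tests(sample))
```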


Allow multiple runs on the same machine environment

Test results vary somewhat from run to run.  It
would be nice if there was some way to rerun the test(s) in the same
environment to determine whether the test is failing every time or is
sporadic.
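
If such reruns were available, classifying a test would be simple.  A
minimal sketch, assuming the rerun outcomes for one test in one
environment are available as a plain list (the function name and the
labels it returns are hypothetical):

```python
from collections import Counter

def classify_reruns(outcomes):
    """Classify repeated runs of one test in one unchanged environment."""
    counts = Counter(outcomes)
    if not counts:
        return "no data"
    if len(counts) == 1:
        # Every rerun gave the same outcome.
        return "always " + next(iter(counts))
    return "sporadic"

print(classify_reruns(["FAIL", "FAIL", "FAIL"]))
print(classify_reruns(["PASS", "FAIL", "PASS"]))
```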


Compare variations of the *syscall.exp tests

As mentioned above, some tests occasionally fail.  Multiple runs in the
same environment would be one way to identify those.  Another way
would be to compare tests that one would expect to have
the same results.  The *syscall.exp tests run a number of variations
using different probe techniques plus 32-bit and 64-bit
variants.  Comparing the multiple tests for the same syscall would be
an additional way to identify sporadic failures.
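
As a sketch of that comparison: group the results by syscall and
report the syscalls whose variants disagree.  The variant labels, the
record layout, and the function name below are placeholders, not the
names the testsuite or Bunsen actually use.

```python
from collections import defaultdict

def disagreeing_syscalls(results):
    """results: iterable of (variant, syscall, outcome) from one test run.

    Return {syscall: {variant: outcome}} for syscalls whose variants do
    not all report the same outcome.
    """
    by_syscall = defaultdict(dict)
    for variant, syscall, outcome in results:
        by_syscall[syscall][variant] = outcome
    return {
        syscall: outcomes
        for syscall, outcomes in by_syscall.items()
        if len(set(outcomes.values())) > 1
    }

# Hypothetical sample: placeholder variant labels for probe technique
# and word size.
sample = [
    ("techA-64", "open", "PASS"),
    ("techA-32", "open", "PASS"),
    ("techB-64", "open", "FAIL"),
    ("techA-64", "read", "PASS"),
    ("techB-64", "read", "PASS"),
]
print(disagreeing_syscalls(sample))
```

A disagreement is only a hint.  It could be a sporadic failure, or a
real bug in one probe technique, so the list is a starting point for
looking rather than a verdict.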


Analysis that orders output based on the “Freshest Failures”

Generally, I am much more interested in addressing failures that are
recent.  It might be nice to have an analysis showing transitions from
PASS to FAIL ordered from newest to oldest in the test results.  That
would highlight which tests failed due to recent changes.  It might be
worth filtering out the unreliable tests.
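
A sketch of such an ordering, assuming each test's history is a list
of (date, outcome) pairs with ISO dates so that string sorting matches
time order.  The names and the sample data are hypothetical.

```python
def freshest_failures(history, skip=()):
    """history: {test: [(iso_date, outcome), ...]}, oldest first.

    Return [(iso_date, test)] for the most recent PASS -> FAIL transition
    of each test, newest first.  Tests named in `skip` (for example known
    unreliable ones) are left out.
    """
    found = []
    for test, runs in history.items():
        if test in skip:
            continue
        # Walk backwards so the first hit is the latest transition.
        for i in range(len(runs) - 1, 0, -1):
            if runs[i - 1][1] == "PASS" and runs[i][1] == "FAIL":
                found.append((runs[i][0], test))
                break
    return sorted(found, reverse=True)

# Hypothetical sample data.
sample = {
    "a.exp": [("2024-01-01", "PASS"), ("2024-01-10", "FAIL")],
    "b.exp": [("2024-01-01", "PASS"), ("2024-01-20", "FAIL"), ("2024-01-21", "FAIL")],
    "c.exp": [("2024-01-01", "PASS"), ("2024-01-21", "PASS")],
    "flaky.exp": [("2024-01-01", "PASS"), ("2024-01-22", "FAIL")],
}
print(freshest_failures(sample, skip={"flaky.exp"}))
```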


Have some way to annotate and indicate that a particular commit should fix a test

Often commits are made to fix specific failures in the testsuite.  It
would be nice if there was some way to communicate that information
back to Bunsen.  Bunsen could use that to flag whether the fix is
incomplete and the test is still failing in some situations.
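
How the annotation would be stored is an open question.  Purely as an
illustration, suppose it is a mapping from commit id to the tests the
commit is meant to fix, and that each result records which annotated
commits were present in the tested source.  All names, ids, and
layouts here are invented.

```python
def incomplete_fixes(annotations, results):
    """annotations: {commit_id: set of tests the commit is meant to fix}.
    results: iterable of (environment, test, outcome, commits_included),
    where commits_included is the set of annotated commits present in
    the source that was tested.

    Return {commit_id: [(environment, test), ...]} for tests that still
    fail even though the fixing commit was included.
    """
    still_failing = {}
    for env, test, outcome, included in results:
        if outcome != "FAIL":
            continue
        for commit, tests in annotations.items():
            if commit in included and test in tests:
                still_failing.setdefault(commit, []).append((env, test))
    return still_failing

# Hypothetical sample: the fix works on x86_64 but not on aarch64.
annotations = {"abc1234": {"example.exp"}}
results = [
    ("fedora39-x86_64", "example.exp", "PASS", {"abc1234"}),
    ("fedora39-aarch64", "example.exp", "FAIL", {"abc1234"}),
    ("rhel9-x86_64", "example.exp", "FAIL", set()),  # fix not yet included
]
print(incomplete_fixes(annotations, results))
```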

-Will Cohen