From: William Cohen
To: bunsen@sourceware.org
Cc: wcohen@redhat.com
Subject: Analysis Suggestions for Bunsen
Date: Thu, 25 Jan 2024 12:31:57 -0500
Message-ID: <1a40287e-fad9-483f-bfb6-d62e618cbdce@redhat.com>

Collecting data across a wide range of architectures and Linux distributions can provide useful information about the root cause of a failure. Comparing the environments where a particular test works with those where it doesn’t gives an idea of which changes influence the test result. Below are some ideas for analyses, based on debugging SystemTap failures, that I (and others) might find useful when searching through the Bunsen data.

Check correlation between kernel and test success

Over time the kernel changes, and sometimes these changes break SystemTap. This might cause a few tests to fail or, in the worst case, cause the smoke test to fail. Checking how the test results change when the kernel version changes within a distribution might give some idea of whether the kernel is to blame. There might also be opportunities for checking between distributions, as different versions of Fedora may not update to the same kernel at the same time. Seeing a test transition from PASS to FAIL when two different versions of Fedora update to the same kernel would provide some indication that the test failure was due to a kernel change. I find myself checking the results between RHEL8/RHEL9/Fedora to see whether a particular test worked with an older kernel and then broke on newer kernels.

Check correlation between architecture and test success

There is architecture-specific code in SystemTap (and in the kernel). Having the analysis compare the results for the same distribution on different architectures is helpful. For example, PR31074 was observed on an aarch64 machine, but the particular test functioned fine on x86_64, the machine most commonly used for development. Having Bunsen analysis compare results of the same distribution running on different architectures could point out issues in machine-specific code.

Identify tests that do not reliably pass (or fail)

Given the complex interactions between the different parts of the system, it is possible that a particular test does not reliably pass or fail. An analysis could go through the time axis for a particular environment and look at how often each test transitions between passing and failing.

Allow multiple runs on the same machine environment

From run to run there are some variations in the test results.
It would be nice if there were some way to rerun the test(s) in the same environment to determine whether a test fails every time or only sporadically.

Compare variations of the *syscall.exp tests

As mentioned above, some tests occasionally fail. Multiple runs in the same environment would be one way to identify those. Another way would be to compare tests that one would expect to have the same results. The *syscall.exp tests run a number of variations using different probe techniques plus 32-bit and 64-bit variants. Comparing the multiple tests for the same syscall would be an additional way to identify sporadic failures.

Analysis that orders output based on the “Freshest Failures”

Generally, I am much more interested in addressing failures that are recent. It might be nice to have an analysis showing transitions from PASS to FAIL, ordered from newest to oldest in the test results. That would highlight which tests failed due to recent changes. It might be worth filtering out the unreliable tests.

Have some way to annotate and indicate that a particular commit should fix a test

Often commits are made to fix specific failures in the testsuite. It would be nice if there were some way to communicate that information back to Bunsen. Bunsen could use it to flag cases where the fix is incomplete and the test is still failing in some situations.

-Will Cohen
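
To make the "unreliable tests" and "Freshest Failures" ideas above a bit more concrete, here is a rough Python sketch. It does not use Bunsen's actual API or data model: the Result record, the helper names, and the max_flips threshold are all hypothetical, and it assumes results for a single environment have already been extracted with one PASS or FAIL outcome per test per run.

```python
from collections import defaultdict
from typing import NamedTuple


class Result(NamedTuple):
    """One test outcome; a hypothetical record, not Bunsen's schema."""
    test: str      # test name, e.g. "foo.exp"
    run: int       # run ordinal within one environment, larger = newer
    outcome: str   # "PASS" or "FAIL"


def outcomes_by_test(results):
    """Group results per test, ordered from oldest run to newest."""
    grouped = defaultdict(list)
    for r in sorted(results, key=lambda r: r.run):
        grouped[r.test].append(r)
    return grouped


def count_transitions(results):
    """Count how often each test flips between PASS and FAIL over time."""
    flips = {}
    for test, history in outcomes_by_test(results).items():
        flips[test] = sum(
            1 for prev, cur in zip(history, history[1:])
            if prev.outcome != cur.outcome
        )
    return flips


def freshest_failures(results, max_flips=1):
    """Return (run, test) pairs for PASS -> FAIL transitions, newest first.

    Tests with more than max_flips transitions are treated as unreliable
    and skipped; the default threshold is an arbitrary assumption.
    """
    flips = count_transitions(results)
    found = []
    for test, history in outcomes_by_test(results).items():
        if flips[test] > max_flips:
            continue  # filter out tests that do not reliably pass or fail
        for prev, cur in zip(history, history[1:]):
            if prev.outcome == "PASS" and cur.outcome == "FAIL":
                found.append((cur.run, test))
    return sorted(found, reverse=True)
```

Calling count_transitions() on the results for one environment gives a per-test flip count that could serve as a simple reliability measure, and freshest_failures() uses the same count to drop the noisy tests before listing the most recent regressions. A real analysis would also have to handle the other DejaGnu outcomes and tests that are missing from some runs.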