public inbox for bunsen@sourceware.org
* Analysis Suggestions for Bunsen
@ 2024-01-25 17:31 William Cohen
From: William Cohen @ 2024-01-25 17:31 UTC (permalink / raw)
  To: bunsen; +Cc: wcohen

Collecting data across a wide range of architectures and Linux
distributions can provide useful information about the root cause
of a failure.  Comparing the environments where a particular test
passes with those where it fails gives an idea of which changes
influence the test result.  Below are some ideas for analyses, based
on debugging SystemTap failures, that I (and others) might find useful
when searching through the Bunsen data.


Check correlation between kernel and test success

Over time there are changes in the kernel and sometimes these changes
break SystemTap.  This might cause a few tests to fail or, in the worst
case, cause the smoke test to fail.  Checking the changes in the test
results when the kernel version changes within a distribution might
give some idea whether the kernel is to blame.


There might also be some opportunities for comparing across
distributions, as versions of Fedora may not update to the same kernel
at the same time.  Seeing a test transition from PASS to FAIL on two
different versions of Fedora, each at the point where it updates to
the same kernel, would provide some indication that the test failure
was due to a kernel change.

I find myself checking the results between RHEL8/RHEL9/Fedora to see
if a particular test worked with an older kernel and then broke on
newer kernels.
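
As a rough sketch of this kind of check: assuming the results can be
flattened into (distro, kernel, test, outcome) tuples listed oldest
first (a made-up layout for illustration, not Bunsen's actual schema,
with made-up test names and kernel versions), something like the
following could report which PASS to FAIL transitions coincide with
the same kernel update on more than one distribution.

```python
from collections import defaultdict

# Hypothetical records, oldest first: (distro, kernel, test, outcome).
RUNS = [
    ("fedora38", "6.5.6", "example/foo.exp", "PASS"),
    ("fedora38", "6.6.2", "example/foo.exp", "FAIL"),
    ("fedora39", "6.5.6", "example/foo.exp", "PASS"),
    ("fedora39", "6.6.2", "example/foo.exp", "FAIL"),
    ("fedora39", "6.6.2", "example/bar.exp", "PASS"),
]


def kernel_transitions(runs):
    """Map (test, old_kernel, new_kernel) to the distros where the test
    went from PASS to FAIL across that kernel change."""
    last = {}  # (distro, test) -> (kernel, outcome) of the previous run
    hits = defaultdict(set)
    for distro, kernel, test, outcome in runs:
        prev = last.get((distro, test))
        if prev and prev[0] != kernel and prev[1] == "PASS" and outcome == "FAIL":
            hits[(test, prev[0], kernel)].add(distro)
        last[(distro, test)] = (kernel, outcome)
    return dict(hits)


for key, distros in kernel_transitions(RUNS).items():
    print(key, sorted(distros))
```

A transition that shows up on several distributions for the same pair
of kernel versions is a stronger hint that the kernel is to blame than
one seen on a single distribution.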


Check correlation between architecture and test success

There is architecture-specific code in SystemTap (and in the kernel).
Having the analysis compare the results for the same distribution
on different architectures is helpful.  For example, PR31074 was
observed on an aarch64 machine, but the particular test worked fine on
x86_64, the architecture most commonly used for development.  Having
Bunsen analysis compare results of the same distribution running on
different architectures could point out issues in machine-specific code.
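
A minimal sketch of such a comparison, assuming results are available
as (distro, arch, test, outcome) tuples (a hypothetical layout with
made-up test names, not something Bunsen provides in this form):

```python
from collections import defaultdict

# Hypothetical records: (distro, arch, test, outcome).
RESULTS = [
    ("fedora39", "x86_64", "example/a.exp", "PASS"),
    ("fedora39", "aarch64", "example/a.exp", "FAIL"),
    ("fedora39", "x86_64", "example/b.exp", "PASS"),
    ("fedora39", "aarch64", "example/b.exp", "PASS"),
]


def arch_specific_failures(results):
    """Return {(distro, test): [failing arches]} for tests whose outcome
    differs between architectures of the same distribution."""
    by_test = defaultdict(dict)
    for distro, arch, test, outcome in results:
        by_test[(distro, test)][arch] = outcome
    report = {}
    for key, outcomes in by_test.items():
        # Only interesting when the architectures disagree.
        if len(set(outcomes.values())) > 1:
            report[key] = sorted(a for a, o in outcomes.items() if o == "FAIL")
    return report


print(arch_specific_failures(RESULTS))
```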


Identify tests that do not reliably pass (or fail)

Given the complex interactions of the different parts of the system, it
is possible that a particular test may not reliably pass or fail.
It would be useful to have analysis that goes along the time axis for
a particular environment and looks at how often each test transitions
between passing and failing.
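
One simple way to express this, sketched under the assumption that
each test's outcomes in one environment are available as a
time-ordered list (the names and the threshold here are arbitrary
choices for illustration):

```python
def flip_count(history):
    """Number of times consecutive outcomes differ in a time-ordered list."""
    return sum(1 for a, b in zip(history, history[1:]) if a != b)


def unreliable_tests(histories, min_flips=2):
    """Tests whose outcome changed at least min_flips times."""
    return sorted(t for t, h in histories.items() if flip_count(h) >= min_flips)


# Hypothetical per-test histories for a single environment, oldest first.
HISTORIES = {
    "example/flaky.exp": ["PASS", "FAIL", "PASS", "FAIL"],
    "example/regressed.exp": ["PASS", "PASS", "FAIL", "FAIL"],
    "example/stable.exp": ["PASS", "PASS", "PASS", "PASS"],
}

print(unreliable_tests(HISTORIES))
```

A test that regressed once and stayed broken flips only once, so it is
not reported as unreliable, while a test that bounces back and forth is.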


Allow multiple runs on the same machine environment

From run to run there are some variations in the test results.  It
would be nice if there was some way to rerun the test(s) in the same
environment to determine whether the test is failing every time or is
sporadic.


Compare variations of the *syscall.exp tests

As mentioned above, some tests occasionally fail.  Multiple runs in the
same environment would be one way to identify those.  Another way to
identify them would be to compare tests that one would expect to have
the same results.  The *syscall.exp tests run a number of variations
using different probe techniques plus 32-bit and 64-bit
variants.  Comparing the multiple tests for the same syscall would be
an additional way to identify sporadic failures.
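
A sketch of that comparison, assuming each subtest result can be
reduced to (variant, bits, syscall, outcome); the variant labels and
the data below are only illustrative:

```python
from collections import defaultdict

# Hypothetical records: (variant, bits, syscall, outcome).
RESULTS = [
    ("syscall.exp", 64, "openat", "PASS"),
    ("syscall.exp", 32, "openat", "PASS"),
    ("nd_syscall.exp", 64, "openat", "FAIL"),
    ("syscall.exp", 64, "read", "PASS"),
    ("nd_syscall.exp", 64, "read", "PASS"),
]


def disagreeing_syscalls(results):
    """Return {syscall: {(variant, bits): outcome}} for syscalls whose
    variants did not all produce the same outcome."""
    by_syscall = defaultdict(dict)
    for variant, bits, syscall, outcome in results:
        by_syscall[syscall][(variant, bits)] = outcome
    return {s: v for s, v in by_syscall.items() if len(set(v.values())) > 1}


for syscall, outcomes in disagreeing_syscalls(RESULTS).items():
    print(syscall, outcomes)
```

A disagreement does not prove the failure is sporadic (one probe
technique may be genuinely broken), but it narrows down where to look.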


Analysis that orders output based on the “Freshest Failures”

Generally, I am much more interested in addressing failures that are
recent.  It might be nice to have analysis showing transitions from
PASS to FAIL, ordered from newest to oldest in the test results.  That
would highlight which tests failed due to recent changes.  It might be
worth filtering out the unreliable tests.
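
A minimal sketch of such an ordering, assuming each test has a
time-ordered list of (ISO date, outcome) pairs and that a set of
known-unreliable tests is supplied separately (both are assumptions
for illustration):

```python
def freshest_failures(histories, skip=()):
    """Return (date, test) pairs for the most recent PASS-to-FAIL
    transition of each test, newest first, ignoring tests in skip."""
    found = []
    for test, history in histories.items():
        if test in skip:
            continue
        dates = [when
                 for (_, before), (when, after) in zip(history, history[1:])
                 if before == "PASS" and after == "FAIL"]
        if dates:
            # ISO dates sort correctly as plain strings.
            found.append((max(dates), test))
    return sorted(found, reverse=True)


# Hypothetical histories, oldest first.
HISTORIES = {
    "example/a.exp": [("2024-01-01", "PASS"), ("2024-01-20", "FAIL")],
    "example/b.exp": [("2023-11-01", "PASS"), ("2023-11-05", "FAIL"),
                      ("2024-01-20", "FAIL")],
    "example/c.exp": [("2024-01-01", "PASS"), ("2024-01-20", "PASS")],
    "example/d.exp": [("2024-01-10", "PASS"), ("2024-01-24", "FAIL")],
}

for when, test in freshest_failures(HISTORIES, skip={"example/d.exp"}):
    print(when, test)
```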


Have some way to annotate and indicate that a particular commit should fix a test

Often commits are made to fix specific failures in the testsuite.  It
would be nice if there was some way to communicate that information
back to Bunsen.  Bunsen could use that to flag whether the fix is
incomplete and the test is still failing in some situations.
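
To illustrate what Bunsen could do with such annotations, here is a
sketch that assumes each annotation maps a test to the position of the
fixing commit in the project history, and that each run records the
position of the commit it was built from.  Both the annotation format
and the run layout are hypothetical.

```python
# Hypothetical annotations: test -> sequence number of the commit
# that is supposed to fix it.
FIXES = {"example/a.exp": 120}

# Hypothetical runs: (commit_seq, environment, test, outcome).
RUNS = [
    (118, "fedora39-x86_64", "example/a.exp", "FAIL"),
    (121, "fedora39-x86_64", "example/a.exp", "PASS"),
    (121, "fedora39-aarch64", "example/a.exp", "FAIL"),
    (121, "fedora39-aarch64", "example/b.exp", "FAIL"),
]


def incomplete_fixes(fixes, runs):
    """Return sorted (test, environment) pairs that still fail in runs
    built at or after the commit annotated as fixing the test."""
    return sorted({
        (test, env)
        for seq, env, test, outcome in runs
        if test in fixes and seq >= fixes[test] and outcome == "FAIL"
    })


print(incomplete_fixes(FIXES, RUNS))
```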

-Will Cohen

