DejaGnu and toolchain testing

public inbox for gcc@gcc.gnu.org

* DejaGnu and toolchain testing
@ 2013-07-26  0:21 Joseph S. Myers
  2013-07-26 16:08 ` Rob Savoye
  0 siblings, 1 reply; 5+ messages in thread
From: Joseph S. Myers @ 2013-07-26  0:21 UTC
  To: gcc, rob.savoye

I was interested to watch the video of the DejaGnu BOF at the Cauldron.  A 
few issues with DejaGnu for toolchain testing that I've noted but I don't 
think were covered there include:

* DejaGnu has a lot of hardcoded logic to try to find various files in a 
toolchain build directory.  A lot of it is actually for very old toolchain 
versions (using GCC version 2 or older, for example).  The first issue 
with this is that it doesn't belong in DejaGnu: the toolchain should be 
free to rearrange its build directories without needing changes to DejaGnu 
itself (which in practice means there's lots of such logic in the 
toolchain's own testsuites *as well*, duplicating the DejaGnu code to a 
greater or lesser extent).  The second issue is that "make install" 
already knows where to find files in the build directory, and it would be 
better to move towards build-tree testing installing the toolchain in a 
staging directory and running tools from there, rather than needing any 
logic in the testsuites at all to enable bits of uninstalled tools to find 
other bits of uninstalled tools.  (There might still be a few bits like 
setting LD_LIBRARY_PATH required.  But the compiler command lines would be 
much simpler and much closer to how users actually use the compiler in 
practice.)

* Similarly, DejaGnu has hardcoded prune_warnings - and again GCC adds 
lots of its own prunes; it's not clear hardcoding this in DejaGnu is a 
particularly good idea either.

* Another piece of unfortunate hardcoding in DejaGnu is how remote-host 
testing uses "-o a.out" when running tools on the remote host - such a 
difference from how they are run on a local host results in lots of issues 
where a tool cares about the output file name in some way (e.g. to 
generate other output files).

* A key feature of QMTest that I like but I don't think got mentioned is 
that you can *statically enumerate the set of tests* without running them.  
That is, a testsuite has a well-defined set of tests, and that set does 
not depend on what the results of the tests are - whereas it's very easy 
and common for a DejaGnu test to have test names (the text after PASS: or 
FAIL: ) depending on whether the test passed or failed, or how the test 
passed or failed (no doubt the testsuite authors had reasons for doing 
this, but it conflicts with any automatic comparison of results).  The 
QMTest model isn't wonderfully well-matched to toolchain testing - in 
toolchain testing, you can typically do a single indivisible test 
execution (e.g. compiling a file), which produces results for a large 
number of test assertions (tests for warnings on particular lines of that 
file), and QMTest expects one indivisible test execution to produce one 
result.  But a model where a test can contain multiple assertions, and 
both tests and their assertions can be statically enumerated independent 
of their result, and where the results can be annotated by the testsuite 
(to deal with the purposes for which testsuites stick extra text on the 
PASS/FAIL line) certainly seems better than one that makes it likely the 
set of test assertions will vary in unpredictable ways.

* People in the BOF seemed happy with expect.  I think expect has caused 
quite a few problems for toolchain testing.  In particular, there are or 
have been too many places where expect likes to throw away input whose 
size exceeds some arbitrary limit and you need to hack around those by 
increasing the limits in some way.  GCC tests can generate and test for 
very large numbers of diagnostics from a single test, and some binutils 
tests can generate megabytes of output from a tool (that are then matched 
against regular expressions etc.).
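
To illustrate the point above about test names that depend on the result, 
here is a minimal Python sketch of an automatic comparison of two runs.  It 
assumes result lines of the form "STATUS: name"; the helper names 
(parse_sum, compare) and the test names in the usage note are made up for 
illustration and are not part of DejaGnu.

```python
import re

# Matches summary lines such as "PASS: some test name".
RESULT_RE = re.compile(
    r"^(PASS|FAIL|XPASS|XFAIL|UNRESOLVED|UNSUPPORTED|UNTESTED): (.*)$"
)


def parse_sum(text):
    """Return {test name: status} for every result line in the text."""
    results = {}
    for line in text.splitlines():
        m = RESULT_RE.match(line)
        if m:
            results[m.group(2)] = m.group(1)
    return results


def compare(before, after):
    """Split names into changed status, disappeared, and newly appeared."""
    common = before.keys() & after.keys()
    changed = {n: (before[n], after[n]) for n in common if before[n] != after[n]}
    missing = sorted(before.keys() - after.keys())
    new = sorted(after.keys() - before.keys())
    return changed, missing, new


# Hypothetical input: a.c keeps its name, b.c gets extra text on failure.
before = parse_sum("PASS: a.c (test for excess errors)\n"
                   "PASS: b.c execution test\n")
after = parse_sum("FAIL: a.c (test for excess errors)\n"
                  "FAIL: b.c execution test (timed out)\n")
changed, missing, new = compare(before, after)
```

With a stable name, the a.c regression shows up in "changed" as a PASS to 
FAIL transition.  The b.c regression instead appears as one test that 
vanished plus an unrelated new test, which is what makes such names awkward 
for automatic comparison.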

-- 
Joseph S. Myers
joseph@codesourcery.com


* Re: DejaGnu and toolchain testing
  2013-07-26  0:21 DejaGnu and toolchain testing Joseph S. Myers
@ 2013-07-26 16:08 ` Rob Savoye
  2013-07-26 16:37   ` Joseph S. Myers
  0 siblings, 1 reply; 5+ messages in thread
From: Rob Savoye @ 2013-07-26 16:08 UTC
  To: Joseph S. Myers; +Cc: gcc

On 07/25/2013 06:21 PM, Joseph S. Myers wrote:
> I was interested to watch the video of the DejaGnu BOF at the Cauldron.  A 
> few issues with DejaGnu for toolchain testing that I've noted but I don't 
> think were covered there include:
  Thanks for the thoughtful comments; they're useful as I start
considering refactoring DejaGnu to keep it working for the next 22
years... There is a lot of crusty old code in DejaGnu, I admit it.
DejaGnu was never truly designed; it was just built and debugged while
we were using it, and it shows.

  I'm not sure if this discussion is better on the GCC list or the
DejaGnu list, but I would like to keep this thread going. Of course, GCC
developers are the main users of DejaGnu anyway.
> * DejaGnu has a lot of hardcoded logic to try to find various files in a 
> toolchain build directory.  A lot of it is actually for very old toolchain 
> versions (using GCC version 2 or older, for example).  The first issue 
> with this is that it doesn't belong in DejaGnu: the toolchain should be 
> free to rearrange its build directories without needing changes to DejaGnu 
> itself (which in practice means there's lots of such logic in the 
> toolchain's own testsuites *as well*, duplicating the DejaGnu code to a 
> greater or lesser extent).  The second issue is that "make install" 
  DejaGnu is a testing framework, so it makes sense that much of the GCC
testing logic is in gcc/testsuite/{lib,config}. It was also a decision
at the time that having a testsuite override existing procs in DejaGnu
core was a good idea. Now, many years later, I think I'd move most of what
GCC needs into the core, especially all the "dg*" style of tests.

  At one time the thought was DejaGnu was a general purpose test
framework, but I think at this point in time, it's really just used for
toolchain testing. (although my Gnash project also uses it) So I think
tweaking DejaGnu core to be mainly toolchain testing oriented is
probably a good idea.
> * Similarly, DejaGnu has hardcoded prune_warnings - and again GCC adds 
> lots of its own prunes; it's not clear hardcoding this in DejaGnu is a 
> particularly good idea either.
  The DejaGnu pruning is older than GCC's. :-)
> * Another piece of unfortunate hardcoding in DejaGnu is how remote-host 
> testing uses "-o a.out" when running tools on the remote host - such a 
> difference from how they are run on a local host results in lots of issues 
  This is historical, a.out being common at the time.
> * A key feature of QMTest that I like but I don't think got mentioned is 
> that you can *statically enumerate the set of tests* without running them.  
> That is, a testsuite has a well-defined set of tests, and that set does 
> not depend on what the results of the tests are - whereas it's very easy 
> and common for a DejaGnu test to have test names (the text after PASS: or 
> FAIL: ) depending on whether the test passed or failed, or how the test 
> passed or failed (no doubt the testsuite authors had reasons for doing 
> this, but it conflicts with any automatic comparison of results).  The 
  One of my other ideas for DejaGnu 2.0 is improved test result output.
I'm currently importing all test results into a database (see the mysql
branch on Savannah), and I find parsing the text painful and short on
fine-grained detail. The text field for PASS/FAIL is overloaded. Since I
want to improve the ability to analyze results, i.e. comparing what
happens with differing configure or command line options, I think the
output format has to change. One thought is to add new fields only to
the --xml output, as that is database specific, and leave the current
text output unchanged.
> * People in the BOF seemed happy with expect.  I think expect has caused 
> quite a few problems for toolchain testing.  In particular, there are or 
  I don't think it was that we were happy with expect, but at least for
GDB testing, nobody has any alternatives. I thought I mentioned that a
refactored DejaGnu would only use expect for GDB testing, everything
else wouldn't require it. That also means all the remote execution procs
would need to work without expect as well.

    - rob -


* Re: DejaGnu and toolchain testing
  2013-07-26 16:08 ` Rob Savoye
@ 2013-07-26 16:37   ` Joseph S. Myers
  2013-07-26 17:01     ` Rob Savoye
  0 siblings, 1 reply; 5+ messages in thread
From: Joseph S. Myers @ 2013-07-26 16:37 UTC
  To: Rob Savoye; +Cc: gcc

On Fri, 26 Jul 2013, Rob Savoye wrote:

> > * DejaGnu has a lot of hardcoded logic to try to find various files in a 
> > toolchain build directory.  A lot of it is actually for very old toolchain 
> > versions (using GCC version 2 or older, for example).  The first issue 
> > with this is that it doesn't belong in DejaGnu: the toolchain should be 
> > free to rearrange its build directories without needing changes to DejaGnu 
> > itself (which in practice means there's lots of such logic in the 
> > toolchain's own testsuites *as well*, duplicating the DejaGnu code to a 
> > greater or lesser extent).  The second issue is that "make install" 
>   DejaGnu is a testing framework, so it makes sense that much of the GCC
> testing logic is in gcc/testsuite/{lib,config}. It was also a decision
> at the time that having a testsuite override existing procs in DejaGnu
> core was a good idea. Now, many years later, I think I'd move most of what
> GCC needs into the core, especially all the "dg*" style of tests.
> 
>   At one time the thought was DejaGnu was a general purpose test
> framework, but I think at this point in time, it's really just used for
> toolchain testing. (although my Gnash project also uses it) So I think
> tweaking DejaGnu core to be mainly toolchain testing oriented is
> probably a good idea.

Anything in the core needs to avoid obstructing toolchain changes.  People 
typically test with the installed DejaGnu from their OS, and the OS itself 
may well be a few years old (e.g. Ubuntu 10.04), so it's undesirable for 
an enhancement to the GCC testsuite to require a new version of DejaGnu.  
This means clean extensibility, and avoiding DejaGnu hardcoding things 
that are not stable public interfaces.

For example, it should be possible to completely rearrange the internal 
structure of a toolchain build directory without needing to change any 
external tools such as DejaGnu.  So all the information about the 
structure of that directory - how to use -B, -L etc. options to run an 
uninstalled compiler, if that's done at all - should be in the toolchain 
sources rather than in DejaGnu.  (As noted, I think really we shouldn't be 
testing uninstalled compilers at all - let "make install" be the single 
place that needs to know how to put the different bits of a build 
directory together.)

However, it makes sense for DejaGnu to include a parser for diagnostics 
that follow the GNU Coding Standards - that's a public interface.  But (a) 
there should be a better way of handling column numbers than the kludges 
in the GCC testsuite at present, and (b) extensibility for such a parser 
is still desirable, as new forms of diagnostic location may be added in 
future.
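
As a rough Python sketch of what such an extensible parser could look like, 
the following handles only the two simplest location forms, 
"file:line: message" and "file:line:column: message".  The names 
(Diagnostic, LOCATION_PATTERNS, parse_diagnostic) are hypothetical, and 
other forms such as line/column ranges or file names containing colons are 
deliberately left out.

```python
import re
from typing import NamedTuple, Optional


class Diagnostic(NamedTuple):
    file: str
    line: int
    column: Optional[int]  # None when the diagnostic has no column
    message: str


# Patterns are tried in order; a testsuite could pass its own extended list
# to support new location forms without changing the parser itself.
LOCATION_PATTERNS = [
    re.compile(r"^(?P<file>[^:\s]+):(?P<line>\d+):(?P<column>\d+): (?P<message>.*)$"),
    re.compile(r"^(?P<file>[^:\s]+):(?P<line>\d+): (?P<message>.*)$"),
]


def parse_diagnostic(text, patterns=LOCATION_PATTERNS):
    """Return a Diagnostic for one line of tool output, or None if no match."""
    for pattern in patterns:
        m = pattern.match(text)
        if m:
            groups = m.groupdict()
            column = groups.get("column")
            return Diagnostic(
                groups["file"],
                int(groups["line"]),
                int(column) if column is not None else None,
                groups["message"],
            )
    return None


with_column = parse_diagnostic("foo.c:12:5: warning: unused variable 'x'")
without_column = parse_diagnostic("foo.c:12: error: something went wrong")
```

Keeping the column as a separate optional field means a test can match on 
file and line alone, or on the column as well, without special-casing the 
message text.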

Right now, DejaGnu has lots of toolchain stuff in the core ... toolchain 
stuff for building Cygnus trees 20 years ago rather than what's useful 
now.  It's not that much better if a DejaGnu version released in 2013 and 
used for testing in 2017 has things in it that are good for testing 2013 
toolchains and get in the way for testing 2017 toolchains.

-- 
Joseph S. Myers
joseph@codesourcery.com


* Re: DejaGnu and toolchain testing
  2013-07-26 16:37   ` Joseph S. Myers
@ 2013-07-26 17:01     ` Rob Savoye
  2013-07-26 17:27       ` Joseph S. Myers
  0 siblings, 1 reply; 5+ messages in thread
From: Rob Savoye @ 2013-07-26 17:01 UTC
  To: Joseph S. Myers; +Cc: gcc

On 07/26/2013 10:37 AM, Joseph S. Myers wrote:
> Anything in the core needs to avoid obstructing toolchain changes.  People 
> typically test with the installed DejaGnu from their OS, and the OS itself 
> may well be a few years old (e.g. Ubuntu 10.04), so it's undesirable for 
> an enhancement to the GCC testsuite to require a new version of DejaGnu.  
> This means clean extensibility, and avoiding DejaGnu hardcoding things 
> that are not stable public interfaces.
  DejaGnu is basically stagnant because most people consider the pain of
any improvements too great to change anything. If I launch off on a
DejaGnu 2.0, my thought is the existing release wouldn't go away. Many
distributions ship multiple versions of some applications. Any changes
to DejaGnu would likely live in a branch for a long time, but would be
usable by anyone interested in better functionality. Yes, an actual
design and defining public interfaces would be a good idea. Currently
DejaGnu has many arbitrary APIs and settings, all created without a
whole lot of thought other than working around or fixing a problem.

  I also realize that any major changes to DejaGnu will require
corresponding changes in the testsuite support code. I'm completely
aware of how much work this would be, having written much of it... There
would have to be backward compatibility maintained for a considerable time.
> Right now, DejaGnu has lots of toolchain stuff in the core ... toolchain 
> stuff for building Cygnus trees 20 years ago rather than what's useful 
> now.  It's not that much better if a DejaGnu version released in 2013 and 
> used for testing in 2017 has things in it that are good for testing 2013 
> toolchains and get in the way for testing 2017 toolchains.
   I'd agree there is lots of crufty support for things like the old
Cygnus trees that could be removed. Ideally I'd prefer to explore
people's ideas on what would be useful for testing toolchains 5-10 years
from now. Me, I want something not dependent on a dying and mostly
unmaintained scripting language that nobody likes anyway (the current
working idea is to use python). I also want to be able to compare test
results in better ways than diffing huge text files. I'd like to compare
multiple test runs as well in a reasonably detailed fashion.

    - rob -


* Re: DejaGnu and toolchain testing
  2013-07-26 17:01     ` Rob Savoye
@ 2013-07-26 17:27       ` Joseph S. Myers
  0 siblings, 0 replies; 5+ messages in thread
From: Joseph S. Myers @ 2013-07-26 17:27 UTC
  To: Rob Savoye; +Cc: gcc

On Fri, 26 Jul 2013, Rob Savoye wrote:

>    I'd agree there is lots of crufty support for things like the old
> Cygnus trees that could be removed. Ideally I'd prefer to explore
> people's ideas on what would be useful for testing toolchains 5-10 years
> from now. Me, I want something not dependent on a dying and mostly
> unmaintained scripting language that nobody likes anyway (the current
> working idea is to use python). I also want to be able to compare test
> results in better ways than diffing huge text files. I'd like to compare
> multiple test runs as well in a reasonably detailed fashion.

* Eliminate build-tree testing.

* Look at QMTest's class structure - I don't think it's quite right as I 
explained regarding not separating the unit that gets run from the unit 
that has an assigned PASS/FAIL result, but it's closer than DejaGnu is at 
present, in particular as regards the ability to enumerate tests 
independently of running them (so, to look at the testsuite and a log of a 
partial run and see what tests were not run).  Another thing I don't 
really care for there is how it handles XFAILs (the QMTest approach has 
logical simplicity, but is not so good in practice for toolchain testing, 
I think, so I prefer tests actually having XPASS/XFAIL results as in 
DejaGnu).

* Structured results so that annotations can readily be associated with 
individual test results, and whole test runs.  Some annotations identify 
the test run in some way (configured target, configure options, ...).  The 
test's "name" might have multiple fields rather than being a pure text 
string as at present (file with test, options used, line on which 
assertion is being tested, for example).  And there would be other 
annotations such as compile command, output produced by compile command, 
... - much of this is presently in the .log file but not in a properly 
machine-processable form.  (However, I'd still like the format to be 
something simple that it's easy to generate from non-DejaGnu testsuites as 
well if desired.)

* Built-in test harness software support for parallelism, while allowing 
for cases of host or target boards not supporting parallelism (if host 
does but not target, you can still run compiles in parallel).
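
A minimal Python sketch of the structured-results idea, under stated 
assumptions: the test name is a record with separate fields, annotations 
are attached to each result, each result is written as one line of JSON, 
and the statically enumerated set of assertions is compared against a 
partial run.  All class, field and function names here are invented for 
illustration; this is not an existing DejaGnu or QMTest format.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class AssertionId:
    """Multi-field test name instead of a single free-text string."""
    file: str
    options: str
    line: int = 0  # 0 means the assertion is not tied to a source line


@dataclass
class Result:
    test: AssertionId
    status: str  # e.g. PASS, FAIL, XPASS, XFAIL
    annotations: dict = field(default_factory=dict)


def to_json_line(result):
    """Serialize one result as a single line of JSON."""
    return json.dumps(asdict(result), sort_keys=True)


def not_run(expected, results):
    """Return the enumerated assertions that have no result in this run."""
    seen = {r.test for r in results}
    return [t for t in expected if t not in seen]


# Hypothetical testsuite with two assertions, of which only one was run.
expected = [AssertionId("t.c", "-O2", 3), AssertionId("t.c", "-O2", 7)]
results = [Result(expected[0], "PASS", {"command": "cc -O2 -c t.c"})]
line = to_json_line(results[0])
skipped = not_run(expected, results)
```

One JSON object per line keeps the format simple enough to generate from 
non-DejaGnu testsuites, and because the identifier never includes the 
status, the same key can be used to line up results across runs.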

-- 
Joseph S. Myers
joseph@codesourcery.com
