* DejaGnu and toolchain testing
From: Joseph S. Myers @ 2013-07-26 0:21 UTC (permalink / raw)
To: gcc, rob.savoye
I was interested to watch the video of the DejaGnu BOF at the Cauldron. A
few issues with DejaGnu for toolchain testing that I've noted but I don't
think were covered there include:
* DejaGnu has a lot of hardcoded logic to try to find various files in a
toolchain build directory. A lot of it is actually for very old toolchain
versions (using GCC version 2 or older, for example). The first issue
with this is that it doesn't belong in DejaGnu: the toolchain should be
free to rearrange its build directories without needing changes to DejaGnu
itself (which in practice means there's lots of such logic in the
toolchain's own testsuites *as well*, duplicating the DejaGnu code to a
greater or lesser extent). The second issue is that "make install"
already knows where to find files in the build directory, and it would be
better to move towards build-tree testing installing the toolchain in a
staging directory and running tools from there, rather than needing any
logic in the testsuites at all to enable bits of uninstalled tools to find
other bits of uninstalled tools. (There might still be a few bits like
setting LD_LIBRARY_PATH required. But the compiler command lines would be
much simpler and much closer to how users actually use the compiler in
practice.)
* Similarly, DejaGnu has hardcoded prune_warnings - and again GCC adds
lots of its own prunes; it's not clear hardcoding this in DejaGnu is a
particularly good idea either.
* Another piece of unfortunate hardcoding in DejaGnu is how remote-host
testing uses "-o a.out" when running tools on the remote host - such a
difference from how they are run on a local host causes lots of issues
where a tool cares about the output file name in some way (e.g. to
generate other output files).
* A key feature of QMTest that I like but I don't think got mentioned is
that you can *statically enumerate the set of tests* without running them.
That is, a testsuite has a well-defined set of tests, and that set does
not depend on what the results of the tests are - whereas it's very easy
and common for a DejaGnu test to have test names (the text after PASS: or
FAIL: ) depending on whether the test passed or failed, or how the test
passed or failed (no doubt the testsuite authors had reasons for doing
this, but it conflicts with any automatic comparison of results). The
QMTest model isn't wonderfully well-matched to toolchain testing - in
toolchain testing, you can typically do a single indivisible test
execution (e.g. compiling a file), which produces results for a large
number of test assertions (tests for warnings on particular lines of that
file), and QMTest expects one indivisible test execution to produce one
result. But a model where a test can contain multiple assertions, and
both tests and their assertions can be statically enumerated independent
of their result, and where the results can be annotated by the testsuite
(to deal with the purposes for which testsuites stick extra text on the
PASS/FAIL line) certainly seems better than one that makes it likely the
set of test assertions will vary in unpredictable ways.
* People in the BOF seemed happy with expect. I think expect has caused
quite a few problems for toolchain testing. In particular, there are or
have been too many places where expect likes to throw away input whose
size exceeds some arbitrary limit and you need to hack around those by
increasing the limits in some way. GCC tests can generate and test for
very large numbers of diagnostics from a single test, and some binutils
tests can generate megabytes of output from a tool (that are then matched
against regular expressions etc.).
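
To make the enumeration point concrete, here is a minimal Python sketch of a
model in which a test is one indivisible execution carrying several
assertions, and both can be listed without running anything. The names
(ToolTest, Assertion, enumerate_assertions, record_results) and the sample
file are hypothetical; this is neither DejaGnu's nor QMTest's actual design.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Assertion:
    """One checkable claim about a single test execution."""
    name: str  # stable identifier, never altered by the outcome


@dataclass
class ToolTest:
    """One indivisible execution (e.g. compiling one file)."""
    name: str
    assertions: list = field(default_factory=list)


def enumerate_assertions(suite):
    """List every (test, assertion) name pair without running anything."""
    return [(t.name, a.name) for t in suite for a in t.assertions]


def record_results(test, outcomes, notes=None):
    """Attach outcomes to the fixed names; extra detail goes in an annotation."""
    notes = notes or {}
    return [
        {"test": test.name, "assertion": a.name,
         "status": outcomes[a.name], "annotation": notes.get(a.name, "")}
        for a in test.assertions
    ]


# Hypothetical suite: one compile, three assertions about its diagnostics.
suite = [
    ToolTest("example/warn-1.c", [Assertion("warning at line 5"),
                                  Assertion("warning at line 9"),
                                  Assertion("no excess errors")]),
]

planned = enumerate_assertions(suite)
results = record_results(
    suite[0],
    {"warning at line 5": "PASS", "warning at line 9": "FAIL",
     "no excess errors": "PASS"},
    notes={"warning at line 9": "diagnostic was reported on line 10"},
)
```

Because the names in `results` are exactly the names in `planned`, a log of a
partial run can be checked against the plan to see what was not run, and the
failure detail lives in the annotation rather than in the name.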
--
Joseph S. Myers
joseph@codesourcery.com
* Re: DejaGnu and toolchain testing
From: Rob Savoye @ 2013-07-26 16:08 UTC (permalink / raw)
To: Joseph S. Myers; +Cc: gcc
On 07/25/2013 06:21 PM, Joseph S. Myers wrote:
> I was interested to watch the video of the DejaGnu BOF at the Cauldron. A
> few issues with DejaGnu for toolchain testing that I've noted but I don't
> think were covered there include:
Thanks for the thoughtful comments, they're useful as I start
considering refactoring DejaGnu to keep it working for the next 22
years... There is a lot of crusty old code in DejaGnu, I admit it.
DejaGnu was never truly designed, it was just built and debugged while
we were using it, and it shows.
I'm not sure if this discussion is better on the GCC list or the
DejaGnu list, but I would like to keep this thread going. Of course, GCC
developers are the main users of DejaGnu anyway.
> * DejaGnu has a lot of hardcoded logic to try to find various files in a
> toolchain build directory. A lot of it is actually for very old toolchain
> versions (using GCC version 2 or older, for example). The first issue
> with this is that it doesn't belong in DejaGnu: the toolchain should be
> free to rearrange its build directories without needing changes to DejaGnu
> itself (which in practice means there's lots of such logic in the
> toolchain's own testsuites *as well*, duplicating the DejaGnu code to a
> greater or lesser extent). The second issue is that "make install"
DejaGnu is a testing framework, so it makes sense that much of the GCC
testing logic is in gcc/testsuite/{lib,config}. It was also a decision
at the time that having a testsuite override existing procs in DejaGnu
core was a good idea. Now many years later, I think I'd move most of what
GCC needs into the core, especially all the "dg*" style of tests.
At one time the thought was DejaGnu was a general purpose test
framework, but I think at this point in time, it's really just used for
toolchain testing (although my Gnash project also uses it). So I think
tweaking DejaGnu core to be mainly toolchain testing oriented is
probably a good idea.
> * Similarly, DejaGnu has hardcoded prune_warnings - and again GCC adds
> lots of its own prunes; it's not clear hardcoding this in DejaGnu is a
> particularly good idea either.
The DejaGnu pruning is older than GCC's. :-)
> * Another piece of unfortunate hardcoding in DejaGnu is how remote-host
> testing uses "-o a.out" when running tools on the remote host - such a
> difference from how they are run on a local host results in lots of issues
This is historical, a.out being common at the time.
> * A key feature of QMTest that I like but I don't think got mentioned is
> that you can *statically enumerate the set of tests* without running them.
> That is, a testsuite has a well-defined set of tests, and that set does
> not depend on what the results of the tests are - whereas it's very easy
> and common for a DejaGnu test to have test names (the text after PASS: or
> FAIL: ) depending on whether the test passed or failed, or how the test
> passed or failed (no doubt the testsuite authors had reasons for doing
> this, but it conflicts with any automatic comparison of results). The
One of my other ideas for DejaGnu 2.0 is improved test result output.
I'm currently importing all test results into a database (see the mysql
branch on savannah), and find text parsing painful and lacking more fine
grained details. The text field for PASS/FAIL is overloaded. Since I
want to improve the ability to analyze results, i.e. comparing what
happens with differing configure or command line options, I think the
output format has to change. One thought is to only add new fields into
the --xml output, as that is database specific, and leave the current
text output unchanged.
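
As a rough illustration of the text-parsing problem, the Python sketch below
splits lines in the style of a .sum file into a status and a name. The sample
lines are invented, and the regular expression covers only the common status
keywords; it is an assumption about typical output, not a description of the
code on the mysql branch.

```python
import re

# Status keywords that commonly start a result line in a .sum file.
RESULT_LINE = re.compile(
    r"^(?P<status>PASS|FAIL|XPASS|XFAIL|UNRESOLVED|UNSUPPORTED|UNTESTED): "
    r"(?P<name>.*)$"
)


def parse_sum(text):
    """Return (status, name) pairs for every result line in the text."""
    pairs = []
    for line in text.splitlines():
        match = RESULT_LINE.match(line)
        if match:
            pairs.append((match.group("status"), match.group("name")))
    return pairs


# Invented sample: the same check reported under two different names,
# because the failure detail was appended to the name text.
run_a = "PASS: example.c compile\n"
run_b = "FAIL: example.c compile (timeout)\n"

names_a = {name for _, name in parse_sum(run_a)}
names_b = {name for _, name in parse_sum(run_b)}
unmatched = names_a ^ names_b  # names present in only one of the runs
```

A name-based comparison sees two unrelated tests here rather than one test
that regressed, which is what I mean by the text field being overloaded.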
> * People in the BOF seemed happy with expect. I think expect has caused
> quite a few problems for toolchain testing. In particular, there are or
I don't think it was that we were happy with expect, but at least for
GDB testing, nobody has any alternatives. I thought I mentioned that a
refactored DejaGnu would only use expect for GDB testing, everything
else wouldn't require it. That also means all the remote execution procs
would need to work without expect as well.
- rob -
* Re: DejaGnu and toolchain testing
From: Joseph S. Myers @ 2013-07-26 16:37 UTC (permalink / raw)
To: Rob Savoye; +Cc: gcc
On Fri, 26 Jul 2013, Rob Savoye wrote:
> > * DejaGnu has a lot of hardcoded logic to try to find various files in a
> > toolchain build directory. A lot of it is actually for very old toolchain
> > versions (using GCC version 2 or older, for example). The first issue
> > with this is that it doesn't belong in DejaGnu: the toolchain should be
> > free to rearrange its build directories without needing changes to DejaGnu
> > itself (which in practice means there's lots of such logic in the
> > toolchain's own testsuites *as well*, duplicating the DejaGnu code to a
> > greater or lesser extent). The second issue is that "make install"
> DejaGnu is a testing framework, so it makes sense that much of the GCC
> testing logic is in gcc/testsuite/{lib,config}. It was also a decision
> at the time that having a testsuite override existing procs in DejaGnu
> core was a good idea. Now many years later, I think I'd move most of what
> GCC needs into the core, especially all the "dg*" style of tests.
>
> At one time the thought was DejaGnu was a general purpose test
> framework, but I think at this point in time, it's really just used for
> toolchain testing (although my Gnash project also uses it). So I think
> tweaking DejaGnu core to be mainly toolchain testing oriented is
> probably a good idea.
Anything in the core needs to avoid obstructing toolchain changes. People
typically test with the installed DejaGnu from their OS, and the OS itself
may well be a few years old (e.g. Ubuntu 10.04), so it's undesirable for
an enhancement to the GCC testsuite to require a new version of DejaGnu.
This means clean extensibility, and avoiding DejaGnu hardcoding things
that are not stable public interfaces.
For example, it should be possible to completely rearrange the internal
structure of a toolchain build directory without needing to change any
external tools such as DejaGnu. So all the information about the
structure of that directory - how to use -B, -L etc. options to run an
uninstalled compiler, if that's done at all - should be in the toolchain
sources rather than in DejaGnu. (As noted, I think really we shouldn't be
testing uninstalled compilers at all - let "make install" be the single
place that needs to know how to put the different bits of a build
directory together.)
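
A minimal sketch of the difference, in Python: with an installed toolchain
the harness needs only the staging prefix, whereas the uninstalled case needs
to know the build tree layout. The directory names, the driver names and the
helper functions are all hypothetical; -B and -L are the GCC options
mentioned above.

```python
import os


def installed_command(staging_prefix, source, output):
    """Command line for a compiler installed under a staging prefix."""
    driver = os.path.join(staging_prefix, "bin", "gcc")  # assumed driver name
    return [driver, source, "-o", output]


def installed_environment(staging_prefix, base_env=None):
    """Environment so that built programs find the staged shared libraries."""
    env = dict(os.environ if base_env is None else base_env)
    libdir = os.path.join(staging_prefix, "lib")  # assumed library directory
    old = env.get("LD_LIBRARY_PATH")
    env["LD_LIBRARY_PATH"] = libdir if not old else libdir + os.pathsep + old
    return env


def uninstalled_command(build_dir, source, output):
    """The same compile against a build tree (this layout is hypothetical)."""
    gcc_dir = os.path.join(build_dir, "gcc")
    lib_dir = os.path.join(build_dir, "target-libs")
    return [os.path.join(gcc_dir, "xgcc"), "-B" + gcc_dir + os.sep,
            "-L" + lib_dir, source, "-o", output]


staged = installed_command("/tmp/stage", "t.c", "t.exe")
env = installed_environment("/tmp/stage", base_env={})
```

Only uninstalled_command has to change when the build directory is
rearranged; the installed variant matches how users invoke the compiler.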
However, it makes sense for DejaGnu to include a parser for diagnostics
that follow the GNU Coding Standards - that's a public interface. But (a)
there should be a better way of handling column numbers than the kludges
in the GCC testsuite at present, and (b) extensibility for such a parser
is still desirable, as new forms of diagnostic location may be added in
future.
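
As an illustration of such a parser, the Python sketch below handles the
"file:line: message" and "file:line:column: message" forms and keeps the
pattern list open for extension. The function names and the sample
diagnostics are hypothetical, and it makes no attempt to cover every form the
GNU Coding Standards allow.

```python
import re

# Patterns are tried in order; each must define file, line and message groups.
_PATTERNS = [
    re.compile(r"^(?P<file>[^:\s]+):(?P<line>\d+):(?:(?P<column>\d+):)?"
               r"\s*(?P<message>.*)$"),
]


def register_pattern(pattern):
    """Add a new location form ahead of the existing ones."""
    _PATTERNS.insert(0, re.compile(pattern))


def parse_diagnostic(text):
    """Return a dict for one diagnostic line, or None if nothing matches."""
    for pattern in _PATTERNS:
        match = pattern.match(text)
        if match:
            fields = match.groupdict()
            # Convert numeric fields; a pattern without a column yields None.
            for key in ("line", "column"):
                value = fields.get(key)
                fields[key] = int(value) if value is not None else None
            return fields
    return None


with_column = parse_diagnostic("t.c:12:5: warning: unused variable 'x'")
without_column = parse_diagnostic("t.c:12: error: expected ';'")
```

A testsuite that needs a new location form can call register_pattern itself,
so it does not have to wait for a new DejaGnu release.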
Right now, DejaGnu has lots of toolchain stuff in the core ... toolchain
stuff for building Cygnus trees 20 years ago rather than what's useful
now. It's not that much better if a DejaGnu version released in 2013 and
used for testing in 2017 has things in it that are good for testing 2013
toolchains and get in the way for testing 2017 toolchains.
--
Joseph S. Myers
joseph@codesourcery.com
* Re: DejaGnu and toolchain testing
From: Rob Savoye @ 2013-07-26 17:01 UTC (permalink / raw)
To: Joseph S. Myers; +Cc: gcc
On 07/26/2013 10:37 AM, Joseph S. Myers wrote:
> Anything in the core needs to avoid obstructing toolchain changes. People
> typically test with the installed DejaGnu from their OS, and the OS itself
> may well be a few years old (e.g. Ubuntu 10.04), so it's undesirable for
> an enhancement to the GCC testsuite to require a new version of DejaGnu.
> This means clean extensibility, and avoiding DejaGnu hardcoding things
> that are not stable public interfaces.
DejaGnu is basically stagnant because most people consider the pain of
any improvements too great to change anything. If I launch off on a
DejaGnu 2.0, my thought is the existing release wouldn't go away. Many
distributions ship multiple versions of some applications. Any changes
to DejaGnu would likely live in a branch for a long time, but would be
usable by anyone interested in better functionality. Yes, an actual
design and defining public interfaces would be a good idea. Currently
DejaGnu has many arbitrary APIs and settings, all created without a
whole lot of thought other than working around or fixing a problem.
I also realize that any major changes to DejaGnu will require
corresponding changes in the testsuite support code. I'm completely
aware of how much work this would be, having written much of it... There
would have to be backward compatibility maintained for a considerable time.
> Right now, DejaGnu has lots of toolchain stuff in the core ... toolchain
> stuff for building Cygnus trees 20 years ago rather than what's useful
> now. It's not that much better if a DejaGnu version released in 2013 and
> used for testing in 2017 has things in it that are good for testing 2013
> toolchains and get in the way for testing 2017 toolchains.
I'd agree there is lots of crufty support for things like the old
Cygnus trees that could be removed. Ideally I'd prefer to explore
people's ideas on what would be useful for testing toolchains 5-10 years
from now. Me, I want something not dependent on a dying and mostly
unmaintained scripting language that nobody likes anyway (the current
working idea is to use python). I also want to be able to compare test
results in better ways than diffing huge text files. I'd like to compare
multiple test runs as well in a reasonably detailed fashion.
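
To sketch what I mean by comparing multiple runs, here is a small Python
example that reports only the tests whose status differs between runs. The
run labels, test names and the "MISSING" marker are invented for
illustration; it assumes results are already available as name-to-status
mappings.

```python
def compare_runs(runs):
    """Return tests whose status is not identical across all runs.

    `runs` maps a run label to a {test name: status} dict. A test absent
    from a run is reported as "MISSING" rather than silently skipped.
    """
    labels = list(runs)
    all_names = sorted(set().union(*(runs[label] for label in labels)))
    differences = {}
    for name in all_names:
        statuses = [runs[label].get(name, "MISSING") for label in labels]
        if len(set(statuses)) > 1:
            differences[name] = dict(zip(labels, statuses))
    return differences


# Invented results for three runs with different command line options.
runs = {
    "-O0": {"a.c compile": "PASS", "b.c compile": "PASS"},
    "-O2": {"a.c compile": "PASS", "b.c compile": "FAIL"},
    "-O2 -flto": {"a.c compile": "PASS"},
}
diff = compare_runs(runs)
```

This only works if test names stay the same from run to run, which is
another reason to keep failure details out of the name.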
- rob -
* Re: DejaGnu and toolchain testing
From: Joseph S. Myers @ 2013-07-26 17:27 UTC (permalink / raw)
To: Rob Savoye; +Cc: gcc
On Fri, 26 Jul 2013, Rob Savoye wrote:
> I'd agree there is lots of crufty support for things like the old
> Cygnus trees that could be removed. Ideally I'd prefer to explore
> people's ideas on what would be useful for testing toolchains 5-10 years
> from now. Me, I want something not dependent on a dying and mostly
> unmaintained scripting language that nobody likes anyway (the current
> working idea is to use python). I also want to be able to compare test
> results in better ways than diffing huge text files. I'd like to compare
> multiple test runs as well in a reasonably detailed fashion.
* Eliminate build-tree testing.
* Look at QMTest's class structure - I don't think it's quite right as I
explained regarding not separating the unit that gets run from the unit
that has an assigned PASS/FAIL result, but it's closer than DejaGnu is at
present, in particular as regards the ability to enumerate tests
independently of running them (so, to look at the testsuite and a log of a
partial run and see what tests were not run). Another thing I don't
really care for there is how it handles XFAILs (the QMTest approach has
logical simplicity, but is not so good in practice for toolchain testing,
I think, so I prefer tests actually having XPASS/XFAIL results as in
DejaGnu).
* Structured results so that annotations can readily be associated with
individual test results, and whole test runs. Some annotations identify
the test run in some way (configured target, configure options, ...). The
test's "name" might have multiple fields rather than being a pure text
string as at present (file with test, options used, line on which
assertion is being tested, for example). And there would be other
annotations such as compile command, output produced by compile command,
... - much of this is presently in the .log file but not in a properly
machine-processable form. (However, I'd still like the format to be
something simple that it's easy to generate from non-DejaGnu testsuites as
well if desired.)
* Built-in test harness software support for parallelism, while allowing
for cases of host or target boards not supporting parallelism (if host
does but not target, you can still run compiles in parallel).
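
A rough Python sketch of the structured-results idea: the test name becomes
several fields, annotations hang off both the run and each result, and the
on-disk form is one JSON object per line so that other testsuites could
produce it easily. All class, field and function names here are hypothetical,
as are the sample values; JSON lines is just one possible simple format.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class AssertionId:
    """Multi-field test name instead of one free-form string."""
    file: str
    options: str
    line: int


@dataclass
class Result:
    ident: AssertionId
    status: str  # PASS, FAIL, XPASS, XFAIL, ...
    annotations: dict = field(default_factory=dict)


def dump_run(run_annotations, results):
    """Serialise a run as JSON lines: one header record, then one per result."""
    lines = [json.dumps({"run": run_annotations}, sort_keys=True)]
    lines += [json.dumps(asdict(r), sort_keys=True) for r in results]
    return "\n".join(lines) + "\n"


def load_run(text):
    """Inverse of dump_run, returning (run annotations, list of result dicts)."""
    records = [json.loads(line) for line in text.splitlines() if line]
    return records[0]["run"], records[1:]


text = dump_run(
    {"target": "example-target", "configure": "--enable-example"},
    [Result(AssertionId("warn-1.c", "-O2", 9), "XFAIL",
            {"command": "gcc -O2 -c warn-1.c", "output": ""})],
)
header, loaded = load_run(text)
```

Comparison tools can then match results on the identifying fields and ignore
the annotations, which is not possible when everything is packed into one
PASS/FAIL line.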
--
Joseph S. Myers
joseph@codesourcery.com