Re: ACATS legal status cleared by FSF

public inbox for gcc@gcc.gnu.org

* Re: ACATS legal status cleared by FSF
@ 2001-12-09 15:06 dewar
  2001-12-09 15:55 ` Joseph S. Myers
  0 siblings, 1 reply; 30+ messages in thread
From: dewar @ 2001-12-09 15:06 UTC (permalink / raw)
  To: guerby, zack; +Cc: dewar, gcc, kenner, mrs

<<BTW is there any record of the existing noncompile testsuite catching
problems, or did it just prevent any serious error message work by
scaring people?
>>

Well sure, there have been some cases in which a subtle change to the
front end caused an error message to be lost, but it is infrequent. 
And for sure it has not stopped ACT from doing serious error message
work, which has always been a focus for us (getting the best possible
error messages), but we often have cases where we make a simple
improvement in an error message, and 90% of the work is checking very
carefully through the B-test baselines to ensure that the baselines
should indeed be updated and nothing has slipped by.

It is certainly not possible to provide a comprehensive test suite for
use with the gcc tree. That's because the most valuable tests from
the test suites in use are the tests in the Compaq and ACT test suites.
None of the Compaq suite (called "DEC test suite" informally) are
available for use, due to licensing restrictions, and the great majority
of the ACT tests are not available, since they are proprietary customer
code.

So what we are doing at the gcc site is to put a subset of high value
tests that are worth the effort of running. What I am saying is that the
C tests meet this criterion, but the B tests don't.

Yes occasionally, a change that someone makes to the system will break a
B test, so what? Much more often it will be the case if people make changes
to the front end that they break one of the tests in the DEC test suite
or ACT test suite. In either case, we here at ACT have to figure out how
to repair the problem, and I would guess that the B tests issues will play
a very minor role.

If someone sees a misspelling in an error message, I am happy for them to
just fix it, and do not want to inhibit such a change just because of the
effort of updating the B tests. Indeed if someone does update the B test
baseline, it would be more work for us to check that they had done this
update correctly than to do it ourselves, and that careful check would be
required in any case. Remember that the B test baseline is an artifact that
is maintained not for testing purposes primarily, but for validation 
purposes, something we are not interested in for the gcc version per se.

Now of course there are changes to error messages that require a huge
amount of work in all test suites. A good example is an enhancement
request we have logged that suggests a nicer treatment of continuation
messages, so that a multi-line message is obviously a multi-line message
rather than separate messages. This is a fairly simple patch that could
be done in half an hour, but the consequences to the base lines of all
test suites would be ferocious, so that is why this change is still on the
list (there are lots of other things on the list). In fact perhaps we can
publish at least some of our list of suggested enhancements so that people
can try to do some of them :-)

I actually think that by far the most valuable addition to the C tests would
be to add some of the tests from the ACT test suite that ACT wrote, and that
are therefore potentially available. As soon as we have the tree issues
fully under control (most notably the documentation is still a real issue),
we will send some of these tests along.

Robert Dewar

And P.S. we are certainly not suggesting hiding the B tests, we are just
suggesting not worrying about them too much :-)

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ACATS legal status cleared by FSF
  2001-12-09 15:06 ACATS legal status cleared by FSF dewar
@ 2001-12-09 15:55 ` Joseph S. Myers
  0 siblings, 0 replies; 30+ messages in thread
From: Joseph S. Myers @ 2001-12-09 15:55 UTC (permalink / raw)
  To: dewar; +Cc: gcc

On Sun, 9 Dec 2001 dewar@gnat.com wrote:

> I actually think that by far the most valuable addition to the C tests would
> be to add some of the tests from the ACT test suite that ACT wrote, and that
> are therefore potentially available. As soon as we have the tree issues
> fully under control (most notably the documentation is still a real issue),
> we will send some of these tests along.

Also, once we have a test harness for ordinary Ada tests similar to the
harnesses for other languages, contributing a test case for every change
ACT make unless there's some reason you can't (e.g. the harness doesn't
support it, or attempts to create a nonconfidential test case for a
problem shown up in confidential code fail), following the normal
requirements for contributing to GCC.  This way a useful test suite can
continue to be built up over time.

(This applies both to front end changes, and to changes to the rest of GCC
to fix problems shown up by Ada test cases.)

-- 
Joseph S. Myers
jsm28@cam.ac.uk

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ACATS legal status cleared by FSF
  2001-12-09 14:52   ` guerby
@ 2001-12-09 19:47     ` Geert Bosch
  0 siblings, 0 replies; 30+ messages in thread
From: Geert Bosch @ 2001-12-09 19:47 UTC (permalink / raw)
  To: guerby; +Cc: zack, dewar, kenner, mrs, gcc

On Sun, 9 Dec 2001 guerby@acm.org wrote:
  BTW is there any record of the existing noncompile testsuite catching
  problems, or did it just prevent any serious error message work by
  scaring people?

It has not really been useful in catching problems in error messages.
One of the reasons is that most error messages emitted by the compiler
are not triggered at all by the ACATS test. For example, warnings are 
not tested at all and turned off, since the standard doesn't say anything 
about warnings. Also we have for many years collected all reports from 
users who were confused by error messages, and we have tried to improve 
the messages. Exactly these messages and warnings are hardest to get right
and are the ones that do not get tested by ACATS.

In comparison with the ACATS test, regression tests of actual code with 
actual errors that people have made have been far more helpful as tests.
Most advanced error handling and recovery (look for example at the
case of detecting an "is" token replaced by a ";"), is directly
inspired by repeating reports from users. The whole philosophy
behind the error messages in GNAT is not so much to tie them to the 
letter of the RM, but to evaluate their usefulness on actual reports
from users.

This is the reason you won't find many direct quotations of
RM rules that have been violated, but rather a less formal message,
which may be a simplification which is clearer for an actual user
(like saying "type T" instead of "the type whose first named subtype is T").
I would encourage anybody (but especially those who have not memorized
the Ada Reference Manual) who finds that GNAT emits an unclear error message,
where it clearly could have done better, to file a low priority 
bug report/enhancement request (with complete test case of course)
in the GCC bug database. This kind of reports/tests is what is needed
to improve on error messages. If there is not too high a threshold
in the form of mandatory updating of 50 arcane B test files, somebody
might actually want to improve the message.

  -Geert

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ACATS legal status cleared by FSF
@ 2001-12-09 19:03 dewar
  0 siblings, 0 replies; 30+ messages in thread
From: dewar @ 2001-12-09 19:03 UTC (permalink / raw)
  To: dewar, jsm28; +Cc: gcc

<<Also, once we have a test harness for ordinary Ada tests similar to the
harnesses for other languages, contributing a test case for every change
ACT make unless there's some reason you can't (e.g. the harness doesn't
support it, or attempts to create a nonconfidential test case for a
problem shown up in confidential code fail), following the normal
requirements for contributing to GCC.  This way a useful test suite can
continue to be built up over time.
>>

Yes, naturally we will try to do this, but it is often not at all easy
to create stripped down test cases, and in such cases, it is better to have
the fix in the gcc tree with no test case, than to simply withhold the fix.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ACATS legal status cleared by FSF
  2001-12-09 13:02 ` Zack Weinberg
@ 2001-12-09 14:52   ` guerby
  2001-12-09 19:47     ` Geert Bosch
  0 siblings, 1 reply; 30+ messages in thread
From: guerby @ 2001-12-09 14:52 UTC (permalink / raw)
  To: zack; +Cc: dewar, kenner, mrs, gcc

> I think this may be the crux of the difference between the B tests and
> the existing "noncompile" tests for gcc and g++.  We - all the people
> arguing for inclusion of noncompile tests - are used to a context
> where it is easy to automate verification that diagnostics are
> correctly issued.

We here = everyone, I know of no one that posted and is against the
inclusion of B tests. And yes "it is easy to automate" for ACATS too,
but once you have done (and/or adapted from ACT) the baselining
(proper GCC specific marking and splitting) for 11886 errors in 1525
files. The ACATS writers of course did marking, but targeted at no
specific compiler, and that means not for the current GCC in
particular. This work (marking at the right place and splitting so the
compiler doesn't abort in the middle) has been done as part of growing
the noncompile section of the GCC testsuite, but that's not zero sum
work.

Once you have that done for ACATS, when you improve or change error
recovery or introduce new messages, or change the place where message
are emitted, you're faced with updating the baseline, and on ACATS,
that probably means something in the hundreds of baseline updates in
addition to your 20 lines frontend patch. Not undoable of course (all
Ada vendors are doing it), but it will not necessarily mean that GCC
Ada error messages will improve because of it. 

BTW is there any record of the existing noncompile testsuite catching
problems, or did it just prevent any serious error message work by
scaring people?

And the main point: we need a volunteer.

> I'd like to see context - is the ACATS validation suite available
> online somewhere I can go look at it?

See my previous posts, at least:

<http://gcc.gnu.org/ml/gcc/2001-12/msg00357.html>

PS: I'm behind schedule for the execute test tarball, maybe during
this week or next weekend.

-- 
Laurent Guerby <guerby@acm.org>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ACATS legal status cleared by FSF
@ 2001-12-09 14:00 dewar
  0 siblings, 0 replies; 30+ messages in thread
From: dewar @ 2001-12-09 14:00 UTC (permalink / raw)
  To: dewar, zack; +Cc: gcc, kenner, mrs

<<I think this may be the crux of the difference between the B tests and
the existing "noncompile" tests for gcc and g++.  We - all the people
arguing for inclusion of noncompile tests - are used to a context
where it is easy to automate verification that diagnostics are
correctly issued.  When diagnostics do legitimately change, the test
harness has to be adjusted, but this is straightforward, easily done
by the person who changed the diagnostics.
>>

I don't think the noncompile tests for g++ are anything like as
comprehensive as the B tests in ACATS, which go out of their way
to test every marginal condition in the ARM. The history of these
tests is that they are done sentence by sentence against the RM,
probably there are 50,000 separate tests in all or something like
that, since many tests contain dozens of errors, and there are
thousands of tests. Furthermore, the tests were specifically
designed to check marginal cases (boundary condition testing
was the philosophy of the ACVC tests in the first place). A
consequence is that when a message changes, or disappears, or
moves to a different location, it often takes quite a bit of
expertise in the detailed semantics of Ada at the RM level to
determine whether the change is legitimate. Note that an incorrect
change to the base line, which might be of little consequence in
the g++ case, can be a serious bug in the Ada case, since it
could cause a failed validation in the future, so these baseline
changes have to be done with extreme care.

I certainly think we should upload the B tests, and we have no
problem submitting the current baselines, and people are welcome
to see whether changes they make make a difference, but I think
it is an unnecessary burden on people to require that these tests
be run, and certainly an unnecessary burden to require that the
baselines be updated.

I think the concern here is the following. The question of whether
to require/recommend/suggest that the Ada test suite be run as part
of major/minor gcc bugfix/newfeature modifications is one that needs
discussing, but I (and others familiar with the B tests) feel that
it is far better to encourage people to run the C tests, which are
likely to be far more useful, as well as executable tests that we will
provide to supplement these tests, and have more people doing this,
than having fewer people run the more onerous B tests. And certainly
the L tests should be abandoned as per previous discussion of the
subject.

I will ask Gail Schenker to provide Laurent with the current B test
baselines, and then he can do with them as he sees fit.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ACATS legal status cleared by FSF
  2001-12-07 19:12 dewar
@ 2001-12-09 13:02 ` Zack Weinberg
  2001-12-09 14:52   ` guerby
  0 siblings, 1 reply; 30+ messages in thread
From: Zack Weinberg @ 2001-12-09 13:02 UTC (permalink / raw)
  To: dewar; +Cc: kenner, mrs, gcc

On Fri, Dec 07, 2001 at 09:56:50PM -0500, dewar@gnat.com wrote:
> Note that just *reading* B tests to see if the output is correct is
> a very difficult task, one that only someone with quite a bit of
> ACVC/ACATS validation experience can do. A formal validation run
> using these tests often involves several days of painstaking manual
> work by someone who is an expert in the B tests to assure
> compliance.

I think this may be the crux of the difference between the B tests and
the existing "noncompile" tests for gcc and g++.  We - all the people
arguing for inclusion of noncompile tests - are used to a context
where it is easy to automate verification that diagnostics are
correctly issued.  When diagnostics do legitimately change, the test
harness has to be adjusted, but this is straightforward, easily done
by the person who changed the diagnostics.

You're saying that the B tests are nothing like that, and we are
finding that hard to believe.  I'd like to see context - is the ACATS
validation suite available online somewhere I can go look at it?

zw

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ACATS legal status cleared by FSF
@ 2001-12-07 19:12 dewar
  2001-12-09 13:02 ` Zack Weinberg
  0 siblings, 1 reply; 30+ messages in thread
From: dewar @ 2001-12-07 19:12 UTC (permalink / raw)
  To: dewar, kenner, mrs, zack; +Cc: gcc

Note that just *reading* B tests to see if the output is correct is a
very difficult task, one that only someone with quite a bit of ACVC/ACATS
validation experience can do. A formal validation run using these tests
often involves several days of painstaking manual work by someone who
is an expert in the B tests to assure compliance.

Consider various possibilities:

1. You must not make changes to the compiler that change the B tests baselines.

     This is far too restrictive, it would forbid even fixing a spelling error.
     In practice fixing a bug in one part of the compiler can often change
     an error message elsewhere in one of the B tests (by implementing
     better error recovery). Most often such changes are for the better.

2. If the B test baseline changes, verify the baseline change and adjust the
   baseline.

     This would be fine, except that the verification process typically
     requires someone with very good expertise in Ada semantics, AND
     very good familiarity with the ACATS test suite. There are not
     many such people in the world.

A huge effort from any Ada vendor goes into making sure that the compiler
passes all the B tests. Many have actively questioned the value of this
effort in the past, but if you want to formally validate you have no 
choice. But this does not necessarily mean that these are well chosen
tests from the point of view of our use here at gnu.org.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ACATS legal status cleared by FSF
@ 2001-12-07 18:57 dewar
  0 siblings, 0 replies; 30+ messages in thread
From: dewar @ 2001-12-07 18:57 UTC (permalink / raw)
  To: dewar, kenner, mrs, zack; +Cc: gcc

<<If your compiler randomly changes around where messages come out all
the time, maybe you should re-engineer it from the top down, fix all
of them to be correct, once, fix the testcases in the testsuite to
conform to it, and then refuse any changes to this status quo for 5-10
years, and then after 5 years, put in all the changes en masse you would
like, redo the testsuite, lather, rinse, repeat.
>>

It is not at all a matter of "randomly changing around where messages
come out all the time". Rather it is a matter of constantly improving
the messages (something that would be welcome in all compilers :-)
and each time such improvement occurs, it can discombobulate the baselines,
and require fairly painstaking adjustments. Of course we have to do these
adjustments at ACT, but the real point is that it would be a mistake
to make the B tests a barrier to development. The B tests of Ada are
really quite unlike any other test suite I have seen for any other language,
so I would definitely advise becoming thoroughly familiar with these tests
before being too sure you know the answers.

Notice the pattern here. Among those who are familiar with the ACATS B
tests there is a consensus that it is not obvious that they are of value
in our context. It is those who do not know the suite who are sure they
must be of great value :-)

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ACATS legal status cleared by FSF
@ 2001-12-07 18:50 mike stump
  0 siblings, 0 replies; 30+ messages in thread
From: mike stump @ 2001-12-07 18:50 UTC (permalink / raw)
  To: dewar, kenner, zack; +Cc: gcc

> From: dewar@gnat.com
> To: kenner@vlsi1.ultra.nyu.edu, zack@codesourcery.com
> Cc: gcc@gcc.gnu.org
> Date: Fri,  7 Dec 2001 20:39:54 -0500 (EST)

> <<One possibility is to run the B tests in a non-conventional way, where all
> the test harness does is to check for the presence of at least one
> error line for each line marked ERROR.  I don't know how hard such a
> harness is to write, and that's not the way B tests are usually done, but
> might work.
> >>

> That's not good enough, the errors often do not occur on exactly the correct
> lines.

We have years of experience with such a scheme for C++, it works, it
is useful, it isn't a maintenance burden.  I find the value of it
easily outweighs the maintenance cost of it.

If your compiler randomly changes around where messages come out all
the time, maybe you should re-engineer it from the top down, fix all
of them to be correct, once, fix the testcases in the testsuite to
conform to it, and then refuse any changes to this status quo for 5-10
years, and then after 5 years, put in all the changes en masse you would
like, redo the testsuite, lather, rinse, repeat.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ACATS legal status cleared by FSF
@ 2001-12-07 17:59 dewar
  0 siblings, 0 replies; 30+ messages in thread
From: dewar @ 2001-12-07 17:59 UTC (permalink / raw)
  To: kenner, zack; +Cc: gcc

<<One possibility is to run the B tests in a non-conventional way, where all
the test harness does is to check for the presence of at least one
error line for each line marked ERROR.  I don't know how hard such a
harness is to write, and that's not the way B tests are usually done, but
might work.
>>

That's not good enough, the errors often do not occur on exactly the correct
lines.
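
To make the objection concrete, here is a minimal sketch of a line matcher,
not anything taken from an actual ACATS harness. The function name and the
`slack` parameter are hypothetical. With exact matching (slack of zero), a
diagnostic reported one line away from the marked line counts as missing.

```python
def unmatched_marks(marked, reported, slack=0):
    """Return marked line numbers with no reported diagnostic nearby.

    marked   -- line numbers carrying an expected-error marker
    reported -- line numbers on which the compiler reported an error
    slack    -- hypothetical tolerance, in lines, around each marked line
    """
    return [m for m in sorted(marked)
            if not any(abs(m - r) <= slack for r in reported)]


# The error marked on line 10 is reported on line 11 instead.
marked = {10, 20}
reported = {11, 20}
exact = unmatched_marks(marked, reported)            # line 10 looks missed
loose = unmatched_marks(marked, reported, slack=1)   # nothing looks missed
```

Widening the window hides the mismatch, but it also weakens the check, since
an unrelated diagnostic on a neighbouring line would then satisfy the marker.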

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ACATS legal status cleared by FSF
@ 2001-12-07  3:18 Richard Kenner
  0 siblings, 0 replies; 30+ messages in thread
From: Richard Kenner @ 2001-12-07  3:18 UTC (permalink / raw)
  To: zack; +Cc: gcc

    This is all fair.  How about a compromise position where you check the
    B tests into the CVS tree but don't bother adding dejagnu harnesses
    for them.  They are then instantly available if they are needed.

The issue is the *baselines*, not either the B tests themselves or the
test harness.  They must be checked in *and* actively maintained if
they are to be "instantly available".

One possibility is to run the B tests in a non-conventional way, where all
the test harness does is to check for the presence of at least one
error line for each line marked ERROR.  I don't know how hard such a
harness is to write, and that's not the way B tests are usually done, but
might work.

That would just leave the issue of splits, not baselines, and those change
less often.
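
A rough sketch of such a check follows. It is only an illustration under
stated assumptions: that expected errors are flagged by a trailing Ada
comment of the form "-- ERROR", and that each diagnostic is printed as
"file:line:col: message". The function names are hypothetical.

```python
import re

# Assumption: a trailing Ada comment starting with ERROR marks a line
# that must be diagnosed.
ERROR_MARK = re.compile(r"--\s*ERROR\b")
# Assumption: diagnostics look like "file:line:col: message" (col optional).
DIAGNOSTIC = re.compile(r"^[^:\s]+:(\d+):(?:\d+:)?\s")


def marked_lines(source_text):
    """Line numbers (1-based) of source lines carrying an ERROR marker."""
    return {n for n, line in enumerate(source_text.splitlines(), start=1)
            if ERROR_MARK.search(line)}


def diagnosed_lines(compiler_output):
    """Line numbers mentioned in the compiler's diagnostics."""
    found = set()
    for out in compiler_output.splitlines():
        match = DIAGNOSTIC.match(out)
        if match:
            found.add(int(match.group(1)))
    return found


def missing_errors(source_text, compiler_output):
    """Marked lines for which no diagnostic at all was reported."""
    return sorted(marked_lines(source_text) - diagnosed_lines(compiler_output))


source = (
    'procedure P is\n'
    '   X : Integer := "a";  -- ERROR: type mismatch\n'
    'begin\n'
    '   null  -- ERROR: missing semicolon\n'
    'end P;\n'
)
output = "p.adb:2:19: expected type Integer\n"
missing = missing_errors(source, output)
```

This checks only that something was reported on each marked line. It does
not look at the wording of the message, which is what keeps baselines out
of the picture.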

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ACATS legal status cleared by FSF
@ 2001-12-06 19:09 dewar
  0 siblings, 0 replies; 30+ messages in thread
From: dewar @ 2001-12-06 19:09 UTC (permalink / raw)
  To: bosch, zack; +Cc: gcc, guerby

<<My experience working on the C front end has been that tests for proper
diagnostics from ill-formed code are extremely useful.  Can you explain
why they're almost useless for the Ada front end?
>>

It's just the experience that I mentioned, that most of the work is
fiddling for perfectly acceptable correct changes in messages.

<<This is all fair.  How about a compromise position where you check the B
tests into the CVS tree but don't bother adding dejagnu harnesses for them.
They are then instantly available if they are needed.
>>

Certainly they are not secret. And we can certainly check them in. The
question is whether to set up the harness with the baselines or not.
(evaluating B test runs without the baselines is an overwhelming task).

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ACATS legal status cleared by FSF
  2001-12-06 15:41         ` Geert Bosch
@ 2001-12-06 18:22           ` Zack Weinberg
  0 siblings, 0 replies; 30+ messages in thread
From: Zack Weinberg @ 2001-12-06 18:22 UTC (permalink / raw)
  To: Geert Bosch; +Cc: guerby, gcc

On Thu, Dec 06, 2001 at 06:40:27PM -0500, Geert Bosch wrote:
> 
> Yes, I'm confident it does too. That is why I am in favor of adding
> these tests and running them. Of all the Ada cases you're talking about,
> there has not even been one that was related to a B test. They are
> completely useless for testing the backend and almost useless for 
> the front end.

My experience working on the C front end has been that tests for proper
diagnostics from ill-formed code are extremely useful.  Can you explain
why they're almost useless for the Ada front end?

> I think it is important to realize, that it is easy enough to go and
> add B tests later on, if you find they would have caught problems that
> went by unnoticed. The tests are available and can be added at any time.
> Paying a high upfront price to prevent this scenario that the Ada
> maintainers find unlikely to be a problem, seems not a good idea to me.
> The only thing it will do is hinder development.

This is all fair.  How about a compromise position where you check the B
tests into the CVS tree but don't bother adding dejagnu harnesses for them.
They are then instantly available if they are needed.

zw

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ACATS legal status cleared by FSF
@ 2001-12-06 17:38 dewar
  0 siblings, 0 replies; 30+ messages in thread
From: dewar @ 2001-12-06 17:38 UTC (permalink / raw)
  To: jsm28, kenner; +Cc: gcc

<<Well, the ACATS tests do not check code quality, but it's correct that the B
tests verify that each condition that must produce an error message does so.
And I agree that this is a worthwhile test to run.
>>

The important thing to understand is that I would say only one percent of
changes to B test output are actual errors. The remaining 99% are a result
of minor changes in error message wording, formatting, or production.

I agree with Geert that for now it is a bad idea to include the B tests.

<<It is certainly valuable to have the B tests *around* for those cases when
having a run might be useful, but requiring them as a condition for checkins
doesn't make any sense at all for changes other than to the Ada front end
(since these tests mostly don't even get out of the front end since *all* of
them have errors) and is of only marginal value for changes to the Ada front
end.
>>

I would say marginal here = not worth while

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ACATS legal status cleared by FSF
  2001-12-06 15:10       ` Zack Weinberg
@ 2001-12-06 15:41         ` Geert Bosch
  2001-12-06 18:22           ` Zack Weinberg
  0 siblings, 1 reply; 30+ messages in thread
From: Geert Bosch @ 2001-12-06 15:41 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: guerby, gcc

On Thursday, December 6, 2001, at 06:08 , Zack Weinberg wrote:
> I'm 100% confident that there is value to having "make check" drive at
> least some set of Ada tests.  Ever since I've been a member of the
> project I've been seeing patches to the back end go by with a note
> "this code is only used by Ada" or "test case is in Ada" (with the
> implication that writing a C testcase is impossible or at least too
> much work).  How much back end logic is that, that the current test
> suite doesn't even touch?

Yes, I'm confident it does too. That is why I am in favor of adding
these tests and running them. Of all the Ada cases you're talking about,
there has not even been one that was related to a B test. They are
completely useless for testing the backend and almost useless for 
the front end.

I think it is important to realize, that it is easy enough to go and
add B tests later on, if you find they would have caught problems that
went by unnoticed. The tests are available and can be added at any time.
Paying a high upfront price to prevent this scenario that the Ada
maintainers find unlikely to be a problem, seems not a good idea to me.
The only thing it will do is hinder development.

   -Geert

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ACATS legal status cleared by FSF
@ 2001-12-06 15:40 Richard Kenner
  0 siblings, 0 replies; 30+ messages in thread
From: Richard Kenner @ 2001-12-06 15:40 UTC (permalink / raw)
  To: zack; +Cc: gcc

    I'm 100% confident that there is value to having "make check" drive at
    least some set of Ada tests.  

I don't think *anybody* disagrees with that.  The question is what should
that subset be?  My experience with backend changes is that most of the
failures are in C3, CD, CXA, and CXG, with a smaller number in C4.  For
changes to other than the Ada front-end, the benefit of running additional
chapters is, in my opinion, small.  I can't remember a time when a backend
change caused an ACATS failure that didn't show up in one of those chapters.
That being said, I should also point out that the ACATS suite isn't a very
good test of the back-end at all, but perhaps running it with different
optimization levels will help.
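
As an illustration only, picking out such a subset could be as simple as
filtering test names by chapter prefix. The sketch below assumes that a
test's file name starts with its chapter code; the file names in the example
are made up, and the function name is hypothetical.

```python
# Chapters named above as catching most back-end failures.
DEFAULT_CHAPTERS = ("c3", "c4", "cd", "cxa", "cxg")


def select_tests(names, chapters=DEFAULT_CHAPTERS):
    """Keep the tests whose name starts with one of the chapter prefixes."""
    prefixes = tuple(c.lower() for c in chapters)
    return [n for n in names if n.lower().startswith(prefixes)]


# Made-up file names, for illustration.
candidates = ["c34001a.ada", "b22001a.ada", "cxg2001.a",
              "c95008a.ada", "CD1009A.ADA"]
subset = select_tests(candidates)
```

The prefix list is a parameter so that a different subset can be tried
without touching the filter itself.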

    (with the implication that writing a C testcase is impossible or at
    least too much work).  

Usually impossible.  The issue is trees that can't be made in C, such as with
PLACEHOLDER_EXPR.

    It may make sense not to run the ACATS B tests by default, but they
    should at least be _present_ in the repository so that everyone is on
    an equal footing for changes that affect error messages.

I think everybody agrees with that too, but the question is what does
"present" mean with respect to the baselines, which is where the real
maintenance issue is.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ACATS legal status cleared by FSF
  2001-12-06 14:24     ` Geert Bosch
  2001-12-06 14:32       ` Joseph S. Myers
@ 2001-12-06 15:10       ` Zack Weinberg
  2001-12-06 15:41         ` Geert Bosch
  1 sibling, 1 reply; 30+ messages in thread
From: Zack Weinberg @ 2001-12-06 15:10 UTC (permalink / raw)
  To: Geert Bosch; +Cc: guerby, gcc

On Thu, Dec 06, 2001 at 05:12:14PM -0500, Geert Bosch wrote:
> 
> It is virtually impossible for people to "break" these tests, which
> is why I say they are of no value. Even if people *do* manage to
> break them (in the hypothetical case that the maintainers would not
> catch the error before approving), this will not go unnoticed for a
> long time anyway. In the mean time, the *only* programs affected are
> programs with fatal errors to start with.
> 
> I would be surprised if there would even be consensus to run just
> the executable ACATS tests as part of make check, since this would
> already double testing time for all contributors. Adding testing
> requirements is not free, and there needs to be a benefit to it.
> 
> For most of the executable ACATS tests I think there is a good
> benefit/cost ratio for the front end, and even for the back end. For
> that reason I am happy to see Laurent doing the work to get them
> integrated. For the B tests, adding testing is of near-zero value at
> a high cost.
> 
> Zack, I'd like to see very good reasons why you think it is
> reasonable to significantly increase the required volunteer time to
> make GNAT changes and that way hinder development and maintenance.

I'm 100% confident that there is value to having "make check" drive at
least some set of Ada tests.  Ever since I've been a member of the
project I've been seeing patches to the back end go by with a note
"this code is only used by Ada" or "test case is in Ada" (with the
implication that writing a C testcase is impossible or at least too
much work).  How much back end logic is that, that the current test
suite doesn't even touch?

It may make sense not to run the ACATS B tests by default, but they
should at least be _present_ in the repository so that everyone is on
an equal footing for changes that affect error messages.

zw

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: ACATS legal status cleared by FSF
@ 2001-12-06 15:01 Richard Kenner
  0 siblings, 0 replies; 30+ messages in thread
From: Richard Kenner @ 2001-12-06 15:01 UTC (permalink / raw)
  To: jsm28; +Cc: gcc

    In general I consider a patch which adds a diagnostic without
    including a test exercising that code path, or adds a language feature
    without proper tests for the associated constraints, to be defective.
    I get the impression from this discussion that these tests represent
    something similar for Ada - tests of the ways in which code can be
    defective and diagnostics issued for it - and so would be of similar
    value.  It is just as much a fundamental part of avoiding regressions
    that bad code remains diagnosed and the messages do not get worse, as
    that good code continues to compile and code quality does not get
    worse.

Well, the ACATS tests do not check code quality, but it's correct that the B
tests verify that each condition that must produce an error message does so.
And I agree that this is a worthwhile test to run.

However, I also agree with what Geert said: it is important to become
familiar with this test suite before making such decisions.  This is a very
complex test suite with a very high cost of maintenance.  You need to look at
both the benefit and cost of running each of the tests.

The problem with the B tests in particular is that the normal way of running
them is to compare the output with a baseline output and manually inspect
differences between that baseline and any different output.  This means that
a wording change in a common error message can easily affect over a thousand
baseline files.  Dealing with these tests is an esoteric specialty built up over
the last few decades.
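
The baseline workflow described above can be sketched roughly as follows.
This is only an illustration in Python, not the scripts ACT uses: the test
names, the message texts, and the helper names are all hypothetical. It
shows why a single wording change flags every baseline that contains the
message.

```python
import difflib
from typing import Dict, List


def baseline_diff(name: str, baseline: str, actual: str) -> List[str]:
    """Unified diff between a stored baseline and fresh compiler output."""
    return list(difflib.unified_diff(
        baseline.splitlines(), actual.splitlines(),
        fromfile=name + ".baseline", tofile=name + ".out", lineterm=""))


def needs_review(baselines: Dict[str, str],
                 outputs: Dict[str, str]) -> List[str]:
    """Names of tests whose output no longer matches the baseline."""
    return sorted(name for name, base in baselines.items()
                  if baseline_diff(name, base, outputs.get(name, "")))


# Hypothetical messages: one wording change touches every baseline using it.
old_msg = "b0001.adb:12:05: missing semicolon"
new_msg = 'b0001.adb:12:05: missing ";"'
baselines = {"b%04d" % i: old_msg for i in range(1, 4)}
outputs = {name: new_msg for name in baselines}
flagged = needs_review(baselines, outputs)
```

Every entry in `flagged` still has to be inspected by hand to confirm that
the difference is only the intended rewording, which is where the
maintenance cost comes from.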

It is certainly valuable to have the B tests *around* for those cases when
having a run might be useful, but requiring them as a condition for checkins
doesn't make any sense at all for changes other than to the Ada front end
(since these tests mostly don't even get out of the front end since *all* of
them have errors) and is of only marginal value for changes to the Ada front
end.

* Re: ACATS legal status cleared by FSF
  2001-12-06 14:24     ` Geert Bosch
@ 2001-12-06 14:32       ` Joseph S. Myers
  2001-12-06 15:10       ` Zack Weinberg
  1 sibling, 0 replies; 30+ messages in thread
From: Joseph S. Myers @ 2001-12-06 14:32 UTC (permalink / raw)
  To: Geert Bosch; +Cc: Zack Weinberg, guerby, gcc

On Thu, 6 Dec 2001, Geert Bosch wrote:

> It is virtually impossible for people to "break" these tests, which
> is why I say they are of no value. Even if people *do* manage to break
> them (in the hypothetical case that the maintainers would not catch the
> error before approving), this will not go unnoticed for a long
> time anyway. In the meantime, the *only* programs affected are
> programs with fatal errors to start with.

I have not examined these tests, but for C I would consider it valuable
for every constraint from either standard version, and every diagnostic
message that the front end can issue, to have at least one test.  (Ideally
the test suite should aim for high code coverage in the compiler; I
haven't tried running in conjunction with gcov to see how far away from
this we are.)  In general I consider a patch which adds a diagnostic 
without including a test exercising that code path, or adds a language 
feature without proper tests for the associated constraints, to be 
defective.  I get the impression from this discussion that these tests 
represent something similar for Ada - tests of the ways in which code can 
be defective and diagnostics issued for it - and so would be of similar 
value.  It is just as much a fundamental part of avoiding regressions that 
bad code remains diagnosed and the messages do not get worse, as that good 
code continues to compile and code quality does not get worse.

-- 
Joseph S. Myers
jsm28@cam.ac.uk

* Re: ACATS legal status cleared by FSF
  2001-12-06 11:48   ` Zack Weinberg
@ 2001-12-06 14:24     ` Geert Bosch
  2001-12-06 14:32       ` Joseph S. Myers
  2001-12-06 15:10       ` Zack Weinberg
  0 siblings, 2 replies; 30+ messages in thread
From: Geert Bosch @ 2001-12-06 14:24 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: guerby, gcc

On Thursday, December 6, 2001, at 02:40 , Zack Weinberg wrote:

> I disagree in the strongest possible terms.  Put the B tests in the
> public repository.  If you don't, you only make life harder for people
> outside of ACT who wish to work on the Ada front end.
>
> The maintenance work has to be done anyway, and ought to be the
> responsibility of the person who makes the change that causes the
> tests to regress.  If the B tests are run as part of "make check" in
> the FSF tree, this will be enforced automatically.

I'd like you to first get more familiar with the test suite before
making such strong comments. Laurent and I both have a lot of experience
with this test suite and do not see much value in running the B tests,
while the cost (in volunteer time) is high.

It is virtually impossible for people to "break" these tests, which
is why I say they are of no value. Even if people *do* manage to break
them (in the hypothetical case that the maintainers would not catch the
error before approving), this will not go unnoticed for a long
time anyway. In the meantime, the *only* programs affected are
programs with fatal errors to start with.

I would be surprised if there would even be consensus to run just
the executable ACATS tests as part of make check, since this would
already double testing time for all contributors. Adding testing
requirements is not free, and there needs to be a benefit to it.

For most of the executable ACATS tests I think there is a good
benefit/cost ratio for the front end, and even for the back end. For
that reason I am happy to see Laurent doing the work to get them
integrated. For the B tests, adding testing is of near-zero value at a
high cost.

Zack, I'd like to see very good reasons why you think it is reasonable
to significantly increase the volunteer time required to make GNAT
changes, and in that way hinder development and maintenance.

   -Geert

* Re: ACATS legal status cleared by FSF
  2001-12-06  9:34 ` Geert Bosch
@ 2001-12-06 11:48   ` Zack Weinberg
  2001-12-06 14:24     ` Geert Bosch
  0 siblings, 1 reply; 30+ messages in thread
From: Zack Weinberg @ 2001-12-06 11:48 UTC (permalink / raw)
  To: Geert Bosch; +Cc: guerby, gcc

On Thu, Dec 06, 2001 at 12:21:02PM -0500, Geert Bosch wrote:
[...]

Most of these plans seem fine by me except...

> >The B tests require a lot of maintenance (hundreds of pages of
> >changes each time you improve a message, splitting of files with too many
> >errors, etc.), and have no value for the backend. Maybe someone
> >will volunteer the packaging, but not me. I assume ACT dedicates
> >someone to this task anyway :).
> Yes, agreed. These tests just represent a lot of maintenance work for
> very little benefit, since these are only tests of programs with errors
> and indeed ACT already does this work.

I disagree in the strongest possible terms.  Put the B tests in the
public repository.  If you don't, you only make life harder for people
outside of ACT who wish to work on the Ada front end.

The maintenance work has to be done anyway, and ought to be the
responsibility of the person who makes the change that causes the
tests to regress.  If the B tests are run as part of "make check" in
the FSF tree, this will be enforced automatically.

zw

* Re: ACATS legal status cleared by FSF
  2001-12-05 15:13 guerby
                   ` (2 preceding siblings ...)
  2001-12-06  3:36 ` Geoff Keating
@ 2001-12-06  9:34 ` Geert Bosch
  2001-12-06 11:48   ` Zack Weinberg
  3 siblings, 1 reply; 30+ messages in thread
From: Geert Bosch @ 2001-12-06  9:34 UTC (permalink / raw)
  To: guerby; +Cc: gcc


On Wednesday, December 5, 2001, at 06:05 , <guerby@acm.org> wrote:
> Class A tests check for acceptance (compilation) of language
> constructs that are expected to compile without error.
>
> Class B tests  check that illegal  constructs are recognized  and
> treated as fatal  errors. They are  not expected to  successfully
> compile, bind, or execute.
>
> Class C tests  check that executable  constructs are  implemented
> correctly and produce expected results. These tests are  expected
> to  compile,  bind,   execute  and  report   "PASSED"  or   "NOT-
> APPLICABLE".  Each   class  C   test  reports   "PASSED",   "NOT-
> APPLICABLE", or "FAILED" based on  the results of the  conditions
> tested.
>
> Class D tests check that implementations perform exact arithmetic
> on large literal  numbers. These tests  are expected to  compile,
> bind, execute and report "PASSED". Each test reports "PASSED"  or
> "FAILED" based on the conditions tested. Some implementations may
> report errors at compile  time for some of  them, if the  literal
> numbers exceed compiler limits.
>
> Class E tests check for constructs that may require inspection to
> verify. They have special grading criteria that are stated within
> the test source.
>
> Class L tests check that all  library unit dependencies within  a
> program are  satisfied  before  the  program  can  be  bound  and
> executed, that  circularity  among  units is  detected,  or  that
> pragmas  that  apply  to   an  entire  partition  are   correctly
> processed.
>>>
>
> All ACATS tests are identified by a 7-character key which is more or
> less composed from Class + Reference Manual Chapter + RM Section + Key.
>
> All the files for a test begin with this 7-character key; most tests
> have only one file, but some have more than one.  An ACATS source file
> can contain multiple compilation units; to run them through GNAT we
> first need to "gnatchop" them so that each unit ends up in its own file
> with the name GNAT expects by default, then we have to guess which
> file contains the main routine, "gnatmake" it and run it.
> This increases the count of source files from 2500 to 4100.
>
> To avoid preprocessing and script-like machinery overhead, I propose
> that we commit the ACATS sources directly in the form expected by
> GNAT, so we end up just having a list of main unit names, and so a
> simple minded loop of "gnatmake x; run x" will be the only thing
> needed to run.
>
> [1] Any objection?
I think your idea of having a set of easily and quickly runnable
ACATS tests is a good one.
>
> [2] Do we want to keep some subdirectories, if so what granularity, or
> all in one dir is okay (something like 4000 files)?
From experience I find it really valuable to have separate
subdirectories per class/chapter. For example, if you work on some
changes to handling of floating point, you would initially just want
to run chapter CXG, which means the executable RM Annex G tests.
> [3] Some file names will be up to 56 characters, okay? (87% have 14
> characters or fewer.)
It might be preferable to "krunch" the names to a more manageable length
using the -gnatk option, since there are still many filesystems that
have issues with long names, and shorter names are also easier for
human beings to deal with (those who have seen the few examples of
long names in ACATS tests will agree they do not help: only the first
few characters are useful).
> [4] What should be the top directory in the gcc tree for the ACATS
> test suite?  gcc/gcc/testsuite/ada/acats or a top level directory or
> something else? I assume we want ACATS somewhat separated from other
> Ada tests generated by the GCC project, this will facilitate
> maintenance.
We should definitely keep ACATS completely separate.
> [5] IMHO it is best if the form in which we commit the sources of the
> ACATS tests is as separated as possible from the testsuite harness
> technology at first, just sources, README and a list of test names.
> Okay?
Seems OK.
> There are convoluted tests trying to see if the compiler conforms to
> the Ada compilation model initially suggested by the RM, but since RMS
> invented a new way of reading this and GNAT follows it (and
> modern proprietary compilers do the same), these tests are of no value.
> I propose we just drop them. I'll of course document all such tests
> dropped and why
Indeed, I would not bother with these. I do not see any benefit in these
for testing; mostly they just add a lot of script work and clutter for
no gain.
> The B tests require a lot of maintenance (hundreds of pages of
> changes each time you improve a message, splitting of files with too many
> errors, etc.), and have no value for the backend. Maybe someone
> will volunteer the packaging, but not me. I assume ACT dedicates
> someone to this task anyway :).
Yes, agreed. These tests just represent a lot of maintenance work for
very little benefit, since these are only tests of programs with errors
and indeed ACT already does this work.
> [7] I can help if someone wants to take over this, but ACT probably
> has to provide an initial baseline of splits and scripts, so please check
> with them first.
Even though these scripts and splits are not secret or anything, they
change rapidly and something given today will be pretty useless a couple
of weeks from now. This would just be too much of a hassle to deal with.
> I have a patch applying to all tests using the delay statement in
> order to be able to rescale them
That's OK. Please use the same casing as in the rest of the test, though.
> [8] I propose to commit directly with the patch applied, I'll of course
> commit and maintain a list of tests that were modified and why, any
> objection?
Fine. I was wondering, though, whether it would make sense to bundle
tests into bigger batches. There really is no reason why we couldn't
have tests that work on an entire chapter at a time by just making a
master procedure that calls all other tests, and I'd think that might
speed up the process. Did you ever try that, and if so, what were your
findings?
> [10] The first harness prototype will be ultra simple and should allow
> anyone to play with it so we can make progress on portability and
> features later on by patches on stuff in CVS the regular GCC way
> instead of by generating a 20 MB tarball each time we change
> something.
You did not really address the reason why you had decided not to use
the existing testing harness. I can see a number of reasons why separate
scripts would be better for ACATS, but these should be documented.
> The upstream ACATS is maintained using CVS, web documents and
> interface from <http://www.ada-auth.org/>. There's a very low volume
> public mailing list.
>
> [12] Do we want to scrap the legalese and replace it by something else?
> Do we want to identify changes or extensions made within the GCC
> project, and if so how?
I wouldn't think so. This seems to only add confusion about the
licensing. Of course, if we were to make significant changes to the
tests, we should put the files under GPL. Also any new files and
scripts should be GPL-ed.
> [13] I apply for maintainership of this stuff once committed, and will
> keep it in sync with the changes made upstream.
This would be great.

   -Geert

* Re: ACATS legal status cleared by FSF
  2001-12-05 15:13 guerby
  2001-12-05 16:21 ` Joseph S. Myers
  2001-12-05 18:00 ` Jerry van Dijk
@ 2001-12-06  3:36 ` Geoff Keating
  2001-12-06  9:34 ` Geert Bosch
  3 siblings, 0 replies; 30+ messages in thread
From: Geoff Keating @ 2001-12-06  3:36 UTC (permalink / raw)
  To: guerby; +Cc: gcc

<guerby@acm.org> writes:

> I just received confirmation from FSF that ACATS DFAR legalese means
> that we can do whatever we want with it. In this email I make a few
> descriptions and proposals, each marked [N].
...

You do plan to use dejagnu to run the tests, correct?

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>

* Re: ACATS legal status cleared by FSF
@ 2001-12-05 23:36 dewar
  0 siblings, 0 replies; 30+ messages in thread
From: dewar @ 2001-12-05 23:36 UTC (permalink / raw)
  To: gcc, guerby

Laurent's plan sounds like a good one to me!

* Re: ACATS legal status cleared by FSF
  2001-12-05 15:13 guerby
  2001-12-05 16:21 ` Joseph S. Myers
@ 2001-12-05 18:00 ` Jerry van Dijk
  2001-12-06  3:36 ` Geoff Keating
  2001-12-06  9:34 ` Geert Bosch
  3 siblings, 0 replies; 30+ messages in thread
From: Jerry van Dijk @ 2001-12-05 18:00 UTC (permalink / raw)
  To: guerby; +Cc: gcc

guerby@acm.org writes:

 > To avoid preprocessing and script-like machinery overhead, I propose
 > that we commit the ACATS sources directly in the form expected by
 > GNAT, so we end up just having a list of main unit names, and so a
 > simple minded loop of "gnatmake x; run x" will be the only thing
 > needed to run. 
 > 
 > [1] Any objection?

I assume we want to run an ACATS subset as a basic sanity check, right?
But even in that case, it would be advisable to follow the changes made
to the test suite. Your proposal means this has to be done manually by
someone. Who is going to monitor this, and propagate all changes in the
tests?

 > [2] Do we want to keep some subdirectories, if so what granularity, or
 > all in one dir is okay (something like 4000 files)?

I like the current subdirectory structure; it makes it easier if you want
to check just one test, or a set of related tests.

 > [5] IMHO it is best if the form in which we commit the sources of the
 > ACATS tests is as separated as possible from the testsuite harness
 > technology at first, just sources, README and a list of test names.
 > Okay?

I would at a minimum also add the scripts to run all tests automatically
and report on them. Also a way to run a single test for debugging.

 > The B tests require a lot of maintenance (hundreds of pages of
 > changes each time you improve a message, splitting of files with too many
 > errors, etc.), and have no value for the backend. Maybe someone
 > will volunteer the packaging, but not me. I assume ACT dedicates
 > someone to this task anyway :).

I think it's important to test the whole GNAT system, not just the backend.
The value of the B tests is that they alert you to unexpected changes. I think
that at least for now they should stay. We can add 'skip tests' for changes
that are too much work for the added value.

 > [12] Do we want to scrap the legalese and replace it by something else?
 > Do we want to identify changes or extensions made within the GCC
 > project, and if so how?

I do not think you are allowed to 'scrap the legalese'.

Basically the question is whether you want to see the test suite as a scaled
down ACATS (in which case all changes etc. need to be documented), or see
ACATS as a starting point, and develop the suite further, for example
adding regression testing for discovered bugs, etc.

My first impulse is to stick to ACATS, to follow language development,
and add extra tests separately.

Just my opinion, of course.

-- 
--  Jerry van Dijk   | email: jvandyk@attglobal.net
--  Leiden, Holland  | web:   home.trouwweb.nl/Jerry

* Re: ACATS legal status cleared by FSF
  2001-12-05 15:13 guerby
@ 2001-12-05 16:21 ` Joseph S. Myers
  2001-12-05 18:00 ` Jerry van Dijk
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 30+ messages in thread
From: Joseph S. Myers @ 2001-12-05 16:21 UTC (permalink / raw)
  To: guerby; +Cc: gcc

On Thu, 6 Dec 2001 guerby@acm.org wrote:

> [4] What should be the top directory in the gcc tree for the ACATS
> test suite?  gcc/gcc/testsuite/ada/acats or a top level directory or

That seems plausible - presumably "runtest --tool ada" will be used in
"make check" to run both those and other Ada tests?

> [8] I propose to commit directly with the patch applied, I'll of course
> commit and maintain a list of tests that were modified and why, any
> objection?

It is arguable that perhaps the CVS vendor branch mechanism should be used
- import the unmodified sources on the vendor branch with GCC's modified
version on the mainline.  This should make it easy to use CVS to see what
the differences from unmodified ACATS are.  I don't know whether whatever
tests it is deliberately decided to drop rather than include in the GCC
testsuite should also go on the vendor branch, or whether it should just
be documented which are dropped.

> The upstream ACATS is maintained using CVS, web documents and
> interface from <http://www.ada-auth.org/>. There's a very low volume
> public mailing list.

We ought to have a link to this from the Ada section of readings.html.

-- 
Joseph S. Myers
jsm28@cam.ac.uk

* Re: ACATS legal status cleared by FSF
  2001-12-05 15:28 Richard Kenner
@ 2001-12-05 15:41 ` guerby
  0 siblings, 0 replies; 30+ messages in thread
From: guerby @ 2001-12-05 15:41 UTC (permalink / raw)
  To: kenner; +Cc: gcc

> Another question is what compilation options do we run them with?

I assume you are talking about the mode everyone will be forced to run before
commit as part of the standard procedure once Ada is a first-class
citizen in GCC?

ACATS at -O0 is 32 minutes on a P3 1GHz to be compared to 25 minutes
for C only make check and 45 minutes for c,c++,f77 (which does run
each test 4 or 5 times with different flags IIRC). I assume -O2 won't
take much longer (I did run it, but don't have the timings handy).

I'll do a breakdown by chapter / by -O level in the next few days.

I have no opinion on the topic, everything is possible depending on
how much time we want the standard check to last on an average
machine. The harness must support easy flag configuration anyway.

-- 
Laurent Guerby <guerby@acm.org>

* Re:  ACATS legal status cleared by FSF
@ 2001-12-05 15:28 Richard Kenner
  2001-12-05 15:41 ` guerby
  0 siblings, 1 reply; 30+ messages in thread
From: Richard Kenner @ 2001-12-05 15:28 UTC (permalink / raw)
  To: guerby; +Cc: gcc

Another question is what compilation options do we run them with?

Do we use -O0, -O1, -O2 or do we run them with all three?

In my experience, the ACATS tests are only a moderately good test of the
backend.  The C2 tests really don't test anything in the backend at all
and the C7, C8, CA, and CB tests are nearly totally front-end tests.
C9 and CXD mostly check the library.

My suggestion is to use C3, C4, C5, C6, CD, CXA, and CXG as the tests
run regularly as back-end tests and to run them with the same collection
of options we use for the C execution tests.
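
As a rough illustration of this selection, here is a sketch in Python that
filters test names by chapter prefix and pairs each selected test with a
set of options. The test names are made up for the example, and the three
optimization levels are only an assumption taken from the question above;
the actual option set used for the C execution tests is not spelled out in
this message.

```python
from itertools import product

# Chapters suggested above as regular back-end tests.
BACKEND_CHAPTERS = ("c3", "c4", "c5", "c6", "cd", "cxa", "cxg")
# Assumption: stand-in for "the options we use for the C execution tests".
OPT_LEVELS = ("-O0", "-O1", "-O2")


def backend_subset(test_names):
    """Keep only the tests whose name starts with a back-end chapter."""
    return [n for n in test_names if n.lower().startswith(BACKEND_CHAPTERS)]


def run_matrix(test_names, options=OPT_LEVELS):
    """Every (test, option) pair that a full run would have to build."""
    return list(product(backend_subset(test_names), options))


# Hypothetical test names, one per chapter of interest.
names = ["c23001a", "c35502a", "C94005B", "cd1009a", "cxd1001", "cxg2001"]
subset = backend_subset(names)   # the C2, C9 and CXD entries are dropped
matrix = run_matrix(names)       # 3 selected tests x 3 options
```

Note that the "cd" prefix does not match CXD names, so the library-heavy
Annex D tests stay out of the subset as intended.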

* ACATS legal status cleared by FSF
@ 2001-12-05 15:13 guerby
  2001-12-05 16:21 ` Joseph S. Myers
                   ` (3 more replies)
  0 siblings, 4 replies; 30+ messages in thread
From: guerby @ 2001-12-05 15:13 UTC (permalink / raw)
  To: gcc

I just received confirmation from FSF that ACATS DFAR legalese means
that we can do whatever we want with it. In this email I make a few
descriptions and proposals, each marked [N].

The executable part of ACATS (formally A, C, D, E and L tests - see
below) comprises 2500 source files, summing to 15 MB and providing
around 2300 executable tests. Each source file has a DFAR legalese
header - attached after my signature - and a description of the test.

From the ACATS documentation

<<
Class A tests check for acceptance (compilation) of language
constructs that are expected to compile without error.

Class B tests  check that illegal  constructs are recognized  and
treated as fatal  errors. They are  not expected to  successfully
compile, bind, or execute.

Class C tests  check that executable  constructs are  implemented
correctly and produce expected results. These tests are  expected
to  compile,  bind,   execute  and  report   "PASSED"  or   "NOT-
APPLICABLE".  Each   class  C   test  reports   "PASSED",   "NOT-
APPLICABLE", or "FAILED" based on  the results of the  conditions
tested.

Class D tests check that implementations perform exact arithmetic
on large literal  numbers. These tests  are expected to  compile,
bind, execute and report "PASSED". Each test reports "PASSED"  or
"FAILED" based on the conditions tested. Some implementations may
report errors at compile  time for some of  them, if the  literal
numbers exceed compiler limits.

Class E tests check for constructs that may require inspection to
verify. They have special grading criteria that are stated within
the test source.

Class L tests check that all  library unit dependencies within  a
program are  satisfied  before  the  program  can  be  bound  and
executed, that  circularity  among  units is  detected,  or  that
pragmas  that  apply  to   an  entire  partition  are   correctly
processed.
>>

All ACATS tests are identified by a 7-character key which is more or
less composed from Class + Reference Manual Chapter + RM Section + Key.

All the files for a test begin with this 7-character key; most tests
have only one file, but some have more than one.  An ACATS source file
can contain multiple compilation units; to run them through GNAT we
first need to "gnatchop" them so that each unit ends up in its own file
with the name GNAT expects by default, then we have to guess which
file contains the main routine, "gnatmake" it and run it.
This increases the count of source files from 2500 to 4100.
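
The naming scheme can be illustrated with a small Python sketch that
groups chopped files back under their 7-character key and derives a
chapter label. The chapter rule (one letter for non-C classes, two
characters for C chapters, three for the CX annex chapters) is an
assumption read off the per-chapter count table later in this message,
and the file names are hypothetical apart from the C94005B_PKG unit that
appears in the patch hunk below.

```python
from collections import defaultdict


def key_of(file_name):
    """The 7-character key shared by all files of one ACATS test."""
    return file_name[:7].lower()


def chapter_of(key):
    """Chapter label: 'a', 'd', 'e', 'l', 'c2'..'ce', 'cz', or 'cxa'..'cxh'."""
    key = key.lower()
    if key[0] != "c":
        return key[0]
    return key[:3] if key[1] == "x" else key[:2]


def group_by_test(file_names):
    """Map each test key to the sorted list of files that belong to it."""
    groups = defaultdict(list)
    for name in sorted(file_names):
        groups[key_of(name)].append(name)
    return dict(groups)


# Hypothetical output of chopping two tests into GNAT-style file names.
files = ["c94005b.adb", "c94005b_pkg.ads", "c94005b_pkg.adb", "cxg2001.adb"]
groups = group_by_test(files)
```

Grouping by key is one way to see how a 2500-file suite grows once each
compilation unit gets its own file.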

To avoid preprocessing and script-like machinery overhead, I propose
that we commit the ACATS sources directly in the form expected by
GNAT, so we end up just having a list of main unit names, and so a
simple minded loop of "gnatmake x; run x" will be the only thing
needed to run. 
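
A minimal sketch of such a loop, written in Python purely for
illustration. It assumes that `gnatmake` is on the PATH, that each
executable lands in the current directory under the main unit's name, and
that the verdict can be recognized by looking for the words "PASSED",
"FAILED" or "NOT-APPLICABLE" in the program's output. The exact report
format is not given here, so that matching is an assumption, and the
`run` parameter exists only so the loop can be exercised without a
compiler.

```python
import subprocess


def run_test(main_unit, run=subprocess.run):
    """Build one main unit with gnatmake, run it, and classify the result."""
    build = run(["gnatmake", main_unit], capture_output=True, text=True)
    if build.returncode != 0:
        return "BUILD-FAILED"
    result = run(["./" + main_unit], capture_output=True, text=True)
    # Assumption: the verdict appears as a plain word in standard output.
    for verdict in ("FAILED", "NOT-APPLICABLE", "PASSED"):
        if verdict in result.stdout:
            return verdict
    return "UNKNOWN"


def run_all(main_units, run=subprocess.run):
    """The 'gnatmake x; run x' loop over a list of main unit names."""
    return {unit: run_test(unit, run=run) for unit in main_units}
```

With a plain text file listing one main unit per line, calling `run_all`
on the lines of that file would be the whole harness.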

[1] Any objection?

[2] Do we want to keep some subdirectories, if so what granularity, or
all in one dir is okay (something like 4000 files)?

[3] Some file names will be up to 56 characters, okay? (87% have 14
characters or fewer.)

[4] What should be the top directory in the gcc tree for the ACATS
test suite?  gcc/gcc/testsuite/ada/acats or a top level directory or
something else? I assume we want ACATS somewhat separated from other
Ada tests generated by the GCC project, this will facilitate
maintenance.

[5] IMHO it is best if the form in which we commit the sources of the
ACATS tests is as separated as possible from the testsuite harness
technology at first, just sources, README and a list of test names.
Okay?

There are convoluted tests trying to see if the compiler conforms to
the Ada compilation model initially suggested by the RM, but since RMS
invented a new way of reading this and GNAT follows it (and
modern proprietary compilers do the same), these tests are of no value.
I propose we just drop them. I'll of course document all such tests
dropped and why.

[6] Okay?

The B tests require a lot of maintenance (hundreds of pages of
changes each time you improve a message, splitting of files with too many
errors, etc.), and have no value for the backend. Maybe someone
will volunteer the packaging, but not me. I assume ACT dedicates
someone to this task anyway :).

[7] I can help if someone wants to take over this, but ACT probably
has to provide an initial baseline of splits and scripts, so please check
with them first.

Here is a rough count of executable tests by class and chapter (lower
bound, some are omitted):

   4 ; cz Check that the test reporting stuff works
  75 ; a  
  34 ; c2 Lexical Elements
 351 ; c3 Declarations and Types
 339 ; c4 Names and Expressions
  95 ; c5 Statements
  81 ; c6 Subprograms 
  51 ; c7 Packages
 140 ; c8 Visibility Rules
 255 ; c9 Tasks and Synchronization
  74 ; ca Program Structure and Compilation Issues
  43 ; cb Exceptions
 117 ; cc Generic Units
 173 ; cd Representation Issues
 268 ; ce Predefined Language Environment (Ada 83)
  87 ; cxa Predefined Language Environment (Ada 95)
  30 ; cxb Interface to Other Languages
  13 ; cxc Systems Programming
  38 ; cxd Real-Time Systems
   1 ; cxe Distributed Systems 
  20 ; cxf Information Systems
  29 ; cxg Numerics
   4 ; cxh Safety and Security
   4 ; d  
  11 ; e  
  26 ; l  
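
As a quick arithmetic check on the table above, the following Python
snippet sums the listed counts. It also totals the chapters that were
suggested elsewhere in this thread as regular back-end tests (C3, C4, C5,
C6, CD, CXA, CXG). The numbers are copied from the table; nothing else is
assumed.

```python
# Per-chapter counts copied from the table above (lower bounds).
COUNTS = {
    "cz": 4, "a": 75, "c2": 34, "c3": 351, "c4": 339, "c5": 95, "c6": 81,
    "c7": 51, "c8": 140, "c9": 255, "ca": 74, "cb": 43, "cc": 117,
    "cd": 173, "ce": 268, "cxa": 87, "cxb": 30, "cxc": 13, "cxd": 38,
    "cxe": 1, "cxf": 20, "cxg": 29, "cxh": 4, "d": 4, "e": 11, "l": 26,
}

total = sum(COUNTS.values())
backend = sum(COUNTS[c] for c in ("c3", "c4", "c5", "c6", "cd", "cxa", "cxg"))

print(total)    # 2363, in line with "around 2300 executable tests"
print(backend)  # 1155, roughly half of the listed tests
```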

I have a patch applying to all tests using the delay statement in
order to be able to rescale them (their default values in seconds are
absurd for modern machines); the patch is composed of hunks like:

+with Impdef;
 PACKAGE BODY C94005B_PKG IS

      TASK BODY TT IS
@@ -59,7 +59,7 @@
           ACCEPT E (I : INTEGER) DO
                LOCAL := I;
           END E;
-          DELAY 60.0;    -- SINCE THE PARENT UNIT HAS HIGHER PRIORITY
+          DELAY 60.0 * Impdef.One_Second;    -- SINCE THE PARENT UNIT HAS HIGHER PRIORITY
                          -- AT THIS POINT, IT WILL RECEIVE CONTROL AND
                          -- TERMINATE IF THE ERROR IS PRESENT.
           GLOBAL := LOCAL;

[8] I propose to commit directly with the patch applied, I'll of course
commit and maintain a list of tests that were modified and why, any
objection?

[9] Before doing the commit, I'll need a place to put the
various prototype tarballs to reflect discussions and proposals. Could
someone give me access to an ftp server somewhere?

[10] The first harness prototype will be ultra simple and should allow
anyone to play with it so we can make progress on portability and
features later on by patches on stuff in CVS the regular GCC way
instead of by generating a 20 MB tarball each time we change
something.

The upstream ACATS is maintained using CVS, web documents and
interface from <http://www.ada-auth.org/>. There's a very low volume
public mailing list.

[12] Do we want to scrap the legalese and replace it by something else?
Do we want to identify changes or extensions made within the GCC
project, and if so how?

[13] I apply for maintainership of this stuff once committed, and will
keep it in sync with the changes made upstream.

-- 
Laurent Guerby <guerby@acm.org>

--                             Grant of Unlimited Rights
--
--     Under contracts F33600-87-D-0337, F33600-84-D-0280, MDA903-79-C-0687,
--     F08630-91-C-0015, and DCA100-97-D-0025, the U.S. Government obtained 
--     unlimited rights in the software and documentation contained herein.
--     Unlimited rights are defined in DFAR 252.227-7013(a)(19).  By making 
--     this public release, the Government intends to confer upon all 
--     recipients unlimited rights  equal to those held by the Government.  
--     These rights include rights to use, duplicate, release or disclose the 
--     released technical data and computer software in whole or in part, in 
--     any manner and for any purpose whatsoever, and to have or permit others 
--     to do so.
--
--                                    DISCLAIMER
--
--     ALL MATERIALS OR INFORMATION HEREIN RELEASED, MADE AVAILABLE OR
--     DISCLOSED ARE AS IS.  THE GOVERNMENT MAKES NO EXPRESS OR IMPLIED 
--     WARRANTY AS TO ANY MATTER WHATSOEVER, INCLUDING THE CONDITIONS OF THE
--     SOFTWARE, DOCUMENTATION OR OTHER INFORMATION RELEASED, MADE AVAILABLE 
--     OR DISCLOSED, OR THE OWNERSHIP, MERCHANTABILITY, OR FITNESS FOR A
--     PARTICULAR PURPOSE OF SAID MATERIAL.

end of thread, other threads:[~2001-12-10  3:32 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2001-12-09 15:06 ACATS legal status cleared by FSF dewar
2001-12-09 15:55 ` Joseph S. Myers
  -- strict thread matches above, loose matches on Subject: below --
2001-12-09 19:03 dewar
2001-12-09 14:00 dewar
2001-12-07 19:12 dewar
2001-12-09 13:02 ` Zack Weinberg
2001-12-09 14:52   ` guerby
2001-12-09 19:47     ` Geert Bosch
2001-12-07 18:57 dewar
2001-12-07 18:50 mike stump
2001-12-07 17:59 dewar
2001-12-07  3:18 Richard Kenner
2001-12-06 19:09 dewar
2001-12-06 17:38 dewar
2001-12-06 15:40 Richard Kenner
2001-12-06 15:01 Richard Kenner
2001-12-05 23:36 dewar
2001-12-05 15:28 Richard Kenner
2001-12-05 15:41 ` guerby
2001-12-05 15:13 guerby
2001-12-05 16:21 ` Joseph S. Myers
2001-12-05 18:00 ` Jerry van Dijk
2001-12-06  3:36 ` Geoff Keating
2001-12-06  9:34 ` Geert Bosch
2001-12-06 11:48   ` Zack Weinberg
2001-12-06 14:24     ` Geert Bosch
2001-12-06 14:32       ` Joseph S. Myers
2001-12-06 15:10       ` Zack Weinberg
2001-12-06 15:41         ` Geert Bosch
2001-12-06 18:22           ` Zack Weinberg
