From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-return-42383-listarch-gcc=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 9485 invoked by alias); 6 Dec 2001 23:00:22 -0000
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive: <http://gcc.gnu.org/ml/gcc/>
List-Post: <mailto:gcc@gcc.gnu.org>
List-Help: <http://gcc.gnu.org/ml/>
Sender: gcc-owner@gcc.gnu.org
Received: (qmail 9410 invoked from network); 6 Dec 2001 23:00:17 -0000
Received: from unknown (HELO vlsi1.ultra.nyu.edu) (128.122.140.213)
  by sources.redhat.com with SMTP; 6 Dec 2001 23:00:17 -0000
Received: by vlsi1.ultra.nyu.edu (4.1/1.34)
	id AA04282; Thu, 6 Dec 01 17:54:52 EST
Date: Thu, 06 Dec 2001 15:01:00 -0000
From: kenner@vlsi1.ultra.nyu.edu (Richard Kenner)
Message-Id: <10112062254.AA04282@vlsi1.ultra.nyu.edu>
To: jsm28@cam.ac.uk
Subject: Re: ACATS legal status cleared by FSF
Cc: gcc@gcc.gnu.org
X-SW-Source: 2001-12/txt/msg00319.txt.bz2

    In general I consider a patch which adds a diagnostic without
    including a test exercising that code path, or adds a language feature
    without proper tests for the associated constraints, to be defective.
    I get the impression from this discussion that these tests represent
    something similar for Ada - tests of the ways in which code can be
    defective and diagnostics issued for it - and so would be of similar
    value.  It is just as much a fundamental part of avoiding regressions
    that bad code remains diagnosed and the messages do not get worse, as
    that good code continues to compile and code quality does not get
    worse.

Well, the ACATS tests do not check code quality, but it's correct that the B
tests verify that each condition that must produce an error message does so.
And I agree that this is a worthwhile test to run.

However, I also agree with what Geert said: it is important to become
familiar with this test suite before making such decisions.  This is a very
complex test suite with a very high cost of maintenance.  You need to look at
both the benefit and cost of running each of the tests.

The problem with the B tests in particular is that the normal way of running
them is to compare the output with a baseline output and manually inspect
differences between that baseline and any different output.  This means that
a wording change in a common error message can easily affect over a thousand
baseline files.  Dealing with these tests is an esoteric specialty built up
over the last few decades.
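
To make the maintenance cost concrete, here is a rough sketch of that kind of
baseline comparison.  It is not the actual ACATS tooling: the test names, the
message texts, and the helper functions are all hypothetical, and the only
library used is Python's standard difflib.  It simulates rewording one common
message and counts how many baselines would then need manual inspection.

```python
import difflib

def changed_baselines(baselines, outputs):
    """Return the names of tests whose output no longer matches the baseline."""
    return sorted(name for name, expected in baselines.items()
                  if outputs.get(name) != expected)

def show_diff(name, baselines, outputs):
    """Build the unified diff a person would have to read for one test."""
    return "\n".join(difflib.unified_diff(
        baselines[name].splitlines(), outputs[name].splitlines(),
        fromfile=name + ".baseline", tofile=name + ".out", lineterm=""))

# Hypothetical data: 1000 tests that all trigger the same common message.
old_msg = "b{:04d}.adb:3:4: missing \";\""
new_msg = "b{:04d}.adb:3:4: missing semicolon"   # the reworded message
baselines = {f"b{i:04d}": old_msg.format(i) for i in range(1000)}
outputs = {f"b{i:04d}": new_msg.format(i) for i in range(1000)}

stale = changed_baselines(baselines, outputs)
print(len(stale), "baselines need manual inspection")
print(show_diff(stale[0], baselines, outputs))
```

Every one of those differences is harmless, but someone still has to look at
each to confirm that no real diagnostic was lost along the way.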

It is certainly valuable to have the B tests *around* for those cases when
a run might be useful, but requiring them as a condition for checkins
doesn't make any sense at all for changes other than to the Ada front end:
*all* of these tests contain errors, so they mostly don't even get out of
the front end.  Even for changes to the Ada front end, such a requirement is
of only marginal value.