From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 127505 invoked by alias); 25 Oct 2019 06:17:42 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 127137 invoked by uid 89); 25 Oct 2019 06:17:42 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-9.2 required=5.0 tests=AWL,BAYES_00,GIT_PATCH_2,KAM_ASCII_DIVIDERS,KAM_MANYTO,KAM_SHORT autolearn=ham version=3.3.1 spammy=ancient, Dave, comfort, H*MI:sk:CO2PR00 X-HELO: us-smtp-1.mimecast.com Received: from us-smtp-delivery-1.mimecast.com (HELO us-smtp-1.mimecast.com) (205.139.110.120) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 25 Oct 2019 06:17:39 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1571984257; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=rtm2Cg7fSUDdGkRTqJXdTCRE6xrV/ue/dXjmIE2yNe8=; b=WeZku6sOms1dlnl/XUnOL35FEvpldIS045dxF4sidotDug7nlvOnsz423wDfCallRI5MIF OCabE1UAfbS/W396/uakJu/JfgiDBe0V1A9inIU1zSWU9gBB0T/GgLRVl6C5N9Mrra+gC6 bSghGHyb/PGs7DPSjdXdSBAkQydT5vw= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-278-BIzRGxMWP2aiUU4t4nnTuA-1; Fri, 25 Oct 2019 02:17:33 -0400 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 7E91B1800DCA; Fri, 25 Oct 2019 06:17:32 +0000 (UTC) Received: from ovpn-117-109.phx2.redhat.com (ovpn-117-109.phx2.redhat.com [10.3.117.109]) by smtp.corp.redhat.com (Postfix) with ESMTP id 84CF3600C6; Fri, 25 Oct 2019 06:17:31 +0000 (UTC) Message-ID: Subject: Re: GCC selftest improvements From: David Malcolm To: Andrew Dean , "gcc@gcc.gnu.org" , "ro@CeBiTec.Uni-Bielefeld.DE" , "mikestump@comcast.net" , "law@redhat.com" , "jason@redhat.com" Cc: Gabriel Dos Reis Date: Fri, 25 Oct 2019 06:17:00 -0000 In-Reply-To: References: User-Agent: Evolution 3.32.4 (3.32.4-1.fc30) MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-IsSubscribed: yes X-SW-Source: 2019-10/txt/msg00154.txt.bz2 On Thu, 2019-10-24 at 20:50 +0000, Andrew Dean via gcc wrote: > TLDR: I'd like to propose adding a dependency on a modern unit > testing framework to make it easier to write unit tests within GCC. > Before I spend much more time on it, what sort of buy-in should I > get? Are there any people in particular I should work more closely > with as I make this change? >=20=20 > Terminology: Within GCC, there are two types of tests in place: unit > tests and regression tests. The unit tests have been written with a > home-grown selftest framework and run as part of the build process. > Any failures to a unit test results in no compiler being produced. > The regression tests, on the other hand, run after build, and use the > separate DejaGnu framework. In this email, I am only concerning > myself with the unit tests, and throughout the remainder of the > email, any mention of tests refers to these. >=20=20 > Working on GCC, I wanted to add some new unit tests to my feature as > I went, but I noticed that there is a good deal of friction involved. > Right now, adding new unit tests requires writing the test method, > then modifying a second place in the code to call said test method, > repeating as necessary until getting all the way to either the > selftest.c file or the target hook. There is also no way to do test > setup/teardown automatically. Everything is manual. >=20=20 > I'd like to propose adding a dependency on a modern open-source unit > testing framework as an enhancement to the current self test system. > I have used Catch2 (https://github.com/catchorg/Catch2, Boost > Software License 1.0) with great success in the past. I experimented > with adding it to GCC and converting a handful of tests to use > Catch2. Although I only converted a small number of tests, I didn't > see any performance impact during selftest. As a bonus, while doing > so, I actually found that one test that I had written previously > wasn't actually being run, because I had failed to manually call it. >=20=20 > Some nice things that Catch2 provides are better error reporting (see > below for a comparison), ease of adding new tests (just include the > header and write a TEST_CASE(), as opposed to the manual plumbing > required right now), extension points for adding custom comparisons > (I could see this being very useful to expand on the current rtl test > macros), and the ability to run a subset of the tests without > recompiling. It is also easy to integrate Catch2 with the existing > self-test framework. >=20=20 > If this path seems useful to others, I'm happy to pursue it further. > A list of work items I see are: >=20=20 > 1. Convert more tests to verify the claim that build performance is > not degraded > 2. Update the docs to list Catch2 as the new recommended way to write > unit tests > 3. If all of the target self-tests are converted, then we can remove > the target test hook. Similar for the lang test hook. >=20=20 > One thing that would make Catch2 an even more slam-dunk case was if > we were able to enable exceptions for the check builds. Then, running > the unit tests could report multiple failures at the same time > instead of just aborting at the first one. That said, even without > enabling exceptions, Catch2 is on par with the current selftest in > terms of terminating at the first failure. >=20=20 > Another option is to use a test framework that doesn't use > exceptions, such as Google Test (https://github.com/google/googletest > , BSD 3-Clause "New" or "Revised" License). I personally think Catch2 > is more flexible, or I would lead with Google Test. For example, in > Catch2, shared setup is done in place with the tests itself, having > each subtest be a nested SECTION, where-as in GTest, you have to > write a test class that derives from ::test and overrides SetUp(). In > addition, the sections in Catch2 can be nested further, allowing > several related tests to build on each other.=20 >=20=20 > Here is some sample output for the case where all the tests are > passing: > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > All tests passed (25 assertions in 5 test cases) >=20=20 > And here is the output when a test fails: > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > ~~~~~~~~~~ > is a Catch v2.9.2 host application. > Run with -? for options >=20=20 > ------------------------------------------------------------------- > ------------ > test_set_range > ------------------------------------------------------------------- > ------------ > ../../gcc/bitmap.c:2661 > ..................................................................... > .......... > ../../gcc/bitmap.c:2668: FAILED: > REQUIRE( 6 =3D=3D bitmap_count_bits (b) ) > with expansion: > 6 =3D=3D 5 >=20=20 > Catch will terminate because it needed to throw an exception. > The message was: Test failure requires aborting test! > terminate called without an active exception > ../../gcc/bitmap.c:2668: FAILED: > {Unknown expression after the reported line} > due to a fatal error condition: > SIGABRT - Abort (abnormal termination) signal > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > test cases: 2 | 1 passed | 1 failed > assertions: 5 | 3 passed | 2 failed > cc1: internal compiler error: Aborted > > Please submit a full bug report, > with preprocessed source if appropriate. > Please include the complete backtrace with any bug report. > See for instructions. >=20=20 > (Note that at the moment it doesn't know the name of our application > or it would have prefixed "is a Catch..." with our app name). >=20=20 > Compare that to the output of the current test framework: > ../../gcc/bitmap.c:2669: test_set_range: FAIL: ASSERT_EQ ((6), > (bitmap_count_bits (b))) > cc1: internal compiler error: in fail, at selftest.c:47 > /bin/bash ../../gcc/../move-if-change tmp-macro_list macro_list > echo timestamp > s-macro_list > > Please submit a full bug report, > with preprocessed source if appropriate. > Please include the complete backtrace with any bug report. > See for instructions. Thanks for your email, it looks interesting. Is your code somewhere we can see it?=20 I'm the author of gcc's selftest framework (and I use it heavily e.g. for testing the diagnostics subsystem [1]). It went through substantial changes during review. I looked over my notes; for reference, here's a summary of the history of the patches: v1: https://gcc.gnu.org/ml/gcc-patches/2015-06/msg00765.html "[PATCH 00/17] RFC: Addding a unit testing framework to gcc" Used Google Test framework, was a dummy "frontend" v2: https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01224.html "[PATCH/RFC]: unittesting v2: as a plugin (was Re: [PATCH 00/17] RFC: Addding a unit testing framework to gcc)" Still Google Test, as a plugin rather than a frontend v3: https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02947.html "[PATCH 00/16] Unit tests framework (v3)" Still Google Test, as a plugin built by plugin.exp within DejaGnu tests, with a custom gtest reporter Some discussion about gtest: https://gcc.gnu.org/ml/gcc-patches/2015-10/msg03215.html =20=20=20=20 v4: https://gcc.gnu.org/ml/gcc-patches/2015-11/msg02379.html "[PATCH 00/15] Unittests framework v4: -fself-test" Done via "-fself-test" via compiling a dummy file in DejaGnu tests I believe it was at this point that I switched to a custom API that resembles gtest, rather than gtest itself. v5: https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00082.html "[PATCH 00/21] Add -fself-test framework for fast, early unit-testing=20 (unittests v5)" Done via "-fself-test" at each of the 3 stages of bootstrap. v6: https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00210.html "[PATCH 00/16] v6 of -fself-test/unit-testing patch" Switched to "abort on first failure" Eliminated runner class, and from self-registrating tests to manual test invocation v7: https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00298.html "[PATCH] Selftest framework (v7)" (one combined patch) v8: approved; committed v8 to trunk as r237144 (for gcc 7): https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00410.html Notable followups: 2016-07-11: * r238209: "Support running the selftests under valgrind" * https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00437.html 2017-12-11: * r255563: "Expensive selftests: torture testing for fix-it boundary conditions (PR c/82050)" * https://gcc.gnu.org/ml/gcc-patches/2017-11/msg02459.html (some tests run as a DejaGnu-built plugin) 2018-04-30: * r259782: "selftest: remove "Yoda ordering" in assertions" * https://gcc.gnu.org/ml/gcc-patches/2018-04/msg01323.html 2018-10-17: * r265240: "Run selftests for C++ as well as C" * https://gcc.gnu.org/ml/gcc-patches/2018-10/msg00802.html I think the consensus during review was that I was over-engineering things, and that each iteration from v1 to v8 made the code simpler and involved less C++ "magic", closer to C. Whether that's still the consensus I don't know. Different people within the GCC dev community have different comfort levels with C++, and my initial version (using gtest) was probably too "magic" for some. Maybe people here are more comfortable with C++ now? GCC has some rather unique requirements, in that we support a great many build configurations, some of which are rather primitive - for example, requiring just C++98 with exceptions disabled, in that we want to be able to be bootstrappable on relatively "ancient" configurations. IIRC auto-registration of tests requires that the build configuration have a sufficiently sane implementation of C++ - having globals with non-trivial ctors tends to be problematic when dealing with early implementations of C++. Personally I don't find the manual registration of tests to be a pain, but it would be nice to have more readable errors on failures. There's probably a case for more verbose test output. (generally I immediately just do "make selftest-gdb" on failures; the issue is if it suddenly starts failing on a build I don't have access to) I suspect that exceptions would be a deal-breaker; does Catch2 support -fno-exceptions? As for setup/teardown, I've been able to do that "manually" using RAII- style classes in test functions. Thanks again for your email; hope this is constructive. Dave [1] see e.g. the selftests in gcc/input.c and gcc/diagnostic-show- locus.c