From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-396261-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 126168 invoked by alias); 28 Apr 2015 23:13:03 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 126159 invoked by uid 89); 28 Apr 2015 23:13:03 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00,SPF_HELO_PASS,SPF_PASS,T_RP_MATCHES_RCVD autolearn=ham version=3.3.2
X-HELO: mx1.redhat.com
Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Tue, 28 Apr 2015 23:13:01 +0000
Received: from int-mx14.intmail.prod.int.phx2.redhat.com (int-mx14.intmail.prod.int.phx2.redhat.com [10.5.11.27])	by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id t3SNCvFP025958	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL);	Tue, 28 Apr 2015 19:12:57 -0400
Received: from localhost.localdomain (ovpn-113-143.phx2.redhat.com [10.3.113.143])	by int-mx14.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id t3SNCv0U005370;	Tue, 28 Apr 2015 19:12:57 -0400
Message-ID: <554013F8.9050300@redhat.com>
Date: Tue, 28 Apr 2015 23:18:00 -0000
From: Jeff Law <law@redhat.com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0
MIME-Version: 1.0
To: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com
Subject: Re: Mostly rewrite genrecog
References: <87egn5yis1.fsf@e105548-lin.cambridge.arm.com>
In-Reply-To: <87egn5yis1.fsf@e105548-lin.cambridge.arm.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-IsSubscribed: yes
X-SW-Source: 2015-04/txt/msg01801.txt.bz2

On 04/27/2015 04:20 AM, Richard Sandiford wrote:
> I think it's been the case for a while that parallel builds of GCC tend
> to serialise around the compilation of insn-recog.c, especially with
> higher --enable-checking settings.  This patch tries to speed that
> up by replacing most of genrecog with a new algorithm.
>
> I think the main problems with the current code are:
>
> 1. Vector architectures have added lots of new instructions that have
>     a similar shape and differ only in mode, code or unspec number.
>     The current algorithm doesn't have any way of factoring out those
>     similarities.
>
> 2. When matching a particular instruction, the current code examines
>     everything about a SET_DEST before moving on to the SET_SRC.  This has
>     two subproblems:
>
>     2a. The destination of a SET isn't very distinctive.  It's usually
>         just a register_operand, a memory_operand, a nonimmediate_operand
>         or a flags register.  We therefore tend to backtrack to the
>         SET_DEST a lot, oscillating between groups of instructions with
>         the same kind of destination.
>
>     2b. Backtracking through predicate checks is relatively expensive.
>         It would be good to narrow down the "shape" of the instruction
>         first and only then check the predicates.  (The backtracking is
>         expensive in terms of insn-recog.o compile time too, both because
>         we need to copy into argument registers and out of the result
>         register, and because it adds more sites where spills are needed.)
>
> 3. The code keeps one local variable per rtx depth, so it ends up
>     loading the same rtx many times over (mostly when backtracking).
>     This is very expensive in rtl-checking builds because each XEXP
>     includes a code check and a line-specific failure call.
>
>     In principle the idea of having one local variable per depth
>     is good.  But it was originally written that way when all optimisations
>     were done at the rtl level and I imagine each local variable mapped
>     to one pseudo register.  These days the statements that reload the
>     value needed on backtracking lead to many more SSA names and phi
>     statements than you'd get with just a single variable per position
>     (loaded once, so naturally SSA).  There is still the potential benefit
>     of avoiding having sibling rtxes live at once, but fixing (2) above
>     reduces that problem.
>
> Also, the code is all goto-based, which makes it rather hard to step through.
>
> The patch deals with these as follows:
>
> 1. Detect subpatterns that differ only by mode, code and/or integer
>     (e.g. unspec number) and split them out into a common routine.
>
> 2. Match the "shape" of the instruction first, in terms of codes,
>     integers and vector lengths, and only then check the modes, predicates
>     and dups.  When checking the shape, handle SET_SRCs before SET_DESTs.
>     In practice this seems to greatly reduce the amount of backtracking.
>
> 3. Have one local variable per rtx position.  I tested the patch with
>     and without the change and it helped a lot with rtl-checking builds
>     without seeming to affect release builds much either way.
>
> As far as debuggability goes, the new code avoids gotos and just
> uses "natural" control flow.
>
> The headline stat is that a stage 3 --enable-checking=yes,rtl,df
> build of insn-recog.c on my box goes from 7m43s to 2m2s (using the
> same stage 2 compiler).  The corresponding --enable-checking=release
> change is from 49s to 24s (less impressive, as expected).
>
> The patch seems to speed up recog.  E.g. the time taken to build
> fold-const.ii goes from 6.74s before the patch to 6.69s after it;
> not a big speed-up, but reproducible.
[ big snip ]

I don't see anything that makes me think we don't want this :-)

It's interesting to note that the regressions in loadable size of a 
release-checking insn-recog aren't any mainstream ports.  Hell, I'd 
probably claim they're all fringe ports (iq2000, m68k, mcore, 
microblaze, pdp11, rx)


> Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-none-eabi.
> Also tested by building the testsuite for each of the targets above
> and making sure there were no assembly differences.  Made sure that no
> objects in spec2k6 changed for aarch64-linux-gnu (except for perlbench
> perl.o and cactusADM datestamp.o, where the differences are timestamps).
> OK to install?
To be very blunt, I'm probably not capable of reviewing the new code. 
You're going to know it best and you should probably own it.

Given your history with gcc and attention to detail, I'm comfortable 
with approving this knowing you'll deal with any issues as they arise.

The one thing I would ask is that you make sure to include the recently 
added matching constraint checking bits in genrecog.  I'm assuming all 
the older sanity checks that have been in genrecog for a longer period 
of time have been kept.

Jeff