From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 8609 invoked by alias); 12 Aug 2015 08:43:35 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 8600 invoked by uid 89); 12 Aug 2015 08:43:34 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=0.4 required=5.0 tests=AWL,BAYES_50,KAM_LAZY_DOMAIN_SECURITY,RP_MATCHES_RCVD autolearn=no version=3.3.2
X-HELO: foss.arm.com
Received: from foss.arm.com (HELO foss.arm.com) (217.140.101.70) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 12 Aug 2015 08:43:33 +0000
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 54E1E75; Wed, 12 Aug 2015 01:43:29 -0700 (PDT)
Received: from e105689-lin.cambridge.arm.com (e105689-lin.cambridge.arm.com [10.2.207.32]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 051CE3F21A; Wed, 12 Aug 2015 01:43:30 -0700 (PDT)
Message-ID: <55CB0731.30905@foss.arm.com>
Date: Wed, 12 Aug 2015 08:43:00 -0000
From: Richard Earnshaw
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0
MIME-Version: 1.0
To: Richard Earnshaw, Richard Henderson, "gcc-patches@gcc.gnu.org"
CC: David Edelsohn, Marcus Shawcroft
Subject: Re: [PATCH ppc64,aarch64,alpha 00/15] Improve backend constant generation
References: <1439341904-9345-1-git-send-email-rth@redhat.com> <55CB0487.1020505@foss.arm.com>
In-Reply-To: <55CB0487.1020505@foss.arm.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-IsSubscribed: yes
X-SW-Source: 2015-08/txt/msg00573.txt.bz2

On 12/08/15 09:32, Richard Earnshaw wrote:
> On 12/08/15 02:11, Richard Henderson wrote:
>> Something last week had me looking at ppc64 code generation,
>> and some of what I saw was fairly bad.
>> Fixing it wasn't going to be easy, due to the fact that the
>> logic for generating constants wasn't contained within a
>> single function.
>>
>> Better is the way that aarch64 and alpha have done it in the
>> past, sharing a single function with all of the logic that
>> can be used for both cost calculation and the actual emission
>> of the constants.
>>
>> However, the way that aarch64 and alpha have done it hasn't
>> been ideal, in that there's a fairly costly search that must
>> be done every time.  I've thought before about changing this
>> so that we would be able to cache results, akin to how we do
>> it in expmed.c for multiplication.
>>
>> I've implemented such a caching scheme for three targets, as
>> a test of how much code could be shared.  The answer appears
>> to be about 100 lines of boilerplate.  Minimal, true, but it
>> may still be worth it as a way of encouraging backends to do
>> similar things in a similar way.
>>
>
> I've got a short week this week, so won't have time to look at this in
> detail for a while.  So a bunch of questions... but not necessarily
> objections :-)
>
> How do we clear the cache, and when?  For example, on ARM, switching
> between ARM and Thumb state means we need to generate potentially
> radically different sequences.  We can do such splitting at function
> boundaries now.
>
> Can we generate different sequences for hot/cold code within a single
> function?
>
> Can we cache sequences with the context (e.g. use with AND, OR, ADD, etc.)?
>
>
>> Some notes about ppc64 in particular:
>>
>>   * Constants aren't split until quite late, preventing all hope of
>>     CSE'ing portions of the generated code.  My gut feeling is that
>>     this is in general a mistake, but...
>>
>>     I did attempt to fix it, and got nothing for my troubles except
>>     poorer code generation for AND/IOR/XOR with non-trivial constants.
>>
> On AArch64 in particular, building complex constants is generally
> destructive on the source register (if you want to preserve intermediate
> values you have to make intermediate copies); that's clearly never going
> to be a win if you don't need at least 3 instructions to form the
> constant.
>
> There might be some cases where you could form a second constant as a
> difference from an earlier one, but that then creates data-flow
> dependencies and in OoO machines that might not be worthwhile.  Even
> for in-order machines it can restrict scheduling and result in worse code.
>
>
>> I'm somewhat surprised that the operands to the logicals aren't
>> visible at rtl generation time, given all the work done in gimple.
>> And failing that, combine has enough REG_EQUAL notes that it ought
>> to be able to put things back together and see the simpler pattern.
>>
>
> We've tried it in the past.  Exposing the individual steps prevents the
> higher-level rtl-based optimizations since they can no longer deal with
> the complete sub-expression.

E.g. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63724

R.
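
A rough illustration of the "at least 3 instructions" threshold mentioned
above for AArch64.  This is only a sketch under a simplifying assumption:
it considers just the plain MOVZ/MOVK idiom, where each non-zero 16-bit
chunk of a 64-bit constant costs one instruction.  It ignores MOVN and
logical-immediate encodings, which can shorten real sequences, so the
count is an upper bound rather than what the backend actually emits.
Both function names are hypothetical and do not come from GCC.

```python
def movz_movk_count(value: int) -> int:
    """Instruction count of a plain MOVZ + MOVK sequence for a 64-bit constant.

    Assumption: one instruction per non-zero 16-bit chunk, and a single
    MOVZ for zero.  MOVN and bitmask immediates are deliberately ignored.
    """
    value &= (1 << 64) - 1  # treat the input as an unsigned 64-bit value
    chunks = [(value >> shift) & 0xFFFF for shift in (0, 16, 32, 48)]
    nonzero = sum(1 for chunk in chunks if chunk != 0)
    return max(1, nonzero)


def might_justify_intermediate_copy(value: int, threshold: int = 3) -> bool:
    """Hypothetical filter: below the threshold, sharing is never a win.

    Reaching the threshold does not mean sharing is profitable; it only
    means the constant is not ruled out by the instruction count alone.
    """
    return movz_movk_count(value) >= threshold


if __name__ == "__main__":
    for constant in (0x1234, 0x123400005678, 0x123456789ABCDEF0):
        print(hex(constant), movz_movk_count(constant),
              might_justify_intermediate_copy(constant))
```

Under this model a constant such as 0x123400005678 needs only two
instructions, so keeping intermediate values around for it could not pay
off by the argument above.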