Date: Fri, 17 Oct 2014 08:00:00 -0000
From: Richard Biener
To: Sebastian Pop
cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH][0/n] Merge from match-and-simplify
In-Reply-To: <20141016203852.GB29134@f1.c.bardezibar.internal>

On Thu, 16 Oct 2014, Sebastian Pop wrote:

> Richard Biener wrote:
> >
> > I have posted 5 patches as part of a larger series to merge
> > (parts) from the match-and-simplify branch.  While I think
> > there was overall consensus that the idea behind the project
> > is sound there are technical questions left for how the
> > thing should look in the end.  I've raised them in 3/n
> > which is the only patch of the series that contains any
> > patterns so far.
> >
> > To re-iterate here (as I expect most people will only look
> > at [0/n] patches ;)), the question is whether we are fine
> > with making fold-const (thus fold_{unary,binary,ternary})
> > not handle some cases it handles currently.
>
> I have tested on aarch64 all the code in the match-and-simplify branch
> against trunk as of the last merge at r216315:
>
>     2014-10-16  Richard Biener
>
>     Merge from trunk r216235 through r216315.
>
> Overall, I see more perf regressions (about 2/3 of the tests) than
> improvements (1/3 of the tests).  I will try to reduce tests.

Note that the branch goes much further in exercising the machinery
than I want to merge at this point (that applies mostly to all
passes using the SSA propagator such as CCP and VRP and passes
exercising value-numbering - FRE and PRE).  It may also simply
show the effect of now folding all statements from tree-ssa-forwprop.c.

I have yet to investigate the testsuite fallout of [1/n] to [5/n] -
test results have been very noisy lately due to the C11 change and
now ICF.

> For instance, saxpy regresses at -O3 on aarch64:
>
> void saxpy(double* x, double* y, double* z) {
>   int i=0;
>   for (i = 0 ; i < ARRAY_SIZE; i++) {
>     z[i] = x[i] + scalar*y[i];
>   }
> }
>
> $ diff -u base.s mas.s
> --- base.s      2014-10-16 15:30:15.351430000 -0500
> +++ mas.s       2014-10-16 15:30:16.183035000 -0500
> @@ -2,12 +2,14 @@
>       add     x1, x2, 800
>       ldr     q0, [x0, x2]
>       add     x3, x2, 1600
> +     cmp     x0, 784
>       ldr     q1, [x0, x1]
> +     add     x1, x0, 16
>       fmla    v0.2d, v1.2d, v2.2d
>       str     q0, [x0, x3]
> -     add     x0, x0, 16
> -     cmp     x0, 800
> +     mov     x0, x1
>       bne     .L140
>  .LBE179:
> -     subs    w4, w4, #1
> +     cmp     w4, 1
> +     sub     w4, w4, #1
>       bne     .L139

I don't understand AArch64 assembly very well, but the above looks
like RTL issues and/or IVOPTs issues?

Thanks for doing performance measurements.

Richard.
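
The quoted saxpy kernel does not compile on its own, because the thread
never shows how ARRAY_SIZE and scalar are defined.  A minimal
self-contained sketch for reproducing the comparison could look like the
following.  Both definitions are assumptions: the value 100 is only
inferred from the 800-byte loop bound in the quoted assembly (100
doubles of 8 bytes each), and the value of scalar is arbitrary.

```c
/* Sketch of a standalone reproducer for the quoted saxpy kernel.
   ARRAY_SIZE and scalar are NOT defined anywhere in the thread;
   both definitions below are assumptions made for illustration. */

#define ARRAY_SIZE 100        /* assumption: inferred from "cmp x0, 800" */

double scalar = 3.0;          /* assumption: hypothetical value */

/* Same loop body as in the quoted message: z = x + scalar * y. */
void saxpy(double *x, double *y, double *z) {
  int i;
  for (i = 0; i < ARRAY_SIZE; i++) {
    z[i] = x[i] + scalar * y[i];
  }
}
```

Compiling this file with -O3 -S using both a trunk compiler and a
branch compiler, then diffing the two assembly outputs, should give a
comparison of the same kind as the one quoted above.  The exact
instructions will likely differ, since the quoted diff appears to come
from a larger benchmark where the kernel was inlined into an outer loop.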