From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 77507 invoked by alias); 15 Apr 2015 10:14:17 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 77498 invoked by uid 89); 15 Apr 2015 10:14:16 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00,SPF_PASS autolearn=ham version=3.3.2 X-HELO: eu-smtp-delivery-143.mimecast.com Received: from eu-smtp-delivery-143.mimecast.com (HELO eu-smtp-delivery-143.mimecast.com) (207.82.80.143) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 15 Apr 2015 10:14:15 +0000 Received: from cam-owa1.Emea.Arm.com (fw-tnat.cambridge.arm.com [217.140.96.140]) by uk-mta-4.uk.mimecast.lan; Wed, 15 Apr 2015 11:14:12 +0100 Received: from [10.2.207.50] ([10.1.2.79]) by cam-owa1.Emea.Arm.com with Microsoft SMTPSVC(6.0.3790.3959); Wed, 15 Apr 2015 11:14:11 +0100 Message-ID: <552E39F3.5030102@arm.com> Date: Wed, 15 Apr 2015 10:14:00 -0000 From: Kyrill Tkachov User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0 MIME-Version: 1.0 To: James Greenhalgh , Kugan CC: "gcc-patches@gcc.gnu.org" , Marcus Shawcroft , Richard Earnshaw , Jim Wilson Subject: Re: [AArch64][PR65375] Fix RTX cost for vector SET References: <55066BCC.4010900@linaro.org> <5506AA24.3050108@arm.com> <5506CD7A.7030109@linaro.org> <5506D77B.5060909@linaro.org> <55070972.3000800@arm.com> <5507813E.3060106@linaro.org> <5513B390.2030201@linaro.org> <552D8FF7.5000105@linaro.org> <20150415092509.GA20852@arm.com> In-Reply-To: <20150415092509.GA20852@arm.com> X-MC-Unique: Cdkw5o8URT6mZSeFPrmIXw-1 Content-Type: text/plain; charset=WINDOWS-1252; format=flowed Content-Transfer-Encoding: quoted-printable X-IsSubscribed: yes X-SW-Source: 2015-04/txt/msg00738.txt.bz2 On 15/04/15 10:25, James Greenhalgh wrote: > On Tue, Apr 14, 2015 at 11:08:55PM +0100, Kugan wrote: >> Now that Stage1 is open, is this OK for trunk. > Hi Kugan, > >> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c >> index cba3c1a..d6ad0af 100644 >> --- a/gcc/config/aarch64/aarch64.c >> +++ b/gcc/config/aarch64/aarch64.c >> @@ -5544,10 +5544,17 @@ aarch64_rtx_costs (rtx x, int code, int outer AT= TRIBUTE_UNUSED, >>=20=20=20 >> /* Fall through. */ >> case REG: >> + if (VECTOR_MODE_P (GET_MODE (op0)) && REG_P (op1)) >> + { >> + /* The cost is 1 per vector-register copied. */ >> + int n_minus_1 =3D (GET_MODE_SIZE (GET_MODE (op0)) - 1) >> + / GET_MODE_SIZE (V4SImode); >> + *cost =3D COSTS_N_INSNS (n_minus_1 + 1); >> + } >> /* const0_rtx is in general free, but we will use an >> instruction to set a register to 0. */ >> - if (REG_P (op1) || op1 =3D=3D const0_rtx) >> - { >> + else if (REG_P (op1) || op1 =3D=3D const0_rtx) >> + { >> /* The cost is 1 per register copied. */ >> int n_minus_1 =3D (GET_MODE_SIZE (GET_MODE (op0)) - 1) >> / UNITS_PER_WORD; > I would not have expected control flow to reach this point, as we have: > >> /* TODO: The cost infrastructure currently does not handle >> vector operations. Assume that all vector operations >> are equally expensive. */ >> if (VECTOR_MODE_P (mode)) >> { >> if (speed) >> *cost +=3D extra_cost->vect.alu; >> return true; >> } > But, I see that this check is broken for a set RTX (which has no mode). > So, your patch works, but only due to a bug in my original implementation. > This leaves the code with quite a messy design. > > There are two ways I see that we could clean things up, both of which > require some reworking of your patch. > > Either we remove my check above and teach the RTX costs how to properly > cost vector operations, or we fix my check to catch all vector RTX > and add the special cases for the small subset of things we understand > up there. > > The correct approach in the long term is to fix the RTX costs to correctly > understand vector operations, so I'd much prefer to see a patch along > these lines, though I appreciate that is a substantially more invasive > piece of work. Would we want to catch all vector RTXes in that check at the top and have special vector rtx handling there? (Perhaps even in a function of its own like aarch64_vector_rtx_costs?). Or do you think it would be cleaner to handle them in the aarch64_rtx_costs giant switch? Vector-specific RTX codes like vec_concat, vec_select would integrate cleanly, but handling other common rtxen could potentially be messy? Kyrill > > Thanks, > James >