From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-395198-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 77507 invoked by alias); 15 Apr 2015 10:14:17 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 77498 invoked by uid 89); 15 Apr 2015 10:14:16 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00,SPF_PASS autolearn=ham version=3.3.2
X-HELO: eu-smtp-delivery-143.mimecast.com
Received: from eu-smtp-delivery-143.mimecast.com (HELO eu-smtp-delivery-143.mimecast.com) (207.82.80.143) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 15 Apr 2015 10:14:15 +0000
Received: from cam-owa1.Emea.Arm.com (fw-tnat.cambridge.arm.com [217.140.96.140]) by uk-mta-4.uk.mimecast.lan; Wed, 15 Apr 2015 11:14:12 +0100
Received: from [10.2.207.50] ([10.1.2.79]) by cam-owa1.Emea.Arm.com with Microsoft SMTPSVC(6.0.3790.3959);	 Wed, 15 Apr 2015 11:14:11 +0100
Message-ID: <552E39F3.5030102@arm.com>
Date: Wed, 15 Apr 2015 10:14:00 -0000
From: Kyrill Tkachov <kyrylo.tkachov@arm.com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0
MIME-Version: 1.0
To: James Greenhalgh <james.greenhalgh@arm.com>,  Kugan <kugan.vivekanandarajah@linaro.org>
CC: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,  Marcus Shawcroft <Marcus.Shawcroft@arm.com>, Richard Earnshaw <Richard.Earnshaw@arm.com>,  Jim Wilson <jim.wilson@linaro.org>
Subject: Re: [AArch64][PR65375] Fix RTX cost for vector SET
References: <55066BCC.4010900@linaro.org> <5506AA24.3050108@arm.com> <5506CD7A.7030109@linaro.org> <5506D77B.5060909@linaro.org> <55070972.3000800@arm.com> <5507813E.3060106@linaro.org> <5513B390.2030201@linaro.org> <552D8FF7.5000105@linaro.org> <20150415092509.GA20852@arm.com>
In-Reply-To: <20150415092509.GA20852@arm.com>
X-MC-Unique: Cdkw5o8URT6mZSeFPrmIXw-1
Content-Type: text/plain; charset=WINDOWS-1252; format=flowed
Content-Transfer-Encoding: quoted-printable
X-IsSubscribed: yes
X-SW-Source: 2015-04/txt/msg00738.txt.bz2


On 15/04/15 10:25, James Greenhalgh wrote:
> On Tue, Apr 14, 2015 at 11:08:55PM +0100, Kugan wrote:
>> Now that Stage1 is open, is this OK for trunk.
> Hi Kugan,
>
>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>> index cba3c1a..d6ad0af 100644
>> --- a/gcc/config/aarch64/aarch64.c
>> +++ b/gcc/config/aarch64/aarch64.c
>> @@ -5544,10 +5544,17 @@ aarch64_rtx_costs (rtx x, int code, int outer AT=
TRIBUTE_UNUSED,
>>=20=20=20
>>   	  /* Fall through.  */
>>   	case REG:
>> +	  if (VECTOR_MODE_P (GET_MODE (op0)) && REG_P (op1))
>> +	    {
>> +              /* The cost is 1 per vector-register copied.  */
>> +              int n_minus_1 =3D (GET_MODE_SIZE (GET_MODE (op0)) - 1)
>> +			      / GET_MODE_SIZE (V4SImode);
>> +              *cost =3D COSTS_N_INSNS (n_minus_1 + 1);
>> +	    }
>>   	  /* const0_rtx is in general free, but we will use an
>>   	     instruction to set a register to 0.  */
>> -          if (REG_P (op1) || op1 =3D=3D const0_rtx)
>> -            {
>> +	  else if (REG_P (op1) || op1 =3D=3D const0_rtx)
>> +	    {
>>                 /* The cost is 1 per register copied.  */
>>                 int n_minus_1 =3D (GET_MODE_SIZE (GET_MODE (op0)) - 1)
>>   			      / UNITS_PER_WORD;
> I would not have expected control flow to reach this point, as we have:
>
>>   /* TODO: The cost infrastructure currently does not handle
>>      vector operations.  Assume that all vector operations
>>      are equally expensive.  */
>>   if (VECTOR_MODE_P (mode))
>>     {
>>       if (speed)
>> 	*cost +=3D extra_cost->vect.alu;
>>       return true;
>>     }
> But, I see that this check is broken for a set RTX (which has no mode).
> So, your patch works, but only due to a bug in my original implementation.
> This leaves the code with quite a messy design.
>
> There are two ways I see that we could clean things up, both of which
> require some reworking of your patch.
>
> Either we remove my check above and teach the RTX costs how to properly
> cost vector operations, or we fix my check to catch all vector RTX
> and add the special cases for the small subset of things we understand
> up there.
>
> The correct approach in the long term is to fix the RTX costs to correctly
> understand vector operations, so I'd much prefer to see a patch along
> these lines, though I appreciate that is a substantially more invasive
> piece of work.


Would we want to catch all vector RTXes in that check at the top
and have special vector rtx handling there? (Perhaps even in a function
of its own like aarch64_vector_rtx_costs?). Or do you think it would
be cleaner to handle them in the aarch64_rtx_costs giant switch?
Vector-specific RTX codes like vec_concat, vec_select would integrate
cleanly, but handling other common rtxen could potentially be messy?

Kyrill

>
> Thanks,
> James
>