From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-414922-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 122695 invoked by alias); 23 Nov 2015 01:24:26 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 122681 invoked by uid 89); 23 Nov 2015 01:24:25 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-2.6 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2
X-HELO: mail-ig0-f182.google.com
Received: from mail-ig0-f182.google.com (HELO mail-ig0-f182.google.com) (209.85.213.182) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Mon, 23 Nov 2015 01:24:24 +0000
Received: by igcmv3 with SMTP id mv3so20137504igc.0        for <gcc-patches@gcc.gnu.org>; Sun, 22 Nov 2015 17:24:21 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;        d=1e100.net; s=20130820;        h=x-gm-message-state:subject:to:references:cc:from:message-id:date         :user-agent:mime-version:in-reply-to:content-type         :content-transfer-encoding;        bh=9SOz/ZR+0np36J7Yt7UhEwCo97RuE0nDyKp+tzKmhC4=;        b=PYV73FV9t3X9WvlT8FoCieg6o6Ovxt8QDLOmXevCr1xWM+wDUXuRuIvHoQKMWylA5e         HV7pgXIJY+gq3+F5TzAiDDa/zuXvKGEN7vPV/uHSBMqFFdVkmU5mWdAp3LMGB/9H1jbr         KWpfSAMdFSpGgxiO1r5Cch2CAyR916owaygpVHGHfwRNIbokVS8ozZ0NQJRf5UPYoE8q         5J7n5vMOHsPFfKC4JHbDX4uV6X8IWDBA4fLEfGO1fHzv0PrerJ93n0IULot8qK7Y8wul         xRPJDrT2r+RFHBbFGNy1DmZrNTtH9ddu6RLOUMZunYoo9bcC3K8dI0MFg9QNEpLg4oiS         VePA==
X-Gm-Message-State: ALoCoQmjZ9jibua12f3Zl2NqvbneYt3CRT4+WqG1SnB3HKXmvwH4BDcphi86ntrGKvkj9H43Llws
X-Received: by 10.50.137.99 with SMTP id qh3mr9338912igb.27.1448241861810;        Sun, 22 Nov 2015 17:24:21 -0800 (PST)
Received: from [10.189.1.6] ([172.98.67.42])        by smtp.googlemail.com with ESMTPSA id rs2sm4236222igb.9.2015.11.22.17.24.20        (version=TLSv1/SSLv3 cipher=OTHER);        Sun, 22 Nov 2015 17:24:21 -0800 (PST)
Subject: Re: [Aarch64] Use vector wide add for mixed-mode adds
To: James Greenhalgh <james.greenhalgh@arm.com>
References: <56404283.5070503@linaro.org> <20151122154800.GC36475@arm.com>
Cc: gcc Patches <gcc-patches@gcc.gnu.org>, Richard Biener <richard.guenther@gmail.com>
From: Michael Collison <michael.collison@linaro.org>
Message-ID: <56526AC3.7050209@linaro.org>
Date: Mon, 23 Nov 2015 02:46:00 -0000
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0
MIME-Version: 1.0
In-Reply-To: <20151122154800.GC36475@arm.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-SW-Source: 2015-11/txt/msg02638.txt.bz2



On 11/22/2015 8:48 AM, James Greenhalgh wrote:
> On Sun, Nov 08, 2015 at 11:51:47PM -0700, Michael Collison wrote:
>> 2015-11-06  Michael Collison <Michael.Collison@linaro.org>
>>      * config/aarch64/aarch64-simd.md (widen_ssum, widen_usum)
>>      (aarch64_<ANY_EXTEND:su><ADDSUB:optab>w<mode>_internal): New patterns.
>>      * config/aarch64/iterators.md (Vhalf, VDBLW): New mode attributes.
>>      * gcc.target/aarch64/saddw-1.c: New test.
>>      * gcc.target/aarch64/saddw-2.c: New test.
>>      * gcc.target/aarch64/uaddw-1.c: New test.
>>      * gcc.target/aarch64/uaddw-2.c: New test.
>>      * gcc.target/aarch64/uaddw-3.c: New test.
>>      * lib/target-supports.exp
>>      (check_effective_target_vect_widen_sum_hi_to_si_pattern):
>>      Add aarch64 to list of supported targets.
>
> These hunks are all OK (with the minor style comments below applied).

Okay, I will update the patch to address your comments.
>
> As we understand what's happening here, let's take the regressions below
> for now and add AArch64 to the targets affected by pr68333.
>
>>      * gcc.dg/vect/slp-multitypes-4.c: Disable test for
>>      targets with widening adds from V8HI=>V4SI.
>>      * gcc.dg/vect/slp-multitypes-5.c: Ditto.
>>      * gcc.dg/vect/vect-125.c: Ditto.
> Let's leave these for now, while we wait for pr68333.

To clarify, would you like me to exclude these bits from the patch?

>
>> diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
>> index 65a2b6f..acb7cf0 100644
>> --- a/gcc/config/aarch64/aarch64-simd.md
>> +++ b/gcc/config/aarch64/aarch64-simd.md
>> @@ -2750,6 +2750,60 @@
>>   
>>   ;; <su><addsub>w<q>.
>>   
>> +(define_expand "widen_ssum<mode>3"
>> +  [(set (match_operand:<VDBLW> 0 "register_operand" "")
>> +	(plus:<VDBLW> (sign_extend:<VDBLW> (match_operand:VQW 1 "register_operand" ""))
> Split this line (more than 80 characters).
>
>> +		      (match_operand:<VDBLW> 2 "register_operand" "")))]
>> +  "TARGET_SIMD"
>> +  {
>> +    rtx p = aarch64_simd_vect_par_cnst_half (<MODE>mode, false);
>> +    rtx temp = gen_reg_rtx (GET_MODE (operands[0]));
>> +
>> +    emit_insn (gen_aarch64_saddw<mode>_internal (temp, operands[2],
>> +						operands[1], p));
>> +    emit_insn (gen_aarch64_saddw2<mode> (operands[0], temp, operands[1]));
>> +    DONE;
>> +  }
>> +)
>> +
>> +(define_expand "widen_ssum<mode>3"
>> +  [(set (match_operand:<VWIDE> 0 "register_operand" "")
>> +	(plus:<VWIDE> (sign_extend:<VWIDE>
>> +		       (match_operand:VD_BHSI 1 "register_operand" ""))
>> +		      (match_operand:<VWIDE> 2 "register_operand" "")))]
>> +  "TARGET_SIMD"
>> +{
>> +  emit_insn (gen_aarch64_saddw<mode> (operands[0], operands[2], operands[1]));
>> +  DONE;
>> +})
>> +
>> +(define_expand "widen_usum<mode>3"
>> +  [(set (match_operand:<VDBLW> 0 "register_operand" "")
>> +	(plus:<VDBLW> (zero_extend:<VDBLW> (match_operand:VQW 1 "register_operand" ""))
> Split this line (more than 80 characters).
>
>> +		      (match_operand:<VDBLW> 2 "register_operand" "")))]
>> +  "TARGET_SIMD"
>> +  {
>> +    rtx p = aarch64_simd_vect_par_cnst_half (<MODE>mode, false);
>> +    rtx temp = gen_reg_rtx (GET_MODE (operands[0]));
>> +
>> +    emit_insn (gen_aarch64_uaddw<mode>_internal (temp, operands[2],
>> +						 operands[1], p));
>> +    emit_insn (gen_aarch64_uaddw2<mode> (operands[0], temp, operands[1]));
>> +    DONE;
>> +  }
>> +)
>> +
>> +(define_expand "widen_usum<mode>3"
>> +  [(set (match_operand:<VWIDE> 0 "register_operand" "")
>> +	(plus:<VWIDE> (zero_extend:<VWIDE>
>> +		       (match_operand:VD_BHSI 1 "register_operand" ""))
>> +		      (match_operand:<VWIDE> 2 "register_operand" "")))]
>> +  "TARGET_SIMD"
>> +{
>> +  emit_insn (gen_aarch64_uaddw<mode> (operands[0], operands[2], operands[1]));
>> +  DONE;
>> +})
>> +
>>   (define_insn "aarch64_<ANY_EXTEND:su><ADDSUB:optab>w<mode>"
>>     [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
>>           (ADDSUB:<VWIDE> (match_operand:<VWIDE> 1 "register_operand" "w")
>> @@ -2760,6 +2814,18 @@
>>     [(set_attr "type" "neon_<ADDSUB:optab>_widen")]
>>   )
>>   
>> +(define_insn "aarch64_<ANY_EXTEND:su><ADDSUB:optab>w<mode>_internal"
>> +  [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
>> +        (ADDSUB:<VWIDE> (match_operand:<VWIDE> 1 "register_operand" "w")
>> +			(ANY_EXTEND:<VWIDE>
>> +			  (vec_select:<VHALF>
>> +			   (match_operand:VQW 2 "register_operand" "w")
>> +			   (match_operand:VQW 3 "vect_par_cnst_lo_half" "")))))]
>> +  "TARGET_SIMD"
>> +  "<ANY_EXTEND:su><ADDSUB:optab>w\\t%0.<Vwtype>, %1.<Vwtype>, %2.<Vhalftype>"
>> +  [(set_attr "type" "neon_<ADDSUB:optab>_widen")]
>> +)
>> +
>>   (define_insn "aarch64_<ANY_EXTEND:su><ADDSUB:optab>w2<mode>_internal"
>>     [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
>>           (ADDSUB:<VWIDE> (match_operand:<VWIDE> 1 "register_operand" "w")
>> diff --git a/gcc/testsuite/gcc.target/aarch64/saddw-1.c b/gcc/testsuite/gcc.target/aarch64/saddw-1.c
>> new file mode 100644
>> index 0000000..9db5d00
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/saddw-1.c
>> @@ -0,0 +1,20 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O3" } */
>> +
>> +
> Extra newline.
>
>> +int
>> +t6(int len, void * dummy, short * __restrict x)
>> +{
>> +  len = len & ~31;
>> +  int result = 0;
>> +  __asm volatile ("");
>> +  for (int i = 0; i < len; i++)
>> +    result += x[i];
>> +  return result;
>> +}
>> +
>> +/* { dg-final { scan-assembler "saddw" } } */
>> +/* { dg-final { scan-assembler "saddw2" } } */
>> +
>> +
>> +
> Trailing newlines.
>
>> diff --git a/gcc/testsuite/gcc.target/aarch64/saddw-2.c b/gcc/testsuite/gcc.target/aarch64/saddw-2.c
>> new file mode 100644
>> index 0000000..6f8c8fd
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/saddw-2.c
>> @@ -0,0 +1,18 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O3" } */
>> +
>> +int
>> +t6(int len, void * dummy, int * __restrict x)
>> +{
>> +  len = len & ~31;
>> +  long long result = 0;
>> +  __asm volatile ("");
>> +  for (int i = 0; i < len; i++)
>> +    result += x[i];
>> +  return result;
>> +}
>> +
>> +/* { dg-final { scan-assembler "saddw" } } */
>> +/* { dg-final { scan-assembler "saddw2" } } */
>> +
>> +
> Trailing newlines.
>
>> diff --git a/gcc/testsuite/gcc.target/aarch64/uaddw-1.c b/gcc/testsuite/gcc.target/aarch64/uaddw-1.c
>> new file mode 100644
>> index 0000000..e34574f
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/uaddw-1.c
>> @@ -0,0 +1,17 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O3" } */
>> +
>> +
> Extra newline.
>
>> +int
>> +t6(int len, void * dummy, unsigned short * __restrict x)
>> +{
>> +  len = len & ~31;
>> +  unsigned int result = 0;
>> +  __asm volatile ("");
>> +  for (int i = 0; i < len; i++)
>> +    result += x[i];
>> +  return result;
>> +}
>> +
>> +/* { dg-final { scan-assembler "uaddw" } } */
>> +/* { dg-final { scan-assembler "uaddw2" } } */
>> diff --git a/gcc/testsuite/gcc.target/aarch64/uaddw-3.c b/gcc/testsuite/gcc.target/aarch64/uaddw-3.c
>> new file mode 100644
>> index 0000000..04bc7c9
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/uaddw-3.c
>> @@ -0,0 +1,20 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O3" } */
>> +
> Extra newline.
>
>> +
>> +int
>> +t6(int len, void * dummy, char * __restrict x)
>> +{
>> +  len = len & ~31;
>> +  unsigned short result = 0;
>> +  __asm volatile ("");
>> +  for (int i = 0; i < len; i++)
>> +    result += x[i];
>> +  return result;
>> +}
>> +
>> +/* { dg-final { scan-assembler "uaddw" } } */
>> +/* { dg-final { scan-assembler "uaddw2" } } */
>> +
>> +
>> +
> Trailing newlines.
>
>> diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
>> index b543519..46f41a1 100644
>> --- a/gcc/testsuite/lib/target-supports.exp
>> +++ b/gcc/testsuite/lib/target-supports.exp
>> @@ -3943,6 +3943,7 @@ proc check_effective_target_vect_widen_sum_hi_to_si_pattern { } {
>>       } else {
>>           set et_vect_widen_sum_hi_to_si_pattern_saved 0
>>           if { [istarget powerpc*-*-*]
>> +              || [istarget aarch64*-*-*]
>>                || [istarget ia64-*-*] } {
> Either line ia64 up with aarch64, or line aarch64 up with ia64.
>
> Thanks,
> James
>
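
For reference, here is a rough scalar model of what the V8HI => V4SI
widen_ssum expansion quoted above is meant to compute: one SADDW on the low
half of the narrow vector, followed by one SADDW2 on the high half. The
function names are hypothetical and exist only for this illustration, and
the sketch assumes little-endian lane numbering (lanes 0..3 are the low
half). It is not code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of SADDW: add the sign-extended low-half lanes (0..3)
   of the narrow vector to the wide accumulator.  */
static void
model_saddw (int32_t out[4], const int32_t acc[4], const int16_t x[8])
{
  for (int i = 0; i < 4; i++)
    out[i] = acc[i] + (int32_t) x[i];
}

/* Scalar model of SADDW2: same operation on the high-half lanes (4..7).  */
static void
model_saddw2 (int32_t out[4], const int32_t acc[4], const int16_t x[8])
{
  for (int i = 0; i < 4; i++)
    out[i] = acc[i] + (int32_t) x[i + 4];
}

/* Model of the expander: temp = SADDW (acc, x); out = SADDW2 (temp, x).  */
static void
model_widen_ssum_v8hi (int32_t out[4], const int32_t acc[4],
                       const int16_t x[8])
{
  int32_t temp[4];

  model_saddw (temp, acc, x);
  model_saddw2 (out, temp, x);
}

/* Small self-check: the lane total must equal the accumulator total
   plus the sum of all eight narrow elements.  */
static void
model_self_check (void)
{
  const int32_t acc[4] = { 100, 200, 300, 400 };
  const int16_t x[8] = { 1, -2, 3, -4, 5, -6, 7, -8 };
  int32_t out[4];
  int32_t total = 0;

  model_widen_ssum_v8hi (out, acc, x);
  for (int i = 0; i < 4; i++)
    total += out[i];
  assert (total == 996);
}
```

Each output lane holds acc[i] + x[i] + x[i + 4], so the per-lane values
differ from a straight element-by-element sum, but the total after the
final reduction is the same.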