public inbox for gcc-patches@gcc.gnu.org
From: Xingxing Pan <xxingpan@marvell.com>
To: "ramrad01@arm.com" <ramrad01@arm.com>
Cc: Julian Brown <julian@codesourcery.com>,
    James Greenhalgh <james.greenhalgh@arm.com>,
    Kyrill Tkachov <kyrylo.tkachov@arm.com>,
    Ramana Radhakrishnan <ramana.radhakrishnan@arm.com>,
    Richard Earnshaw <richard.earnshaw@arm.com>,
    "nickc@redhat.com" <nickc@redhat.com>,
    Xinyu Qi <xyqi@marvell.com>,
    Liping Gao <lgao1@marvell.com>, Joey Ye <joey.ye@arm.com>,
    "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] [ARM] Fix widen-sum pattern in neon.md.
Date: Mon, 20 Apr 2015 06:05:00 -0000
Message-ID: <553496BA.2020006@marvell.com>
In-Reply-To: <CAJA7tRZBJMaHdHub+TqYPo9VzwpA9y-TjEGXSGPvbiLNK-1Xgw@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 2070 bytes --]

On 04/15/2015 03:13 AM, Ramana Radhakrishnan wrote:
> On Thu, Mar 5, 2015 at 1:34 PM, Xingxing Pan <xxingpan@marvell.com> wrote:
>> Hi,
>>
>> The expanding of widen-sum pattern always fails. The vectorizer expects the
>> operands to have the same size, while the current implementation of
>> widen-sum pattern does not conform to this.
>>
>> This patch implements the widen-sum pattern with vpadal. Change the vaddw
>> pattern to anonymous. Add widen-sum test cases for neon.
>>
>
> Can you please respin addressing James and Kyrill's comments ?
>
>
> Ramana
>
>> --
>> Regards,
>> Xingxing

Hi,

Sorry for the late response.

The pattern is rewritten to utilize neon_vpadal<sup><mode>'s "0"
constraints. I have run vect.exp and neon.exp on an ARMv7 board.
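
To illustrate why vpadal fits here, below is a scalar C sketch of my
understanding of vpadal.s8 on D registers. It is not part of the patch,
and model_vpadal_s8 and check_model_vpadal_s8 are names made up for this
example. The accumulator and the result have the same vector size, and
each adjacent pair of narrow lanes is added into one wide lane:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vpadal.s8 on D registers (illustration
   only): each 16-bit accumulator lane receives the sum of one adjacent
   pair of sign-extended 8-bit input lanes.  */
static void
model_vpadal_s8 (int16_t acc[4], const int8_t in[8])
{
  for (int i = 0; i < 4; i++)
    acc[i] = (int16_t) (acc[i] + in[2 * i] + in[2 * i + 1]);
}

/* Small self-check of the model with hand-computed lane values.  */
static void
check_model_vpadal_s8 (void)
{
  int16_t acc[4] = { 10, 20, 30, 40 };
  const int8_t in[8] = { 1, 2, 3, 4, -5, -6, 7, 8 };

  model_vpadal_s8 (acc, in);

  assert (acc[0] == 13);   /* 10 + 1 + 2 */
  assert (acc[1] == 27);   /* 20 + 3 + 4 */
  assert (acc[2] == 19);   /* 30 - 5 - 6 */
  assert (acc[3] == 55);   /* 40 + 7 + 8 */
}
```

The lanes end up paired rather than in source order, which I believe is
acceptable for a reduction since only the total over all lanes matters.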

vect.exp has two new XFAILs:
XFAIL: gcc.dg/vect/slp-reduc-3.c scan-tree-dump-times vect "vectorizing 
stmts using SLP" 1
XFAIL: gcc.dg/vect/slp-reduc-3.c -flto -ffat-lto-objects 
scan-tree-dump-times vect "vectorizing stmts using SLP" 1

This is because the widen-sum optimization precedes SLP. The xfail predicate
vect_widen_sum_hi_to_si becomes true when widen-sum is enabled.

neon.exp has four new XFAILs:
XFAIL: gcc.target/arm/neon/vect-widen-sum-char2short-s-d.c 
scan-tree-dump-times vect "pattern recognized.*w\\+" 1
XFAIL: gcc.target/arm/neon/vect-widen-sum-char2short-s-d.c 
scan-rtl-dump-times expand "UNSPEC_VPADAL" 1
XFAIL: gcc.target/arm/neon/vect-widen-sum-char2short-s.c 
scan-tree-dump-times vect "pattern recognized.*w\\+" 1
XFAIL: gcc.target/arm/neon/vect-widen-sum-char2short-s.c 
scan-rtl-dump-times expand "UNSPEC_VPADAL" 1

If the widen-sum pattern is successfully expanded, "w+" and
"UNSPEC_VPADAL" should appear in the dump file, as in the other
vect-widen-sum-*.c tests. But vect-widen-sum-char2short-s[-d].c is
special because at tree level the signed operations are converted
into unsigned operations, which destroys the widen-sum pattern. That is
due to the workaround for PR tree-optimization/25125. I have simply added
xfail, following gcc.dg/vect/vect-reduc-pattern-2c.c.
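
As a rough illustration of that conversion, here is a C sketch of my
understanding of the shape the reduction takes. It is not taken from an
actual dump, and sum_signed and sum_via_unsigned are made-up names. The
second function adds in unsigned short and converts back, so the
accumulation no longer looks like a plain signed widening sum:

```c
#include <assert.h>

/* What the test source writes: chars summed into a signed short.  */
static short
sum_signed (const signed char *data, int n)
{
  short sum = 0;
  for (int i = 0; i < n; i++)
    sum += data[i];
  return sum;
}

/* Assumed shape after the conversion: the addition is carried out in
   unsigned short and the result is converted back to short.  The
   conversion back relies on modulo reduction, which GCC documents for
   out-of-range integer conversions.  */
static short
sum_via_unsigned (const signed char *data, int n)
{
  short sum = 0;
  for (int i = 0; i < n; i++)
    sum = (short) (unsigned short) ((unsigned short) sum
                                    + (unsigned short) data[i]);
  return sum;
}

/* Both forms compute the same value; only the form differs.  */
static void
check_sums_agree (void)
{
  const signed char data[4] = { -3, 5, -7, 100 };

  assert (sum_signed (data, 4) == 95);
  assert (sum_via_unsigned (data, 4) == 95);
}
```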


-- 
Regards,
Xingxing

[-- Attachment #2: fix-widen-sum.patch --]
[-- Type: text/x-patch, Size: 17736 bytes --]

commit c44b5bd19efb029b8bbd4e3c7e2d631bdc482b7c
Author: Xingxing Pan <xxingpan@marvell.com>
Date:   Sun Apr 19 15:54:43 2015 +0800

    Fix widen-sum pattern in neon.md.
    
    gcc/
    
    2015-04-19  Xingxing Pan  <xxingpan@marvell.com>
    
        * config/arm/iterators.md (VWSD): New.
          (V_widen_sum_d): New.
        * config/arm/neon.md (widen_ssum<mode>3): Redefined.
        (widen_usum<mode>3): Ditto.
        (neon_svaddw<mode>3): New anonymous define_insn.
        (neon_uvaddw<mode>3): Ditto.
    
    gcc/testsuite/
    
    2015-04-19  Xingxing Pan  <xxingpan@marvell.com>
    
        * gcc.target/arm/neon/vect-widen-sum-char2short-s-d.c: New.
        * gcc.target/arm/neon/vect-widen-sum-char2short-s.c: New.
        * gcc.target/arm/neon/vect-widen-sum-char2short-u-d.c: New.
        * gcc.target/arm/neon/vect-widen-sum-char2short-u.c: New.
        * gcc.target/arm/neon/vect-widen-sum-short2int-s-d.c: New.
        * gcc.target/arm/neon/vect-widen-sum-short2int-s.c: New.
        * gcc.target/arm/neon/vect-widen-sum-short2int-u-d.c: New.
        * gcc.target/arm/neon/vect-widen-sum-short2int-u.c: New.
        * lib/target-supports.exp
        (check_effective_target_vect_widen_sum_hi_to_si_pattern): Return 1 for ARM NEON.
        (check_effective_target_vect_widen_sum_hi_to_si): Ditto.
        (check_effective_target_vect_widen_sum_qi_to_hi): Ditto.

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index f7f8ab7..f73278d 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -95,6 +95,9 @@
 ;; Widenable modes.
 (define_mode_iterator VW [V8QI V4HI V2SI])
 
+;; Widenable modes.  Used by widen sum.
+(define_mode_iterator VWSD [V8QI V4HI V16QI V8HI])
+
 ;; Narrowable modes.
 (define_mode_iterator VN [V8HI V4SI V2DI])
 
@@ -555,9 +558,14 @@
 ;; Same as V_widen, but lower-case.
 (define_mode_attr V_widen_l [(V8QI "v8hi") (V4HI "v4si") ( V2SI "v2di")])
 
-;; Widen. Result is half the number of elements, but widened to double-width.
+;; Widen.  Result is half the number of elements, but widened to double-width.
 (define_mode_attr V_unpack   [(V16QI "V8HI") (V8HI "V4SI") (V4SI "V2DI")])
 
+;; Widen.  Result is half the number of elements, but widened to double-width.
+;; Used by widen sum.
+(define_mode_attr V_widen_sum_d [(V8QI "V4HI") (V4HI "V2SI")
+                                 (V16QI "V8HI") (V8HI "V4SI")])
+
 ;; Conditions to be used in extend<mode>di patterns.
 (define_mode_attr qhs_zextenddi_cond [(SI "") (HI "&& arm_arch6") (QI "")])
 (define_mode_attr qhs_sextenddi_cond [(SI "") (HI "&& arm_arch6")
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 63c327e..839883f 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -1174,7 +1174,29 @@
 
 ;; Widening operations
 
-(define_insn "widen_ssum<mode>3"
+(define_expand "widen_usum<mode>3"
+ [(match_operand:<V_widen_sum_d> 0 "s_register_operand" "")
+  (match_operand:VWSD 1 "s_register_operand" "")
+  (match_operand:<V_widen_sum_d> 2 "s_register_operand" "")]
+  "TARGET_NEON"
+  {
+    emit_insn (gen_neon_vpadalu<mode> (operands[0], operands[2], operands[1]));
+    DONE;
+  }
+)
+
+(define_expand "widen_ssum<mode>3"
+ [(match_operand:<V_widen_sum_d> 0 "s_register_operand" "")
+  (match_operand:VWSD 1 "s_register_operand" "")
+  (match_operand:<V_widen_sum_d> 2 "s_register_operand" "")]
+  "TARGET_NEON"
+  {
+    emit_insn (gen_neon_vpadals<mode> (operands[0], operands[2], operands[1]));
+    DONE;
+  }
+)
+
+(define_insn "*neon_svaddw<mode>3"
   [(set (match_operand:<V_widen> 0 "s_register_operand" "=w")
 	(plus:<V_widen> (sign_extend:<V_widen>
 			  (match_operand:VW 1 "s_register_operand" "%w"))
@@ -1184,7 +1206,7 @@
   [(set_attr "type" "neon_add_widen")]
 )
 
-(define_insn "widen_usum<mode>3"
+(define_insn "*neon_uvaddw<mode>3"
   [(set (match_operand:<V_widen> 0 "s_register_operand" "=w")
 	(plus:<V_widen> (zero_extend:<V_widen>
 			  (match_operand:VW 1 "s_register_operand" "%w"))
diff --git a/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-char2short-s-d.c b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-char2short-s-d.c
new file mode 100644
index 0000000..8d0278c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-char2short-s-d.c
@@ -0,0 +1,63 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_neon_hw } */
+/* { dg-options "-O2 -ffast-math -ftree-vectorize -mvectorize-with-neon-double -fdump-tree-vect-details -fdump-rtl-expand" } */
+/* { dg-add-options arm_neon } */
+
+/* { dg-final { scan-tree-dump-times "pattern recognized.*w\\\+" 1 "vect" { xfail *-*-* } } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
+/* { dg-final { scan-rtl-dump-times "UNSPEC_VPADAL" 1 "expand" { xfail *-*-* } } } */
+/* { dg-final { cleanup-rtl-dump "expand" } } */
+
+typedef signed char STYPE1;
+typedef signed short STYPE2;
+extern void abort (void);
+
+#define N 128
+STYPE1 sdata[N];
+
+volatile int y = 0;
+
+__attribute__ ((noinline)) int
+ssum ()
+{
+  int i;
+  STYPE2 sum = 0;
+  STYPE2 check_sum = 0;
+
+  /* widening sum: sum chars into short.
+
+     Like gcc.dg/vect/vect-reduc-pattern-2c.c, the widening-summation pattern
+     is currently not detected because of this patch:
+
+     2005-12-26  Kazu Hirata  <kazu@codesourcery.com>
+        PR tree-optimization/25125
+   */
+
+  for (i = 0; i < N; i++)
+    {
+      sdata[i] = i*2;
+      check_sum += sdata[i];
+      /* Avoid vectorization.  */
+      if (y)
+        abort ();
+    }
+
+  /* widening sum: sum chars into short.  */
+  for (i = 0; i < N; i++)
+    {
+      sum += sdata[i];
+    }
+
+  /* check results:  */
+  if (sum != check_sum)
+    abort ();
+
+  return 0;
+}
+
+int
+main (void)
+{
+  ssum ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-char2short-s.c b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-char2short-s.c
new file mode 100644
index 0000000..f7384c3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-char2short-s.c
@@ -0,0 +1,63 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_neon_hw } */
+/* { dg-options "-O2 -ffast-math -ftree-vectorize -fdump-tree-vect-details -fdump-rtl-expand" } */
+/* { dg-add-options arm_neon } */
+
+/* { dg-final { scan-tree-dump-times "pattern recognized.*w\\\+" 1 "vect" { xfail *-*-* } } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
+/* { dg-final { scan-rtl-dump-times "UNSPEC_VPADAL" 1 "expand" { xfail *-*-* } } } */
+/* { dg-final { cleanup-rtl-dump "expand" } } */
+
+typedef signed char STYPE1;
+typedef signed short STYPE2;
+extern void abort (void);
+
+#define N 128
+STYPE1 sdata[N];
+
+volatile int y = 0;
+
+__attribute__ ((noinline)) int
+ssum ()
+{
+  int i;
+  STYPE2 sum = 0;
+  STYPE2 check_sum = 0;
+
+  /* widening sum: sum chars into short.
+
+     Like gcc.dg/vect/vect-reduc-pattern-2c.c, the widening-summation pattern
+     is currently not detected because of this patch:
+
+     2005-12-26  Kazu Hirata  <kazu@codesourcery.com>
+        PR tree-optimization/25125
+   */
+
+  for (i = 0; i < N; i++)
+    {
+      sdata[i] = i*2;
+      check_sum += sdata[i];
+      /* Avoid vectorization.  */
+      if (y)
+        abort ();
+    }
+
+  /* widening sum: sum chars into short.  */
+  for (i = 0; i < N; i++)
+    {
+      sum += sdata[i];
+    }
+
+  /* check results:  */
+  if (sum != check_sum)
+    abort ();
+
+  return 0;
+}
+
+int
+main (void)
+{
+  ssum ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-char2short-u-d.c b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-char2short-u-d.c
new file mode 100644
index 0000000..35f8fa7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-char2short-u-d.c
@@ -0,0 +1,54 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_neon_hw } */
+/* { dg-options "-O2 -ffast-math -ftree-vectorize -mvectorize-with-neon-double -fdump-tree-vect-details -fdump-rtl-expand" } */
+/* { dg-add-options arm_neon } */
+
+/* { dg-final { scan-tree-dump-times "pattern recognized.*w\\\+" 1 "vect" { target { arm_neon } } } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
+/* { dg-final { scan-rtl-dump-times "UNSPEC_VPADAL" 1 "expand" { target { arm_neon } } } } */
+/* { dg-final { cleanup-rtl-dump "expand" } } */
+
+typedef unsigned char UTYPE1;
+typedef unsigned short UTYPE2;
+extern void abort (void);
+
+#define N 128
+UTYPE1 udata[N];
+
+volatile int y = 0;
+
+__attribute__ ((noinline)) int
+usum ()
+{
+  int i;
+  UTYPE2 sum = 0;
+  UTYPE2 check_sum = 0;
+
+  for (i = 0; i < N; i++)
+    {
+      udata[i] = i*2;
+      check_sum += udata[i];
+      /* Avoid vectorization.  */
+      if (y)
+        abort ();
+    }
+
+  /* widening sum: sum chars into short.  */
+  for (i = 0; i < N; i++)
+    {
+      sum += udata[i];
+    }
+
+  /* check results:  */
+  if (sum != check_sum)
+    abort ();
+
+  return 0;
+}
+
+int
+main (void)
+{
+  usum ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-char2short-u.c b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-char2short-u.c
new file mode 100644
index 0000000..38af5f0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-char2short-u.c
@@ -0,0 +1,54 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_neon_hw } */
+/* { dg-options "-O2 -ffast-math -ftree-vectorize -fdump-tree-vect-details -fdump-rtl-expand" } */
+/* { dg-add-options arm_neon } */
+
+/* { dg-final { scan-tree-dump-times "pattern recognized.*w\\\+" 1 "vect" { target { arm_neon } } } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
+/* { dg-final { scan-rtl-dump-times "UNSPEC_VPADAL" 1 "expand" { target { arm_neon } } } } */
+/* { dg-final { cleanup-rtl-dump "expand" } } */
+
+typedef unsigned char UTYPE1;
+typedef unsigned short UTYPE2;
+extern void abort (void);
+
+#define N 128
+UTYPE1 udata[N];
+
+volatile int y = 0;
+
+__attribute__ ((noinline)) int
+usum ()
+{
+  int i;
+  UTYPE2 sum = 0;
+  UTYPE2 check_sum = 0;
+
+  for (i = 0; i < N; i++)
+    {
+      udata[i] = i*2;
+      check_sum += udata[i];
+      /* Avoid vectorization.  */
+      if (y)
+        abort ();
+    }
+
+  /* widening sum: sum chars into short.  */
+  for (i = 0; i < N; i++)
+    {
+      sum += udata[i];
+    }
+
+  /* check results:  */
+  if (sum != check_sum)
+    abort ();
+
+  return 0;
+}
+
+int
+main (void)
+{
+  usum ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-short2int-s-d.c b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-short2int-s-d.c
new file mode 100644
index 0000000..ef765de
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-short2int-s-d.c
@@ -0,0 +1,54 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_neon_hw } */
+/* { dg-options "-O2 -ffast-math -ftree-vectorize -mvectorize-with-neon-double -fdump-tree-vect-details -fdump-rtl-expand" } */
+/* { dg-add-options arm_neon } */
+
+/* { dg-final { scan-tree-dump-times "pattern recognized.*w\\\+" 1 "vect" { target { arm_neon } } } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
+/* { dg-final { scan-rtl-dump-times "UNSPEC_VPADAL" 1 "expand" { target arm_neon } } } */
+/* { dg-final { cleanup-rtl-dump "expand" } } */
+
+typedef signed short STYPE1;
+typedef signed int STYPE2;
+extern void abort (void);
+
+#define N 128
+STYPE1 sdata[N];
+
+volatile int y = 0;
+
+__attribute__ ((noinline)) int
+ssum ()
+{
+  int i;
+  STYPE2 sum = 0;
+  STYPE2 check_sum = 0;
+
+  for (i = 0; i < N; i++)
+    {
+      sdata[i] = i*2;
+      check_sum += sdata[i];
+      /* Avoid vectorization.  */
+      if (y)
+        abort ();
+    }
+
+  /* widening sum: sum shorts into int.  */
+  for (i = 0; i < N; i++)
+    {
+      sum += sdata[i];
+    }
+
+  /* check results:  */
+  if (sum != check_sum)
+    abort ();
+
+  return 0;
+}
+
+int
+main (void)
+{
+  ssum ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-short2int-s.c b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-short2int-s.c
new file mode 100644
index 0000000..fb38d56
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-short2int-s.c
@@ -0,0 +1,54 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_neon_hw } */
+/* { dg-options "-O2 -ffast-math -ftree-vectorize -fdump-tree-vect-details -fdump-rtl-expand" } */
+/* { dg-add-options arm_neon } */
+
+/* { dg-final { scan-tree-dump-times "pattern recognized.*w\\\+" 1 "vect" { target { arm_neon } } } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
+/* { dg-final { scan-rtl-dump-times "UNSPEC_VPADAL" 1 "expand" { target { arm_neon } } } } */
+/* { dg-final { cleanup-rtl-dump "expand" } } */
+
+typedef signed short STYPE1;
+typedef signed int STYPE2;
+extern void abort (void);
+
+#define N 128
+STYPE1 sdata[N];
+
+volatile int y = 0;
+
+__attribute__ ((noinline)) int
+ssum ()
+{
+  int i;
+  STYPE2 sum = 0;
+  STYPE2 check_sum = 0;
+
+  for (i = 0; i < N; i++)
+    {
+      sdata[i] = i*2;
+      check_sum += sdata[i];
+      /* Avoid vectorization.  */
+      if (y)
+        abort ();
+    }
+
+  /* widening sum: sum shorts into int.  */
+  for (i = 0; i < N; i++)
+    {
+      sum += sdata[i];
+    }
+
+  /* check results:  */
+  if (sum != check_sum)
+    abort ();
+
+  return 0;
+}
+
+int
+main (void)
+{
+  ssum ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-short2int-u-d.c b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-short2int-u-d.c
new file mode 100644
index 0000000..5a3dfd6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-short2int-u-d.c
@@ -0,0 +1,54 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_neon_hw } */
+/* { dg-options "-O2 -ffast-math -ftree-vectorize -mvectorize-with-neon-double -fdump-tree-vect-details -fdump-rtl-expand" } */
+/* { dg-add-options arm_neon } */
+
+/* { dg-final { scan-tree-dump-times "pattern recognized.*w\\\+" 1 "vect" { target { arm_neon } } } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
+/* { dg-final { scan-rtl-dump-times "UNSPEC_VPADAL" 1 "expand" { target { arm_neon } } } } */
+/* { dg-final { cleanup-rtl-dump "expand" } } */
+
+typedef unsigned short UTYPE1;
+typedef unsigned int UTYPE2;
+extern void abort (void);
+
+#define N 128
+UTYPE1 udata[N];
+
+volatile int y = 0;
+
+__attribute__ ((noinline)) int
+usum ()
+{
+  int i;
+  UTYPE2 sum = 0;
+  UTYPE2 check_sum = 0;
+
+  for (i = 0; i < N; i++)
+    {
+      udata[i] = i*2;
+      check_sum += udata[i];
+      /* Avoid vectorization.  */
+      if (y)
+        abort ();
+    }
+
+  /* widening sum: sum shorts into int.  */
+  for (i = 0; i < N; i++)
+    {
+      sum += udata[i];
+    }
+
+  /* check results:  */
+  if (sum != check_sum)
+    abort ();
+
+  return 0;
+}
+
+int
+main (void)
+{
+  usum ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-short2int-u.c b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-short2int-u.c
new file mode 100644
index 0000000..770b08d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon/vect-widen-sum-short2int-u.c
@@ -0,0 +1,54 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_neon_hw } */
+/* { dg-options "-O2 -ffast-math -ftree-vectorize -fdump-tree-vect-details -fdump-rtl-expand" } */
+/* { dg-add-options arm_neon } */
+
+/* { dg-final { scan-tree-dump-times "pattern recognized.*w\\\+" 1 "vect" { target { arm_neon } } } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
+/* { dg-final { scan-rtl-dump-times "UNSPEC_VPADAL" 1 "expand" { target { arm_neon } } } } */
+/* { dg-final { cleanup-rtl-dump "expand" } } */
+
+typedef unsigned short UTYPE1;
+typedef unsigned int UTYPE2;
+extern void abort (void);
+
+#define N 128
+UTYPE1 udata[N];
+
+volatile int y = 0;
+
+__attribute__ ((noinline)) int
+usum ()
+{
+  int i;
+  UTYPE2 sum = 0;
+  UTYPE2 check_sum = 0;
+
+  for (i = 0; i < N; i++)
+    {
+      udata[i] = i*2;
+      check_sum += udata[i];
+      /* Avoid vectorization.  */
+      if (y)
+        abort ();
+    }
+
+  /* widening sum: sum shorts into int.  */
+  for (i = 0; i < N; i++)
+    {
+      sum += udata[i];
+    }
+
+  /* check results:  */
+  if (sum != check_sum)
+    abort ();
+
+  return 0;
+}
+
+int
+main (void)
+{
+  usum ();
+  return 0;
+}
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index f632d00..477ab53 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -3795,6 +3795,7 @@ proc check_effective_target_vect_widen_sum_hi_to_si_pattern { } {
     } else {
         set et_vect_widen_sum_hi_to_si_pattern_saved 0
         if { [istarget powerpc*-*-*]
+             || ([istarget arm*-*-*] && [check_effective_target_arm_neon_ok])
              || [istarget ia64-*-*] } {
             set et_vect_widen_sum_hi_to_si_pattern_saved 1
         }
@@ -3818,7 +3819,8 @@ proc check_effective_target_vect_widen_sum_hi_to_si { } {
     } else {
         set et_vect_widen_sum_hi_to_si_saved [check_effective_target_vect_unpack]
         if { [istarget powerpc*-*-*] 
-	     || [istarget ia64-*-*] } {
+             || ([istarget arm*-*-*] && [check_effective_target_arm_neon_ok])
+             || [istarget ia64-*-*] } {
             set et_vect_widen_sum_hi_to_si_saved 1
         }
     }
@@ -3841,7 +3843,7 @@ proc check_effective_target_vect_widen_sum_qi_to_hi { } {
     } else {
         set et_vect_widen_sum_qi_to_hi_saved 0
 	if { [check_effective_target_vect_unpack] 
-	     || [check_effective_target_arm_neon_ok]
+	     || ([istarget arm*-*-*] && [check_effective_target_arm_neon_ok])
 	     || [istarget ia64-*-*] } {
             set et_vect_widen_sum_qi_to_hi_saved 1
 	}

Thread overview: 5+ messages
2015-03-05 13:35 Xingxing Pan
2015-03-05 13:55 ` Kyrill Tkachov
2015-03-05 14:16 ` James Greenhalgh
2015-04-14 19:13 ` Ramana Radhakrishnan
2015-04-20  6:05   ` Xingxing Pan [this message]
