From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 2153)
	id C20183858023; Wed, 1 Sep 2021 11:41:56 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org C20183858023
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="utf-8"
From: Jakub Jelinek
To: gcc-cvs@gcc.gnu.org
Subject: [gcc r11-8948] vectorizer: Fix up vectorization using WIDEN_MINUS_EXPR [PR102124]
X-Act-Checkin: gcc
X-Git-Author: Jakub Jelinek
X-Git-Refname: refs/heads/releases/gcc-11
X-Git-Oldrev: 9929fe9e7c35b176e2dea72b99782eaa16c0509d
X-Git-Newrev: 051040f0642cfd002d31f655a70aef50e6f44d25
Message-Id: <20210901114156.C20183858023@sourceware.org>
Date: Wed, 1 Sep 2021 11:41:56 +0000 (GMT)
X-BeenThere: gcc-cvs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-cvs mailing list
X-List-Received-Date: Wed, 01 Sep 2021 11:41:56 -0000

https://gcc.gnu.org/g:051040f0642cfd002d31f655a70aef50e6f44d25

commit r11-8948-g051040f0642cfd002d31f655a70aef50e6f44d25
Author: Jakub Jelinek
Date:   Wed Sep 1 13:30:51 2021 +0200

    vectorizer: Fix up vectorization using WIDEN_MINUS_EXPR [PR102124]

    The following testcase is miscompiled on aarch64-linux at -O3 since the
    introduction of WIDEN_MINUS_EXPR.  The problem arises when the inner
    type (half_type) is unsigned and the result type in which the
    subtraction is performed (type) has precision more than twice as large
    as the inner type's precision.  For other widening operations like
    WIDEN_{PLUS,MULT}_EXPR, if half_type is unsigned, the
    addition/multiplication result in itype is also unsigned and needs to
    be zero-extended to type.  But subtraction is special: even when
    half_type is unsigned, the subtraction behaves as signed (regardless
    of whether the result type is signed or unsigned); 0xfeU - 0xffU is
    -1 or 0xffffffffU, not 0x0000ffff.
    I think it is better not to use mixed signedness of types in
    WIDEN_MINUS_EXPR (an unsigned vector of operands with a signed result
    vector), so this patch instead adds another cast to make sure we
    always sign-extend the result from itype to type if type is wider
    than itype.

    2021-09-01  Jakub Jelinek

            PR tree-optimization/102124
            * tree-vect-patterns.c (vect_recog_widen_op_pattern): For
            ORIG_CODE MINUS_EXPR, if itype is unsigned with smaller
            precision than type, add an extra cast to signed variant of
            itype to ensure sign-extension.

            * gcc.dg/torture/pr102124.c: New test.

    (cherry picked from commit bea07159d1d4c9a61c8f7097e9f88c2b206b1b2f)

Diff:
---
 gcc/testsuite/gcc.dg/torture/pr102124.c | 27 +++++++++++++++++++++++++++
 gcc/tree-vect-patterns.c                | 26 +++++++++++++++++++++++++-
 2 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/torture/pr102124.c b/gcc/testsuite/gcc.dg/torture/pr102124.c
new file mode 100644
index 00000000000..a158b4a60b6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr102124.c
@@ -0,0 +1,27 @@
+/* PR tree-optimization/102124 */
+
+int
+foo (const unsigned char *a, const unsigned char *b, unsigned long len)
+{
+  int ab, ba;
+  unsigned long i;
+  for (i = 0, ab = 0, ba = 0; i < len; i++)
+    {
+      ab |= a[i] - b[i];
+      ba |= b[i] - a[i];
+    }
+  return (ab | ba) >= 0;
+}
+
+int
+main ()
+{
+  unsigned char a[32] = { 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a' };
+  unsigned char b[32] = { 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a' };
+  unsigned char c[32] = { 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b' };
+  if (!foo (a, b, 16))
+    __builtin_abort ();
+  if (foo (a, c, 16))
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index 18398da584a..a0e5eae9540 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -1223,11 +1223,31 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
   /* Check target support  */
   tree vectype = get_vectype_for_scalar_type (vinfo, half_type);
   tree vecitype = get_vectype_for_scalar_type (vinfo, itype);
+  tree ctype = itype;
+  tree vecctype = vecitype;
+  if (orig_code == MINUS_EXPR
+      && TYPE_UNSIGNED (itype)
+      && TYPE_PRECISION (type) > TYPE_PRECISION (itype))
+    {
+      /* Subtraction is special, even if half_type is unsigned and no matter
+         whether type is signed or unsigned, if type is wider than itype,
+         we need to sign-extend from the widening operation result to the
+         result type.
+         Consider half_type unsigned char, operand 1 0xfe, operand 2 0xff,
+         itype unsigned short and type either int or unsigned int.
+         Widened (unsigned short) 0xfe - (unsigned short) 0xff is
+         (unsigned short) 0xffff, but for type int we want the result -1
+         and for type unsigned int 0xffffffff rather than 0xffff.  */
+      ctype = build_nonstandard_integer_type (TYPE_PRECISION (itype), 0);
+      vecctype = get_vectype_for_scalar_type (vinfo, ctype);
+    }
+
   enum tree_code dummy_code;
   int dummy_int;
   auto_vec<tree> dummy_vec;
   if (!vectype
       || !vecitype
+      || !vecctype
       || !supportable_widening_operation (vinfo, wide_code, last_stmt_info,
                                           vecitype, vectype,
                                           &dummy_code, &dummy_code,
@@ -1246,8 +1266,12 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
   gimple *pattern_stmt
     = gimple_build_assign (var, wide_code, oprnd[0], oprnd[1]);
 
+  if (vecctype != vecitype)
+    pattern_stmt = vect_convert_output (vinfo, last_stmt_info, ctype,
+                                        pattern_stmt, vecitype);
+
   return vect_convert_output (vinfo, last_stmt_info,
-                              type, pattern_stmt, vecitype);
+                              type, pattern_stmt, vecctype);
 }
 
 /* Try to detect multiplication on widened inputs, converting MULT_EXPR