VEC_COND_EXPR optimizations v2

From: Marc Glisse <marc.glisse@inria.fr>
To: gcc-patches@gcc.gnu.org
Subject: VEC_COND_EXPR optimizations v2
Date: Wed, 5 Aug 2020 15:32:32 +0200 (CEST)
Message-ID: <alpine.DEB.2.23.453.2008051443320.18411@stedding.saclay.inria.fr>
In-Reply-To: <alpine.DEB.2.23.453.2007291859410.6927@stedding.saclay.inria.fr>

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1641 bytes --]

New version that passed bootstrap+regtest during the night.

When vector comparisons were forced to use vec_cond_expr, we lost a number of 
optimizations (my fault for not adding enough testcases to prevent that). 
This patch tries to unwrap vec_cond_expr a bit so some optimizations can 
still happen.

I wasn't planning to add all those transformations together, but adding one 
caused a regression, whose fix introduced a second regression, etc.

Restricting to constant folding would not be sufficient: we also need at
least things like X|0 or X&X. The transformations are quite conservative,
using :s and folding only if everything simplifies; we may want to relax
this later. And of course we are still going to miss things like
(a?b:c) + (a?c:b) -> b+c.
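
To make the sinking rule concrete, here is a minimal per-lane sketch in
plain C. It is not the match.pd implementation: `sel` is a hypothetical
helper standing in for one lane of VEC_COND_EXPR, and all function names
are illustrative. It checks that (c ? a : b) | d equals
c ? (a | d) : (b | d), and that (a?b:c) + (a?c:b) is b+c for either mask
value.

```c
#include <stdint.h>

/* One lane of a VEC_COND_EXPR; the mask lane is assumed to be 0 or
   all-ones (-1).  Hypothetical helper, not a GCC internal.  */
static int64_t sel(int64_t mask, int64_t a, int64_t b)
{
    return mask ? a : b;
}

/* (c ? a : b) | d, before sinking the operation into the branches.  */
static int64_t ior_before(int64_t c, int64_t a, int64_t b, int64_t d)
{
    return sel(c, a, b) | d;
}

/* c ? (a | d) : (b | d), after sinking.  */
static int64_t ior_after(int64_t c, int64_t a, int64_t b, int64_t d)
{
    return sel(c, a | d, b | d);
}

/* (a ? b : c) + (a ? c : b): the same sum whichever way the mask goes.  */
static int64_t swapped_sum(int64_t a, int64_t b, int64_t c)
{
    return sel(a, b, c) + sel(a, c, b);
}

/* Returns 1 if both identities hold on a small grid of lane values.  */
static int check_sinking(void)
{
    static const int64_t masks[] = { 0, -1 };
    static const int64_t vals[] = { -7, 0, 1, 42 };

    for (int m = 0; m < 2; m++)
        for (int i = 0; i < 4; i++)
            for (int j = 0; j < 4; j++) {
                int64_t c = masks[m], a = vals[i], b = vals[j];
                /* d = 0 is the X|0 case: both branches fold to a leaf.  */
                if (ior_before(c, a, b, 0) != ior_after(c, a, b, 0))
                    return 0;
                if (swapped_sum(c, a, b) != a + b)
                    return 0;
            }
    return 1;
}
```

With d = 0 both sunk branches reduce to an existing operand, which is the
"everything simplifies" situation described above; the swapped-sum case
does not simplify branch by branch, so it stays out of reach.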

In terms of number of operations, some transformations that turn 2
VEC_COND_EXPR into VEC_COND_EXPR + BIT_IOR_EXPR + BIT_NOT_EXPR might not look
like a gain... I expect the bit_not to disappear in most cases, and
VEC_COND_EXPR looks more costly than a simpler BIT_IOR_EXPR.
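
The trade-off can be illustrated per lane with a small C sketch, assuming
mask lanes are always 0 or all-ones (-1). `sel` is a hypothetical stand-in
for one lane of VEC_COND_EXPR and the other names are illustrative too.
The functions mirror two of the rewrites from the patch:
(v ? 0 : w) ? a : b becoming (v | ~w) ? b : a, and
(v ? w : 0) ? a : b becoming (v & w) ? a : b.

```c
#include <stdint.h>

/* One lane of a VEC_COND_EXPR; hypothetical helper, not a GCC internal.  */
static int64_t sel(int64_t mask, int64_t a, int64_t b)
{
    return mask ? a : b;
}

/* (v ? 0 : w) ? a : b: two selects.  */
static int64_t zero_then_two_selects(int64_t v, int64_t w,
                                     int64_t a, int64_t b)
{
    return sel(sel(v, 0, w), a, b);
}

/* (v | ~w) ? b : a: one select plus ior and not, outer branches swapped.  */
static int64_t zero_then_one_select(int64_t v, int64_t w,
                                    int64_t a, int64_t b)
{
    return sel(v | ~w, b, a);
}

/* (v ? w : 0) ? a : b: two selects.  */
static int64_t zero_else_two_selects(int64_t v, int64_t w,
                                     int64_t a, int64_t b)
{
    return sel(sel(v, w, 0), a, b);
}

/* (v & w) ? a : b: one select plus a single and.  */
static int64_t zero_else_one_select(int64_t v, int64_t w,
                                    int64_t a, int64_t b)
{
    return sel(v & w, a, b);
}

/* Returns 1 if each rewritten form agrees with its original for every
   combination of the two mask lanes.  */
static int check_masks(void)
{
    static const int64_t masks[] = { 0, -1 };
    const int64_t a = 11, b = 22;

    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++) {
            int64_t v = masks[i], w = masks[j];
            if (zero_then_two_selects(v, w, a, b)
                != zero_then_one_select(v, w, a, b))
                return 0;
            if (zero_else_two_selects(v, w, a, b)
                != zero_else_one_select(v, w, a, b))
                return 0;
        }
    return 1;
}
```

When w is itself the result of a comparison, the ~w can presumably be
absorbed by inverting that comparison, which would leave one select and
one bitwise operation in place of two selects.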

I am a bit confused that with avx512 we get types like "vector(4) 
<signed-boolean:2>" with :2 and not :1 (is it a hack so true is 1 and not 
-1?), but that doesn't matter for this patch.

2020-08-05  Marc Glisse  <marc.glisse@inria.fr>

 	PR tree-optimization/95906
 	PR target/70314
 	* match.pd ((c ? a : b) op d, (c ? a : b) op (c ? d : e),
 	(v ? w : 0) ? a : b, c1 ? c2 ? a : b : b): New transformations.
 	(op (c ? a : b)): Update to match the new transformations.

 	* gcc.dg/tree-ssa/andnot-2.c: New file.
 	* gcc.dg/tree-ssa/pr95906.c: Likewise.
 	* gcc.target/i386/pr70314.c: Likewise.

-- 
Marc Glisse

[-- Attachment #2: Type: TEXT/x-diff; name=vec7.patch, Size: 4623 bytes --]

diff --git a/gcc/match.pd b/gcc/match.pd
index a052c9e3dbc..f9297fcadbe 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3436,20 +3436,66 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (if (integer_zerop (@0))
    @2)))
 
-/* Sink unary operations to constant branches, but only if we do fold it to
-   constants.  */
+#if GIMPLE
+/* Sink unary operations to branches, but only if we do fold both.  */
 (for op (negate bit_not abs absu)
  (simplify
-  (op (vec_cond @0 VECTOR_CST@1 VECTOR_CST@2))
-  (with
-   {
-     tree cst1, cst2;
-     cst1 = const_unop (op, type, @1);
-     if (cst1)
-       cst2 = const_unop (op, type, @2);
-   }
-   (if (cst1 && cst2)
-    (vec_cond @0 { cst1; } { cst2; })))))
+  (op (vec_cond:s @0 @1 @2))
+  (vec_cond @0 (op! @1) (op! @2))))
+
+/* Sink binary operation to branches, but only if we can fold it.  */
+(for op (tcc_comparison plus minus mult bit_and bit_ior bit_xor
+	 rdiv trunc_div ceil_div floor_div round_div
+	 trunc_mod ceil_mod floor_mod round_mod min max)
+/* (c ? a : b) op (c ? d : e)  -->  c ? (a op d) : (b op e) */
+ (simplify
+  (op (vec_cond:s @0 @1 @2) (vec_cond:s @0 @3 @4))
+  (vec_cond @0 (op! @1 @3) (op! @2 @4)))
+
+/* (c ? a : b) op d  -->  c ? (a op d) : (b op d) */
+ (simplify
+  (op (vec_cond:s @0 @1 @2) @3)
+  (vec_cond @0 (op! @1 @3) (op! @2 @3)))
+ (simplify
+  (op @3 (vec_cond:s @0 @1 @2))
+  (vec_cond @0 (op! @3 @1) (op! @3 @2))))
+#endif
+
+/* (v ? w : 0) ? a : b is just (v & w) ? a : b  */
+(simplify
+ (vec_cond (vec_cond:s @0 @3 integer_zerop) @1 @2)
+ (if (types_match (@0, @3))
+  (vec_cond (bit_and @0 @3) @1 @2)))
+(simplify
+ (vec_cond (vec_cond:s @0 integer_all_onesp @3) @1 @2)
+ (if (types_match (@0, @3))
+  (vec_cond (bit_ior @0 @3) @1 @2)))
+(simplify
+ (vec_cond (vec_cond:s @0 integer_zerop @3) @1 @2)
+ (if (types_match (@0, @3))
+  (vec_cond (bit_ior @0 (bit_not @3)) @2 @1)))
+(simplify
+ (vec_cond (vec_cond:s @0 @3 integer_all_onesp) @1 @2)
+ (if (types_match (@0, @3))
+  (vec_cond (bit_and @0 (bit_not @3)) @2 @1)))
+
+/* c1 ? c2 ? a : b : b  -->  (c1 & c2) ? a : b  */
+(simplify
+ (vec_cond @0 (vec_cond:s @1 @2 @3) @3)
+ (if (types_match (@0, @1))
+  (vec_cond (bit_and @0 @1) @2 @3)))
+(simplify
+ (vec_cond @0 @2 (vec_cond:s @1 @2 @3))
+ (if (types_match (@0, @1))
+  (vec_cond (bit_ior @0 @1) @2 @3)))
+(simplify
+ (vec_cond @0 (vec_cond:s @1 @2 @3) @2)
+ (if (types_match (@0, @1))
+  (vec_cond (bit_ior (bit_not @0) @1) @2 @3)))
+(simplify
+ (vec_cond @0 @3 (vec_cond:s @1 @2 @3))
+ (if (types_match (@0, @1))
+  (vec_cond (bit_and (bit_not @0) @1) @2 @3)))
 
 /* Simplification moved from fold_cond_expr_with_comparison.  It may also
    be extended.  */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/andnot-2.c b/gcc/testsuite/gcc.dg/tree-ssa/andnot-2.c
new file mode 100644
index 00000000000..e0955ce3ffd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/andnot-2.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-forwprop3-raw -w -Wno-psabi" } */
+
+typedef long vec __attribute__((vector_size(16)));
+vec f(vec x){
+  vec y = x < 10;
+  return y & (y == 0);
+}
+
+/* { dg-final { scan-tree-dump-not "_expr" "forwprop3" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr95906.c b/gcc/testsuite/gcc.dg/tree-ssa/pr95906.c
new file mode 100644
index 00000000000..3d820a58e93
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr95906.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-forwprop3-raw -w -Wno-psabi" } */
+
+// FIXME: this should further optimize to a MAX_EXPR
+typedef signed char v16i8 __attribute__((vector_size(16)));
+v16i8 f(v16i8 a, v16i8 b)
+{
+    v16i8 cmp = (a > b);
+    return (cmp & a) | (~cmp & b);
+}
+
+/* { dg-final { scan-tree-dump-not "bit_(and|ior)_expr" "forwprop3" } } */
+/* { dg-final { scan-tree-dump-times "vec_cond_expr" 1 "forwprop3" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr70314.c b/gcc/testsuite/gcc.target/i386/pr70314.c
new file mode 100644
index 00000000000..aad8dd9b57e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr70314.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-march=skylake-avx512 -O2" } */
+/* { dg-final { scan-assembler-times "cmp" 2 } } */
+/* { dg-final { scan-assembler-not "and" } } */
+
+typedef long vec __attribute__((vector_size(16)));
+vec f(vec x, vec y){
+  return (x < 5) & (y < 8);
+}
+
+/* On x86_64, currently
+	vpcmpq	$2, .LC1(%rip), %xmm1, %k1
+	vpcmpq	$2, .LC0(%rip), %xmm0, %k0{%k1}
+	vpmovm2q	%k0, %xmm0
+*/
