[gcc r12-8080] middle-end: Prevent the use of the cond inversion detection code when both conditions are external.

public inbox for gcc-cvs@sourceware.org

From: Tamar Christina @ 2022-04-11 14:09 UTC
  To: gcc-cvs

https://gcc.gnu.org/g:78c718490bc2843d4dadcef8a0ae14aed1d15a32

commit r12-8080-g78c718490bc2843d4dadcef8a0ae14aed1d15a32
Author: Tamar Christina <tamar.christina@arm.com>
Date:   Mon Apr 11 15:09:05 2022 +0100

    middle-end: Prevent the use of the cond inversion detection code when both conditions are external. [PR105197]
    
    Previously ifcvt enforced that a mask A and its inverse be represented as A
    and ~A.  So for the masks
    
      _25 = _6 != 0;
      _44 = _4 != 0;
    
    ifcvt would produce, for an operation requiring the inverse of each mask,
    
      _26 = ~_25;
      _43 = ~_44;
    
    but now that value numbering (VN) is applied to the entire function body, the
    masks are simplified and we instead get:
    
      _26 = _6 == 0;
      _43 = _4 == 0;
    
    This in itself is not a problem semantically speaking (though it does create
    more masks that need to be tracked) but when vectorizing the masked conditional
    we would still detect _26 and _43 to be inverses of _25 and _44 and mark them
    as requiring their operands be swapped.
    
    When vectorizing, we swap the operands but don't find the BIT_NOT_EXPR to
    remove, so we leave the condition as is, which produces invalid code:
    
    ------>vectorizing statement: _ifc__41 = _43 ? 0 : _ifc__40;
    created new init_stmt: vect_cst__136 = { 0, ... }
    add new stmt: _137 = mask__43.26_135 & loop_mask_111
    note:  add new stmt: vect__ifc__41.27_138 = VEC_COND_EXPR <_137, vect__ifc__40.25_133, vect_cst__136>;
    
    This fixes the problem by disabling the inversion detection code when the loop
    isn't masked, since both conditions would be external.  In that case we would
    not use the new cond_code, yet would still swap the operands, which is incorrect.
    
    The resulting code is also better than GCC-11 with most operations now
    predicated on the loop mask rather than a ptrue.
    
    gcc/ChangeLog:
    
            PR target/105197
            * tree-vect-stmts.cc (vectorizable_condition): Prevent cond swap when
            not masked.
    
    gcc/testsuite/ChangeLog:
    
            PR target/105197
            * gcc.target/aarch64/sve/pr105197-1.c: New test.
            * gcc.target/aarch64/sve/pr105197-2.c: New test.

Diff:
---
 gcc/testsuite/gcc.target/aarch64/sve/pr105197-1.c | 20 ++++++++++++++++++++
 gcc/testsuite/gcc.target/aarch64/sve/pr105197-2.c |  9 +++++++++
 gcc/tree-vect-stmts.cc                            |  2 +-
 3 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr105197-1.c b/gcc/testsuite/gcc.target/aarch64/sve/pr105197-1.c
new file mode 100644
index 00000000000..e33532d8bed
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pr105197-1.c
@@ -0,0 +1,20 @@
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-additional-options "-O -ftree-vectorize" } */
+
+unsigned char arr_7[9][3];
+unsigned char (*main_arr_7)[3] = arr_7;
+int main() {
+  char arr_2[9];
+  int arr_6[9];
+  int x;
+  unsigned i;
+  for (i = 0; i < 9; ++i) {
+    arr_2[i] = 21;
+    arr_6[i] = 6;
+  }
+  for (i = arr_2[8] - 21; i < 2; i++)
+    x = arr_6[i] ? (main_arr_7[8][i] ? main_arr_7[8][i] : 8) : (char)arr_6[i];
+  if (x != 8)
+    __builtin_abort ();
+}
+
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr105197-2.c b/gcc/testsuite/gcc.target/aarch64/sve/pr105197-2.c
new file mode 100644
index 00000000000..5eec5cd837d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pr105197-2.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O -ftree-vectorize" } */
+
+void f(int n, int y, char *arr_2, char *arr_6) {
+  for (int i = y; i < n; i++)
+    arr_6[i] = arr_6[i] ? (arr_2[i] ? 3 : 8) : 1;
+}
+
+/* { dg-final { scan-assembler-not {\tand\tp[0-9]+.b} } } */
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 87368e3787b..c9534ef9b1e 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -10512,7 +10512,7 @@ vectorizable_condition (vec_info *vinfo,
 	      bool honor_nans = HONOR_NANS (TREE_TYPE (cond.op0));
 	      tree_code orig_code = cond.code;
 	      cond.code = invert_tree_comparison (cond.code, honor_nans);
-	      if (loop_vinfo->scalar_cond_masked_set.contains (cond))
+	      if (!masked && loop_vinfo->scalar_cond_masked_set.contains (cond))
 		{
 		  masks = &LOOP_VINFO_MASKS (loop_vinfo);
 		  cond_code = cond.code;

