public inbox for gcc-patches@gcc.gnu.org
* RE: [x86 PATCH] Tweak ix86_expand_int_compare to use PTEST for vector equality.
@ 2023-06-27 19:32 Roger Sayle
  2023-06-28  3:22 ` Hongtao Liu
  0 siblings, 1 reply; 6+ messages in thread
From: Roger Sayle @ 2023-06-27 19:32 UTC
  To: gcc-patches; +Cc: 'Uros Bizjak', 'Hongtao Liu'

[-- Attachment #1: Type: text/plain, Size: 1715 bytes --]


Doh! Wrong patch...
Roger
--

From: Roger Sayle <roger@nextmovesoftware.com> 
Sent: 27 June 2023 20:28
To: 'gcc-patches@gcc.gnu.org' <gcc-patches@gcc.gnu.org>
Cc: 'Uros Bizjak' <ubizjak@gmail.com>; 'Hongtao Liu' <crazylht@gmail.com>
Subject: [x86 PATCH] Tweak ix86_expand_int_compare to use PTEST for vector
equality.


Hi Uros,

Hopefully Hongtao will approve my patch to support SUBREG conversions
in STV https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622706.html
but for some of the examples described in the above post (and its test
case), I've also come up with an alternate/complementary/supplementary
fix of generating the PTEST during RTL expansion, rather than rely on
this being caught/optimized later during STV.

You may notice in this patch, the tests for TARGET_SSE4_1 and TImode
appear last.  When I was writing this, I initially also added support
for AVX VPTEST and OImode, before realizing that x86 doesn't (yet)
support 256-bit OImode (which also explains why we don't have an OImode
to V1OImode scalar-to-vector pass).  Retaining this clause ordering
should minimize the lines changed if things change in future.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?


2023-06-27  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
        * config/i386/i386-expand.cc (ix86_expand_int_compare): If
        testing a TImode SUBREG of a 128-bit vector register against
        zero, use a PTEST instruction instead of first moving it to
        scalar registers.


Please let me know what you think.
Roger
--


[-- Attachment #2: patchic.txt --]
[-- Type: text/plain, Size: 1290 bytes --]

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index 9a8d244..814d63b 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -2958,9 +2958,26 @@ ix86_expand_int_compare (enum rtx_code code, rtx op0, rtx op1)
   cmpmode = SELECT_CC_MODE (code, op0, op1);
   flags = gen_rtx_REG (cmpmode, FLAGS_REG);
 
+  /* Attempt to use PTEST, if available, when testing vector modes for
+     equality/inequality against zero.  */
+  if (op1 == const0_rtx
+      && SUBREG_P (op0)
+      && cmpmode == CCZmode
+      && SUBREG_BYTE (op0) == 0
+      && REG_P (SUBREG_REG (op0))
+      && VECTOR_MODE_P (GET_MODE (SUBREG_REG (op0)))
+      && TARGET_SSE4_1
+      && GET_MODE (op0) == TImode
+      && GET_MODE_SIZE (GET_MODE (SUBREG_REG (op0))) == 16)
+    {
+      tmp = SUBREG_REG (op0);
+      tmp = gen_rtx_UNSPEC (CCZmode, gen_rtvec (2, tmp, tmp), UNSPEC_PTEST);
+    }
+  else
+    tmp = gen_rtx_COMPARE (cmpmode, op0, op1);
+
   /* This is very simple, but making the interface the same as in the
      FP case makes the rest of the code easier.  */
-  tmp = gen_rtx_COMPARE (cmpmode, op0, op1);
   emit_insn (gen_rtx_SET (flags, tmp));
 
   /* Return the test that should be put into the flags user, i.e.
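
For readers who want a concrete picture of the case this hunk targets, here is a minimal sketch; it is not taken from the patch or its testsuite. It assumes a 64-bit target and a compiler that supports the GNU C vector extension and `__int128`. The names `v2di`, `vec_is_zero`, `u128_model` and `ptest_zf` are hypothetical. The first function is the kind of source that may reach ix86_expand_int_compare as a TImode SUBREG of a vector register compared against zero. The second is a portable model of the zero flag that PTEST computes (ZF is set when the bitwise AND of its two operands is all zeros), which is why passing the same register as both operands tests that register for zero.

```c
#include <stdint.h>

/* Hypothetical 128-bit vector type using the GNU C vector extension.  */
typedef long long v2di __attribute__ ((vector_size (16)));

/* Reinterpret the vector as a 128-bit scalar and compare it with zero.
   GNU C allows casting a vector to a scalar type of the same size.  */
int
vec_is_zero (v2di x)
{
  return (unsigned __int128) x == 0;
}

/* Portable model of a 128-bit value as two 64-bit halves.  */
struct u128_model { uint64_t lo, hi; };

/* Model of the zero flag set by PTEST: ZF = ((a & b) == 0).  */
int
ptest_zf (struct u128_model a, struct u128_model b)
{
  return ((a.lo & b.lo) | (a.hi & b.hi)) == 0;
}
```

Whether a given compiler build actually expands `vec_is_zero` to a PTEST depends on the target flags and on the conditions in the hunk above; the sketch only illustrates the equivalence the patch relies on.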

* [x86 PATCH] Tweak ix86_expand_int_compare to use PTEST for vector equality.
@ 2023-06-27 19:27 Roger Sayle
  0 siblings, 0 replies; 6+ messages in thread
From: Roger Sayle @ 2023-06-27 19:27 UTC
  To: gcc-patches; +Cc: 'Uros Bizjak', 'Hongtao Liu'


[-- Attachment #1.1: Type: text/plain, Size: 1393 bytes --]

 

Hi Uros,

Hopefully Hongtao will approve my patch to support SUBREG conversions
in STV https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622706.html
but for some of the examples described in the above post (and its test
case), I've also come up with an alternate/complementary/supplementary
fix of generating the PTEST during RTL expansion, rather than rely on
this being caught/optimized later during STV.

You may notice in this patch, the tests for TARGET_SSE4_1 and TImode
appear last.  When I was writing this, I initially also added support
for AVX VPTEST and OImode, before realizing that x86 doesn't (yet)
support 256-bit OImode (which also explains why we don't have an OImode
to V1OImode scalar-to-vector pass).  Retaining this clause ordering
should minimize the lines changed if things change in future.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?


2023-06-27  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
        * config/i386/i386-expand.cc (ix86_expand_int_compare): If
        testing a TImode SUBREG of a 128-bit vector register against
        zero, use a PTEST instruction instead of first moving it to
        scalar registers.


Please let me know what you think.
Roger
--

 


[-- Attachment #2: patchvc.txt --]
[-- Type: text/plain, Size: 1078 bytes --]

diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc
index 4a3b07a..53bec08 100644
--- a/gcc/config/i386/i386-features.cc
+++ b/gcc/config/i386/i386-features.cc
@@ -631,7 +631,31 @@ general_scalar_chain::compute_convert_gain ()
 	    break;
 
 	  case COMPARE:
-	    /* Assume comparison cost is the same.  */
+	    if (XEXP (src, 1) != const0_rtx)
+	      {
+		/* cmp vs. pxor;pshufd;ptest.  */
+		igain += COSTS_N_INSNS (m - 3);
+	      }
+	    else if (GET_CODE (XEXP (src, 0)) != AND)
+	      {
+		/* test vs. pshufd;ptest.  */
+		igain += COSTS_N_INSNS (m - 2);
+	      }
+	    else if (GET_CODE (XEXP (XEXP (src, 0), 0)) != NOT)
+	      {
+		/* and;test vs. pshufd;ptest.  */
+		igain += COSTS_N_INSNS (2 * m - 2);
+	      }
+	    else if (TARGET_BMI)
+	      {
+		/* andn;test vs. pandn;pshufd;ptest.  */
+		igain += COSTS_N_INSNS (2 * m - 3);
+	      }
+	    else
+	      {
+		/* not;and;test vs. pandn;pshufd;ptest.  */
+		igain += COSTS_N_INSNS (3 * m - 3);
+	      }
 	    break;
 
 	  case CONST_INT:
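
The attachment above is the one the follow-up message calls the wrong patch, but its gain arithmetic can still be worked through. The following is a minimal sketch under stated assumptions, not GCC code. `compare_shape` and `compare_gain` are hypothetical names. `COSTS_N_INSNS` is assumed to scale an instruction count by 4, as in GCC's rtl.h. `m` is taken as a parameter because the hunk does not show its definition; it presumably counts the scalar instructions needed per operation.

```c
#include <stdbool.h>

/* Assumption: same scaling as GCC's COSTS_N_INSNS macro.  */
#define COSTS_N_INSNS(n) ((n) * 4)

/* Hypothetical classification of the COMPARE source operand.  */
enum compare_shape
{
  CMP_NONZERO,  /* second operand is not zero: cmp vs. pxor;pshufd;ptest */
  TEST_PLAIN,   /* first operand is not an AND: test vs. pshufd;ptest */
  TEST_AND,     /* AND without NOT: and;test vs. pshufd;ptest */
  TEST_ANDN     /* AND of a NOT: andn or not;and vs. pandn;pshufd;ptest */
};

/* Gain of converting the compare to vector form; m is the assumed
   number of scalar instructions per operation.  Positive values
   favour the vector sequence.  */
int
compare_gain (enum compare_shape shape, int m, bool have_bmi)
{
  switch (shape)
    {
    case CMP_NONZERO:
      return COSTS_N_INSNS (m - 3);
    case TEST_PLAIN:
      return COSTS_N_INSNS (m - 2);
    case TEST_AND:
      return COSTS_N_INSNS (2 * m - 2);
    case TEST_ANDN:
      return have_bmi ? COSTS_N_INSNS (2 * m - 3)
                      : COSTS_N_INSNS (3 * m - 3);
    }
  return 0;
}
```

Under these assumptions, with m = 1 a compare against a non-zero value has a gain of -8 (the vector sequence is two instructions longer), while with m = 2 an and;test pair has a gain of 8.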


end of thread, other threads:[~2023-07-12  7:29 UTC | newest]

Thread overview: 6+ messages
2023-06-27 19:32 [x86 PATCH] Tweak ix86_expand_int_compare to use PTEST for vector equality Roger Sayle
2023-06-28  3:22 ` Hongtao Liu
2023-07-11 20:57   ` Roger Sayle
2023-07-12  0:44     ` Hongtao Liu
2023-07-12  7:29       ` Roger Sayle
  -- strict thread matches above, loose matches on Subject: below --
2023-06-27 19:27 Roger Sayle
