From: "Roger Sayle" <roger@nextmovesoftware.com>
To: <gcc-patches@gcc.gnu.org>
Subject: [x86 PATCH] Move V1TI shift/rotate lowering from expand to pre-reload split.
Date: Fri, 5 Aug 2022 19:36:42 +0100	[thread overview]
Message-ID: <037701d8a8fa$4f65ed80$ee31c880$@nextmovesoftware.com> (raw)

[-- Attachment #1: Type: text/plain, Size: 2337 bytes --]


This patch moves the lowering of 128-bit V1TImode shifts and rotations by
constant bit counts into sequences of SSE operations from the RTL expansion
pass to the pre-reload split pass.  Postponing this splitting of shifts and
rotates will enable the TImode equivalents of these instructions to be
considered as candidates by the (TImode) STV pass.  Technically, this patch
changes the existing expanders so that they continue to lower shifts by
variable amounts, but shifts and rotates by constant amounts now become RTL
instructions, defined by define_insn_and_split patterns that are split at
ix86_pre_reload_split.  The one minor complication is that logical shifts
by multiples of eight don't get split, but are handled by the existing insn
patterns sse2_ashlv1ti3 and sse2_lshrv1ti3.  There should be no changes in
generated code with this patch, which simply adjusts the pass in which
these transformations are applied.
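
To illustrate (this example is not part of the patch, and assumes the
TImode STV pass selects V1TImode for the shift), the kind of source that
would reach these patterns is a 128-bit shift by a constant count:

  /* Hypothetical example, not taken from the patch or the testsuite.  */
  unsigned __int128
  shl17 (unsigned __int128 x)
  {
    return x << 17;   /* count not a multiple of 8: split before reload */
  }

  unsigned __int128
  shl64 (unsigned __int128 x)
  {
    return x << 64;   /* multiple of 8: matches sse2_ashlv1ti3 (pslldq) */
  }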

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}, with
no new failures.  Ok for mainline?



2022-08-05  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
        * config/i386/sse.md (ashlv1ti3): Delay lowering of logical left
        shifts by constant bit counts.
        (*ashlv1ti3_internal): New define_insn_and_split that lowers
        logical left shifts by constant bit counts that aren't multiples
        of 8, before reload.
        (lshrv1ti3): Delay lowering of logical right shifts by constant.
        (*lshrv1ti3_internal): New define_insn_and_split that lowers
        logical right shifts by constant bit counts that aren't multiples
        of 8, before reload.
        (ashrv1ti3): Delay lowering of arithmetic right shifts by
        constant bit counts.
        (*ashrv1ti3_internal): New define_insn_and_split that lowers
        arithmetic right shifts by constant bit counts before reload.
        (rotlv1ti3): Delay lowering of rotate left by constant.
        (*rotlv1ti3_internal): New define_insn_and_split that lowers
        rotate left by constant bit counts before reload.
        (rotrv1ti3): Delay lowering of rotate right by constant.
        (*rotrv1ti3_internal): New define_insn_and_split that lowers
        rotate right by constant bit counts before reload.


Thanks in advance,
Roger
--


[-- Attachment #2: patchrt.txt --]
[-- Type: text/plain, Size: 3617 bytes --]

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 14d12d1..d3ea52f 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -15995,10 +15995,30 @@
 
 (define_expand "ashlv1ti3"
   [(set (match_operand:V1TI 0 "register_operand")
+        (ashift:V1TI
+         (match_operand:V1TI 1 "register_operand")
+         (match_operand:QI 2 "general_operand")))]
+  "TARGET_SSE2 && TARGET_64BIT"
+{
+  if (!CONST_INT_P (operands[2]))
+    {
+      ix86_expand_v1ti_shift (ASHIFT, operands);
+      DONE;
+    }
+})
+
+(define_insn_and_split "*ashlv1ti3_internal"
+  [(set (match_operand:V1TI 0 "register_operand")
 	(ashift:V1TI
 	 (match_operand:V1TI 1 "register_operand")
-	 (match_operand:QI 2 "general_operand")))]
-  "TARGET_SSE2 && TARGET_64BIT"
+	 (match_operand:SI 2 "const_0_to_255_operand")))]
+  "TARGET_SSE2
+   && TARGET_64BIT
+   && (INTVAL (operands[2]) & 7) != 0
+   && ix86_pre_reload_split ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
 {
   ix86_expand_v1ti_shift (ASHIFT, operands);
   DONE;
@@ -16011,6 +16031,26 @@
 	 (match_operand:QI 2 "general_operand")))]
   "TARGET_SSE2 && TARGET_64BIT"
 {
+  if (!CONST_INT_P (operands[2]))
+    {
+      ix86_expand_v1ti_shift (LSHIFTRT, operands);
+      DONE;
+    }
+})
+
+(define_insn_and_split "*lshrv1ti3_internal"
+  [(set (match_operand:V1TI 0 "register_operand")
+	(lshiftrt:V1TI
+	 (match_operand:V1TI 1 "register_operand")
+	 (match_operand:SI 2 "const_0_to_255_operand")))]
+  "TARGET_SSE2
+   && TARGET_64BIT
+   && (INTVAL (operands[2]) & 7) != 0
+   && ix86_pre_reload_split ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
   ix86_expand_v1ti_shift (LSHIFTRT, operands);
   DONE;
 })
@@ -16022,6 +16062,26 @@
 	 (match_operand:QI 2 "general_operand")))]
   "TARGET_SSE2 && TARGET_64BIT"
 {
+  if (!CONST_INT_P (operands[2]))
+    {
+      ix86_expand_v1ti_ashiftrt (operands);
+      DONE;
+    }
+})
+
+
+(define_insn_and_split "*ashrv1ti3_internal"
+  [(set (match_operand:V1TI 0 "register_operand")
+	(ashiftrt:V1TI
+	 (match_operand:V1TI 1 "register_operand")
+	 (match_operand:SI 2 "const_0_to_255_operand")))]
+  "TARGET_SSE2
+   && TARGET_64BIT
+   && ix86_pre_reload_split ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
   ix86_expand_v1ti_ashiftrt (operands);
   DONE;
 })
@@ -16033,6 +16093,25 @@
 	 (match_operand:QI 2 "general_operand")))]
   "TARGET_SSE2 && TARGET_64BIT"
 {
+  if (!CONST_INT_P (operands[2]))
+    {
+      ix86_expand_v1ti_rotate (ROTATE, operands);
+      DONE;
+    }
+})
+
+(define_insn_and_split "*rotlv1ti3_internal"
+  [(set (match_operand:V1TI 0 "register_operand")
+	(rotate:V1TI
+	 (match_operand:V1TI 1 "register_operand")
+	 (match_operand:SI 2 "const_0_to_255_operand")))]
+  "TARGET_SSE2
+   && TARGET_64BIT
+   && ix86_pre_reload_split ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
   ix86_expand_v1ti_rotate (ROTATE, operands);
   DONE;
 })
@@ -16044,6 +16123,25 @@
 	 (match_operand:QI 2 "general_operand")))]
   "TARGET_SSE2 && TARGET_64BIT"
 {
+  if (!CONST_INT_P (operands[2]))
+    {
+      ix86_expand_v1ti_rotate (ROTATERT, operands);
+      DONE;
+    }
+})
+
+(define_insn_and_split "*rotrv1ti3_internal"
+  [(set (match_operand:V1TI 0 "register_operand")
+	(rotatert:V1TI
+	 (match_operand:V1TI 1 "register_operand")
+	 (match_operand:SI 2 "const_0_to_255_operand")))]
+  "TARGET_SSE2
+   && TARGET_64BIT
+   && ix86_pre_reload_split ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
   ix86_expand_v1ti_rotate (ROTATERT, operands);
   DONE;
 })
