[gcc(refs/users/meissner/heads/work146-vsize)] Add vector

public inbox for gcc-cvs@sourceware.org
help / color / mirror / Atom feed

* [gcc(refs/users/meissner/heads/work146-vsize)] Add vector_size(32) integer support.
@ 2023-11-20  0:59 Michael Meissner
  0 siblings, 0 replies; only message in thread
From: Michael Meissner @ 2023-11-20  0:59 UTC (permalink / raw)
  To: gcc-cvs

https://gcc.gnu.org/g:cdf9ac5acf2b85784326ec41c29d9a8d68b18d2f

commit cdf9ac5acf2b85784326ec41c29d9a8d68b18d2f
Author: Michael Meissner <meissner@linux.ibm.com>
Date:   Sun Nov 19 19:55:05 2023 -0500

    Add vector_size(32) integer support.
    
    The first patch in the vector pair series was previous posted.  This patch
    needs that first patch.  The first patch implemented the basic modes, and it
    allows for initialization of the modes.  In addition, I added some
    optimizations for extracting and setting fields within the vector pair.
    
    The second patch in the vector pair series implemented floating point support.
    
    The third patch implements the integer vector pair support.  This adds the basic
    support for doing integer operations on vector pairs.  I have implemented most
    of the arithmetic and logical that will be needed in the future when byte
    shuffling will be added.  I did add various combiner insns to fold the logical
    instructions (i.e. ior of not becomes orc).  Since the PowerPC architecture does
    not have negative for vectors of 8/16-bit elements, I have added alternate code
    that creates a 0 and then does a subtract.
    
    The main instructions that are not supported are shift and rotate instructions.
    In addition, if people want to use vector pair support on integer types, it
    might make sense to add support for saturating adds and subtracts, along the
    various specialized instructions (bpermd, etc.).
    
    The fourth patch will provide new tests to the test suite.
    
    When I test a saxpy type loop (a[i] += (b[i] * c[i])), I generally see a 10%
    improvement over either auto-factorization, or just using the vector types.
    
    I have tested these patches on a little endian power10 system.  With
    -vector-size-32 disabled by default, there are no regressions in the
    test suite.
    
    I have also built and run the tests on both little endian power 9 and big
    endian 9 power systems, and there are no regressions.  Can I check these
    patches into the master branch?
    
    2023-11-19  Michael Meisner  <meissner@linux.ibm.com>
    
    gcc/
    
            * config/rs6000/vector-pair.md (VPAIR_INT): New mode iterator.
            (VPAIR_NEG_VNEG): Likewise.
            (VPAIR_NEG_SUB): Likewise.
            (VPAIR_INT_BINARY): New code iterator.
            (neg<mode>2, VPAIR_NEG_VNEG iterator): New insn.
            (neg<mode>2, VPAIR_NEG_SUB iterator); Likewise.
            (<vpair_op><mode>2, VPAIR_LOGICAL_UNARY and VPAIR_INT iterators):
            Likewise.
            (<vpair_op><mode>3, VPAIR_LOGICAL_BINARY and VPAIR INT iterator):
            Likewise.
            (nor<mode>3_1): Likewise.
            (nor<mode>3_2): Likewise.
            (andc<mode>3): Likewise.
            (eqv<mode>3): Likewise.
            (nand<mode>3_1): Likewise.
            (nand<mode>3_2): Likewise.
            (orc<mode>): Likewise.

Diff:
---
 gcc/config/rs6000/vector-pair.md | 252 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 252 insertions(+)

diff --git a/gcc/config/rs6000/vector-pair.md b/gcc/config/rs6000/vector-pair.md
index 8e2d7e5cc5b..dc71ea28293 100644
--- a/gcc/config/rs6000/vector-pair.md
+++ b/gcc/config/rs6000/vector-pair.md
@@ -38,6 +38,22 @@
 (define_code_iterator VPAIR_FP_UNARY  [abs neg])
 (define_code_iterator VPAIR_FP_BINARY [plus minus mult smin smax])
 
+;; Integer vector pair ops.  We need the basic logical opts to support
+;; permution on little endian systems.
+(define_mode_iterator VPAIR_INT [V32QI V16HI V8SI V4DI])
+
+;; Special iterators for NEG (V4SI and V2DI have vneg{w,d}), while V16QI and
+;; V8HI have to use a subtract from 0.
+(define_mode_iterator VPAIR_NEG_VNEG [V4DI V8SI])
+(define_mode_iterator VPAIR_NEG_SUB [V32QI V16HI])
+
+;; Iterator integer unary/binary operations.  Logical operations can be done on
+;; all VSX registers, while the binary int operators need Altivec registers.
+(define_code_iterator VPAIR_LOGICAL_UNARY  [not])
+(define_code_iterator VPAIR_LOGICAL_BINARY [and ior xor])
+
+(define_code_iterator VPAIR_INT_BINARY     [plus minus smin smax umin umax])
+
 ;; Iterator for vector pairs with double word elements
 (define_mode_iterator VPAIR_DWORD [V4DI V4DF])
 
@@ -626,4 +642,240 @@
 }
   [(set_attr "length" "8")
    (set_attr "type" "vecfloat")])
+\f
+;; Vector pair negate if we have the VNEGx instruction.
+(define_insn_and_split "neg<mode>2"
+  [(set (match_operand:VPAIR_NEG_VNEG 0 "vsx_register_operand" "=v")
+	(neg:VPAIR_NEG_VNEG
+	 (match_operand:VPAIR_NEG_VNEG 1 "vsx_register_operand" "v")))]
+  "TARGET_MMA && TARGET_VECTOR_SIZE_32"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  split_unary_vector_pair (<VPAIR_VECTOR>mode, operands,
+			   gen_neg<vpair_vector_l>2);
+  DONE;
+}
+  [(set_attr "length" "8")
+   (set_attr "type" "vecfloat")])
+
+;; Vector pair negate if we have to do a subtract from 0
+(define_insn_and_split "neg<mode>2"
+  [(set (match_operand:VPAIR_NEG_SUB 0 "vsx_register_operand" "=v")
+	(neg:VPAIR_NEG_SUB
+	 (match_operand:VPAIR_NEG_SUB 1 "vsx_register_operand" "v")))
+   (clobber (match_scratch:<VPAIR_VECTOR> 2 "=&v"))]
+  "TARGET_MMA && TARGET_VECTOR_SIZE_32"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  enum machine_mode mode = <VPAIR_VECTOR>mode;
+  rtx tmp = operands[2];
+  unsigned reg0 = reg_or_subregno (operands[0]);
+  unsigned reg1 = reg_or_subregno (operands[1]);
+
+  emit_move_insn (tmp, CONST0_RTX (mode));
+  emit_insn (gen_sub<vpair_vector_l>3 (gen_rtx_REG (mode, reg0),
+				       tmp,
+				       gen_rtx_REG (mode, reg1)));
+
+  emit_insn (gen_sub<vpair_vector_l>3 (gen_rtx_REG (mode, reg0 + 1),
+				       tmp,
+				       gen_rtx_REG (mode, reg1 + 1)));
+
+  DONE;
+}
+  [(set_attr "length" "8")
+   (set_attr "type" "vecfloat")])
+\f
+;; Vector pair logical unary operations.  These operations can use all VSX
+;; registers.
+(define_insn_and_split "<vpair_op><mode>2"
+  [(set (match_operand:VPAIR_INT 0 "vsx_register_operand" "=wa")
+	(VPAIR_LOGICAL_UNARY:VPAIR_INT
+	 (match_operand:VPAIR_INT 1 "vsx_register_operand" "wa")))]
+  "TARGET_MMA && TARGET_VECTOR_SIZE_32"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  split_unary_vector_pair (<VPAIR_VECTOR>mode, operands,
+			   gen_<vpair_op><vpair_vector_l>2);
+  DONE;
+}
+  [(set_attr "length" "8")
+   (set_attr "type" "veclogical")])
+
+;; Vector pair logical binary operations.  These operations can use all VSX
+;; registers.
+(define_insn_and_split "<vpair_op><mode>3"
+  [(set (match_operand:VPAIR_INT 0 "vsx_register_operand" "=wa")
+	(VPAIR_LOGICAL_BINARY:VPAIR_INT
+	 (match_operand:VPAIR_INT 1 "vsx_register_operand" "wa")
+	 (match_operand:VPAIR_INT 2 "vsx_register_operand" "wa")))]
+  "TARGET_MMA && TARGET_VECTOR_SIZE_32"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  split_binary_vector_pair (<VPAIR_VECTOR>mode, operands,
+			    gen_<vpair_op><vpair_vector_l>3);
+  DONE;
+}
+  [(set_attr "length" "8")
+   (set_attr "type" "veclogical")])
+
+;; Vector pair logical binary operations.  These operations require Altivec
+;; registers.
+(define_insn_and_split "<vpair_op><mode>3"
+  [(set (match_operand:VPAIR_INT 0 "vsx_register_operand" "=v")
+	(VPAIR_INT_BINARY:VPAIR_INT
+	 (match_operand:VPAIR_INT 1 "vsx_register_operand" "v")
+	 (match_operand:VPAIR_INT 2 "vsx_register_operand" "v")))]
+  "TARGET_MMA && TARGET_VECTOR_SIZE_32"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  split_binary_vector_pair (<VPAIR_VECTOR>mode, operands,
+			    gen_<vpair_op><vpair_vector_l>3);
+  DONE;
+}
+  [(set_attr "length" "8")
+   (set_attr "type" "vecsimple")])
+
+;; Optiomize vector pair ~(a | b)  or ((~a) & (~b)) to produce xxlnor
+(define_insn_and_split "*nor<mode>3_1"
+  [(set (match_operand:VPAIR_INT 0 "vsx_register_operand" "=wa")
+	(not:VPAIR_INT
+	 (ior:VPAIR_INT
+	  (match_operand:VPAIR_INT 1 "vsx_register_operand" "wa")
+	  (match_operand:VPAIR_INT 2 "vsx_register_operand" "wa"))))]
+  "TARGET_MMA && TARGET_VECTOR_SIZE_32"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  split_binary_vector_pair (<VPAIR_VECTOR>mode, operands,
+			    gen_nor<vpair_vector_l>3);
+  DONE;
+}
+  [(set_attr "length" "8")
+   (set_attr "type" "veclogical")])
+
+(define_insn_and_split "*nor<mode>3_2"
+  [(set (match_operand:VPAIR_INT 0 "vsx_register_operand" "=wa")
+	(and:VPAIR_INT
+	 (not:VPAIR_INT
+	  (match_operand:VPAIR_INT 1 "vsx_register_operand" "wa"))
+	 (not:VPAIR_INT
+	  (match_operand:VPAIR_INT 2 "vsx_register_operand" "wa"))))]
+  "TARGET_MMA && TARGET_VECTOR_SIZE_32"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  split_binary_vector_pair (<VPAIR_VECTOR>mode, operands,
+			    gen_nor<vpair_vector_l>3);
+  DONE;
+}
+  [(set_attr "length" "8")
+   (set_attr "type" "veclogical")])
+
+;; Optimize vector pair (~a) & b to use xxlandc
+(define_insn_and_split "*andc<mode>3"
+  [(set (match_operand:VPAIR_INT 0 "vsx_register_operand" "=wa")
+	(and:VPAIR_INT
+	 (not:VPAIR_INT
+	  (match_operand:VPAIR_INT 1 "vsx_register_operand" "wa"))
+	 (match_operand:VPAIR_INT 2 "vsx_register_operand" "wa")))]
+  "TARGET_MMA && TARGET_VECTOR_SIZE_32"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  split_binary_vector_pair (<VPAIR_VECTOR>mode, operands,
+			    gen_andc<vpair_vector_l>3);
+  DONE;
+}
+  [(set_attr "length" "8")
+   (set_attr "type" "veclogical")])
+
+;; Optimize vector pair ~(a ^ b) to produce xxleqv
+(define_insn_and_split "*eqv<mode>3"
+  [(set (match_operand:VPAIR_INT 0 "vsx_register_operand" "=wa")
+	(not:VPAIR_INT
+	 (xor:VPAIR_INT
+	  (match_operand:VPAIR_INT 1 "vsx_register_operand" "wa")
+	  (match_operand:VPAIR_INT 2 "vsx_register_operand" "wa"))))]
+  "TARGET_MMA && TARGET_VECTOR_SIZE_32"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  split_binary_vector_pair (<VPAIR_VECTOR>mode, operands,
+			    gen_nor<vpair_vector_l>3);
+  DONE;
+}
+  [(set_attr "length" "8")
+   (set_attr "type" "veclogical")])
 
+
+;; Optiomize vector pair ~(a & b) or ((~a) | (~b)) to produce xxlnand
+(define_insn_and_split "*nand<mode>3_1"
+  [(set (match_operand:VPAIR_INT 0 "vsx_register_operand" "=wa")
+	(not:VPAIR_INT
+	 (and:VPAIR_INT
+	  (match_operand:VPAIR_INT 1 "vsx_register_operand" "wa")
+	  (match_operand:VPAIR_INT 2 "vsx_register_operand" "wa"))))]
+  "TARGET_MMA && TARGET_VECTOR_SIZE_32"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  split_binary_vector_pair (<VPAIR_VECTOR>mode, operands,
+			    gen_nand<vpair_vector_l>3);
+  DONE;
+}
+  [(set_attr "length" "8")
+   (set_attr "type" "veclogical")])
+
+(define_insn_and_split "*nand<mode>3_2"
+  [(set (match_operand:VPAIR_INT 0 "vsx_register_operand" "=wa")
+	(ior:VPAIR_INT
+	 (not:VPAIR_INT
+	  (match_operand:VPAIR_INT 1 "vsx_register_operand" "wa"))
+	 (not:VPAIR_INT
+	  (match_operand:VPAIR_INT 2 "vsx_register_operand" "wa"))))]
+  "TARGET_MMA && TARGET_VECTOR_SIZE_32"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  split_binary_vector_pair (<VPAIR_VECTOR>mode, operands,
+			    gen_nand<vpair_vector_l>3);
+  DONE;
+}
+  [(set_attr "length" "8")
+   (set_attr "type" "veclogical")])
+
+;; Optimize vector pair (~a) | b to produce xxlorc
+(define_insn_and_split "*orc<mode>3"
+  [(set (match_operand:VPAIR_INT 0 "vsx_register_operand" "=wa")
+	(ior:VPAIR_INT
+	 (not:VPAIR_INT
+	  (match_operand:VPAIR_INT 1 "vsx_register_operand" "wa"))
+	 (match_operand:VPAIR_INT 2 "vsx_register_operand" "wa")))]
+  "TARGET_MMA && TARGET_VECTOR_SIZE_32"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  split_binary_vector_pair (<VPAIR_VECTOR>mode, operands,
+			    gen_orc<vpair_vector_l>3);
+  DONE;
+}
+  [(set_attr "length" "8")
+   (set_attr "type" "veclogical")])

^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2023-11-20  0:59 UTC | newest]

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-20  0:59 [gcc(refs/users/meissner/heads/work146-vsize)] Add vector_size(32) integer support Michael Meissner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).