[PATCH] aarch64: Add support for +cssc

public inbox for gcc-patches@gcc.gnu.org

* [PATCH] aarch64: Add support for +cssc
@ 2022-11-11 10:25 Kyrylo Tkachov
  2022-11-11 20:10 ` Andrew Pinski
  0 siblings, 1 reply; 2+ messages in thread
From: Kyrylo Tkachov @ 2022-11-11 10:25 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1717 bytes --]

Hi all,

This patch adds codegen for FEAT_CSSC from the 2022 Architecture extensions.
It fits various existing optabs in GCC quite well.
There are instructions for scalar signed/unsigned min/max, abs, ctz, popcount.
We have expanders for these already, so they are wired up to emit single-insn
patterns for the new TARGET_CSSC.

These instructions are enabled by the +cssc command-line extension.
Bootstrapped and tested on aarch64-none-linux-gnu.
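
As a rough illustration only (this sketch is not part of the patch), the
C below shows the kind of scalar source code that the new patterns are
meant to match. The function names are made up for the example; the
builtins and the MIN/MAX macros mirror the ones used in the new
cssc_*.c tests, and the instruction named in each comment is what those
tests expect at -O1 with +cssc enabled. Without +cssc the same code
still compiles, just to longer sequences.

```c
#include <assert.h>
#include <stdint.h>

/* Same helper macros as gcc.target/aarch64/cssc_4.c.  */
#define MIN(X, Y) ((X) > (Y) ? (Y) : (X))
#define MAX(X, Y) ((X) > (Y) ? (X) : (Y))

/* Hypothetical names; each comment gives the instruction the patch's
   tests expect when the function is compiled with +cssc.  */

int32_t ref_abs32 (int32_t a)            /* abs  w0, w0 */
{
  return __builtin_abs (a);
}

int32_t ref_popcount32 (uint32_t a)      /* cnt  w0, w0 */
{
  return __builtin_popcount (a);
}

int32_t ref_ctz32 (uint32_t a)           /* ctz  w0, w0 */
{
  /* __builtin_ctz is undefined for a zero argument.  */
  assert (a != 0);
  return __builtin_ctz (a);
}

uint32_t ref_umin32 (uint32_t a, uint32_t b)   /* umin w0, w0, w1 */
{
  return MIN (a, b);
}

uint32_t ref_umax32 (uint32_t a, uint32_t b)   /* umax w0, w0, w1 */
{
  return MAX (a, b);
}

int32_t ref_smin32 (int32_t a, int32_t b)      /* smin w0, w0, w1 */
{
  return MIN (a, b);
}

int32_t ref_smax32 (int32_t a, int32_t b)      /* smax w0, w0, w1 */
{
  return MAX (a, b);
}

/* Immediate forms: the new predicates accept 0..255 for the unsigned
   operations and -128..127 for the signed ones.  */

uint32_t ref_umin32_imm255 (uint32_t a)        /* umin w0, w0, 255 */
{
  return MIN (a, 255);
}

int32_t ref_smax32_imm_m128 (int32_t a)        /* smax w0, w0, -128 */
{
  return MAX (a, -128);
}
```

For example, ref_umin32_imm255 (1000) returns 255 and
ref_smax32_imm_m128 (-1000) returns -128 on any target; only the
generated code differs when +cssc is in effect.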

I'll push it once the Binutils patch from Andre for this gets committed.

Thanks,
Kyrill

gcc/ChangeLog:

	* config/aarch64/aarch64-option-extensions.def (cssc): Define.
	* config/aarch64/aarch64.h (AARCH64_ISA_CSSC): Define.
	(TARGET_CSSC): Likewise.
	* config/aarch64/aarch64.md (aarch64_abs<mode>2_insn): New define_insn.
	(abs<mode>2): Adjust for the above.
	(aarch64_umax<mode>3_insn): New define_insn.
	(umax<mode>3): Adjust for the above.
	(aarch64_popcount<mode>2_insn): New define_insn.
	(popcount<mode>2): Adjust for the above.
	(<optab><mode>3): New define_insn.
	* config/aarch64/constraints.md (Usm): Define.
	(Uum): Likewise.
	* doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst:
	Document +cssc.
	* config/aarch64/iterators.md (MAXMIN_NOUMAX): New code iterator.
	* config/aarch64/predicates.md (aarch64_sminmax_immediate): Define.
	(aarch64_sminmax_operand): Likewise.
	(aarch64_uminmax_immediate): Likewise.
	(aarch64_uminmax_operand): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/cssc_1.c: New test.
	* gcc.target/aarch64/cssc_2.c: New test.
	* gcc.target/aarch64/cssc_3.c: New test.
	* gcc.target/aarch64/cssc_4.c: New test.
	* gcc.target/aarch64/cssc_5.c: New test.

[-- Attachment #2: cssc.patch --]
[-- Type: application/octet-stream, Size: 16881 bytes --]

diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/config/aarch64/aarch64-option-extensions.def
index bdf4baf309c02a08f74eec7b9cb77bc1f3247de3..8d1a703fb8c9379c5eca82d69b8058d436932fac 100644
--- a/gcc/config/aarch64/aarch64-option-extensions.def
+++ b/gcc/config/aarch64/aarch64-option-extensions.def
@@ -149,4 +149,6 @@ AARCH64_OPT_EXTENSION("ls64", LS64, (), (), (), "")
 
 AARCH64_OPT_EXTENSION("mops", MOPS, (), (), (), "")
 
+AARCH64_OPT_EXTENSION("cssc", CSSC, (), (), (), "")
+
 #undef AARCH64_OPT_EXTENSION
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index e60f9bce023b2cd5e7233ee9b8c61fc93c1494c2..bbbb7e4213de40a17e1e371b2fefdc2083a38f17 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -221,6 +221,7 @@ enum class aarch64_feature : unsigned char {
 #define AARCH64_ISA_V9_3A          (aarch64_isa_flags & AARCH64_FL_V9_3A)
 #define AARCH64_ISA_MOPS	   (aarch64_isa_flags & AARCH64_FL_MOPS)
 #define AARCH64_ISA_LS64	   (aarch64_isa_flags & AARCH64_FL_LS64)
+#define AARCH64_ISA_CSSC	   (aarch64_isa_flags & AARCH64_FL_CSSC)
 
 /* Crypto is an optional extension to AdvSIMD.  */
 #define TARGET_CRYPTO (AARCH64_ISA_CRYPTO)
@@ -316,6 +317,9 @@ enum class aarch64_feature : unsigned char {
 /* LS64 instructions are enabled through +ls64.  */
 #define TARGET_LS64 (AARCH64_ISA_LS64)
 
+/* CSSC instructions are enabled through +cssc.  */
+#define TARGET_CSSC (AARCH64_ISA_CSSC)
+
 /* Make sure this is always defined so we don't have to check for ifdefs
    but rather use normal ifs.  */
 #ifndef TARGET_FIX_ERR_A53_835769_DEFAULT
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 8a5843cacbdd93aeb74809c96a458a0f53496ea6..ddf23561736ea4ab378a6b2ebe7cb4049d5e21ab 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -3580,11 +3580,24 @@ (define_insn "*sub_uxtsi_shift2_uxtw"
   [(set_attr "type" "alu_ext")]
 )
 
+(define_insn "aarch64_abs<mode>2_insn"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+	(abs:GPI (match_operand:GPI 1 "register_operand" "r")))]
+  "TARGET_CSSC"
+  "abs\\t%<w>0, %<w>1"
+  [(set_attr "type" "alu_sreg")]
+)
+
 (define_expand "abs<mode>2"
   [(match_operand:GPI 0 "register_operand")
    (match_operand:GPI 1 "register_operand")]
   ""
   {
+    if (TARGET_CSSC)
+      {
+	emit_insn (gen_aarch64_abs<mode>2_insn (operands[0], operands[1]));
+	DONE;
+      }
     rtx ccreg = aarch64_gen_compare_reg (LT, operands[1], const0_rtx);
     rtx x = gen_rtx_LT (VOIDmode, ccreg, const0_rtx);
     emit_insn (gen_csneg3<mode>_insn (operands[0], x, operands[1], operands[1]));
@@ -4382,6 +4395,17 @@ (define_insn "*csinv3_uxtw_insn3"
   [(set_attr "type" "csel")]
 )
 
+(define_insn "aarch64_umax<mode>3_insn"
+  [(set (match_operand:GPI 0 "register_operand" "=r,r")
+        (umax:GPI (match_operand:GPI 1 "register_operand" "r,r")
+		(match_operand:GPI 2 "aarch64_uminmax_operand" "r,Uum")))]
+  "TARGET_CSSC"
+  "@
+   umax\\t%<w>0, %<w>1, %<w>2
+   umax\\t%<w>0, %<w>1, %2"
+  [(set_attr "type" "alu_sreg,alu_imm")]
+)
+
 ;; If X can be loaded by a single CNT[BHWD] instruction,
 ;;
 ;;    A = UMAX (B, X)
@@ -4412,11 +4436,23 @@ (define_expand "umax<mode>3"
   [(set (match_operand:GPI 0 "register_operand")
 	(umax:GPI (match_operand:GPI 1 "")
 		  (match_operand:GPI 2 "")))]
-  "TARGET_SVE"
+  "TARGET_SVE || TARGET_CSSC"
   {
     if (aarch64_sve_cnt_immediate (operands[1], <MODE>mode))
       std::swap (operands[1], operands[2]);
-    else if (!aarch64_sve_cnt_immediate (operands[2], <MODE>mode))
+    else if (!aarch64_sve_cnt_immediate (operands[2], <MODE>mode)
+	     && TARGET_CSSC)
+      {
+	if (aarch64_uminmax_immediate (operands[1], <MODE>mode))
+	  std::swap (operands[1], operands[2]);
+	operands[1] = force_reg (<MODE>mode, operands[1]);
+	if (!aarch64_uminmax_operand (operands[2], <MODE>mode))
+	  operands[2] = force_reg (<MODE>mode, operands[2]);
+	emit_insn (gen_aarch64_umax<mode>3_insn (operands[0], operands[1],
+						 operands[2]));
+	DONE;
+      }
+    else
       FAIL;
     rtx temp = gen_reg_rtx (<MODE>mode);
     operands[1] = force_reg (<MODE>mode, operands[1]);
@@ -4966,8 +5002,16 @@ (define_expand "ffs<mode>2"
   }
 )
 
-;; Pop count be done via the "CNT" instruction in AdvSIMD.
-;;
+(define_insn "aarch64_popcount<mode>2_insn"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+        (popcount:GPI (match_operand:GPI 1 "register_operand" "r")))]
+  "TARGET_CSSC"
+  "cnt\\t%<w>0, %<w>1"
+  [(set_attr "type" "clz")]
+)
+;; The CSSC instructions can do popcount in the GP registers directly through
+;; CNT.  If it is not available then we can use CNT on the Advanced SIMD side
+;; through:
 ;; MOV	v.1d, x0
 ;; CNT	v1.8b, v.8b
 ;; ADDV b2, v1.8b
@@ -4976,8 +5020,14 @@ (define_expand "ffs<mode>2"
 (define_expand "popcount<mode>2"
   [(match_operand:GPI 0 "register_operand")
    (match_operand:GPI 1 "register_operand")]
-  "TARGET_SIMD"
+  "TARGET_CSSC || TARGET_SIMD"
 {
+  if (TARGET_CSSC)
+    {
+      emit_insn (gen_aarch64_popcount<mode>2_insn (operands[0], operands[1]));
+      DONE;
+    }
+
   rtx v = gen_reg_rtx (V8QImode);
   rtx v1 = gen_reg_rtx (V8QImode);
   rtx in = operands[1];
@@ -5016,14 +5066,14 @@ (define_insn "@aarch64_rbit<mode>"
 ;; Split after reload into RBIT + CLZ.  Since RBIT is represented as an UNSPEC
 ;; it is unlikely to fold with any other operation, so keep this as a CTZ
 ;; expression and split after reload to enable scheduling them apart if
-;; needed.
+;; needed.  For TARGET_CSSC we have a single CTZ instruction that can do this.
 
 (define_insn_and_split "ctz<mode>2"
  [(set (match_operand:GPI           0 "register_operand" "=r")
        (ctz:GPI (match_operand:GPI  1 "register_operand" "r")))]
   ""
-  "#"
-  "reload_completed"
+  { return TARGET_CSSC ? "ctz\\t%<w>0, %<w>1" : "#"; }
+  "reload_completed && !TARGET_CSSC"
   [(const_int 0)]
   "
   emit_insn (gen_aarch64_rbit (<MODE>mode, operands[0], operands[1]));
@@ -6662,6 +6712,17 @@ (define_insn "abs<mode>2"
   [(set_attr "type" "ffarith<stype>")]
 )
 
+(define_insn "<optab><mode>3"
+  [(set (match_operand:GPI 0 "register_operand" "=r,r")
+        (MAXMIN_NOUMAX:GPI (match_operand:GPI 1 "register_operand" "r,r")
+		(match_operand:GPI 2 "aarch64_<su>minmax_operand" "r,U<su>m")))]
+  "TARGET_CSSC"
+  "@
+   <optab>\\t%<w>0, %<w>1, %<w>2
+   <optab>\\t%<w>0, %<w>1, %2"
+  [(set_attr "type" "alu_sreg,alu_imm")]
+)
+
 ;; Given that smax/smin do not specify the result when either input is NaN,
 ;; we could use either FMAXNM or FMAX for smax, and either FMINNM or FMIN
 ;; for smin.
diff --git a/gcc/config/aarch64/constraints.md b/gcc/config/aarch64/constraints.md
index ee7587cca1673208e2bfd6b503a21d0c8b69bf75..29efb6c0cff7574c9b239ef358acaca96dd75d03 100644
--- a/gcc/config/aarch64/constraints.md
+++ b/gcc/config/aarch64/constraints.md
@@ -152,6 +152,11 @@ (define_constraint "Usa"
        (match_test "aarch64_symbolic_address_p (op)")
        (match_test "aarch64_mov_operand_p (op, GET_MODE (op))")))
 
+(define_constraint "Usm"
+ "A constant that can be used with the S[MIN/MAX] CSSC instructions."
+ (and (match_code "const_int")
+      (match_test "aarch64_sminmax_immediate (op, VOIDmode)")))
+
 ;; const is needed here to support UNSPEC_SALT_ADDR.
 (define_constraint "Usw"
   "@internal
@@ -389,6 +394,11 @@ (define_constraint "Ufc"
   (and (match_code "const_double,const_vector")
        (match_test "aarch64_float_const_representable_p (op)")))
 
+(define_constraint "Uum"
+ "A constant that can be used with the U[MIN/MAX] CSSC instructions."
+ (and (match_code "const_int")
+      (match_test "aarch64_uminmax_immediate (op, VOIDmode)")))
+
 (define_constraint "Uvi"
   "A floating point constant which can be used with a\
    MOVI immediate operation."
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 5e6ff595c0e2a084ecf88ac9b60158d9983039f8..0d4ee4474726b66eab62e8637efddc0a5c266ba8 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -2186,6 +2186,9 @@ (define_code_iterator FLOATUORS [float unsigned_float])
 ;; Code iterator for variants of vector max and min.
 (define_code_iterator MAXMIN [smax smin umax umin])
 
+;; Code iterator for min/max ops but without UMAX.
+(define_code_iterator MAXMIN_NOUMAX [smax smin umin])
+
 (define_code_iterator FMAXMIN [smax smin])
 
 ;; Signed and unsigned max operations.
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index c308015ac2c13d24cd6bcec71247ec45df8cf5e6..6175875fbc6be6b0bff77873f55e5e6d44378d0c 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -148,6 +148,22 @@ (define_predicate "aarch64_pluslong_immediate"
   (and (match_code "const_int")
        (match_test "(INTVAL (op) < 0xffffff && INTVAL (op) > -0xffffff)")))
 
+(define_predicate "aarch64_sminmax_immediate"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), -128, 127)")))
+
+(define_predicate "aarch64_sminmax_operand"
+  (ior (match_operand 0 "register_operand")
+       (match_operand 0 "aarch64_sminmax_immediate")))
+
+(define_predicate "aarch64_uminmax_immediate"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 255)")))
+
+(define_predicate "aarch64_uminmax_operand"
+  (ior (match_operand 0 "register_operand")
+       (match_operand 0 "aarch64_uminmax_immediate")))
+
 (define_predicate "aarch64_pluslong_strict_immedate"
   (and (match_operand 0 "aarch64_pluslong_immediate")
        (not (match_operand 0 "aarch64_plus_immediate"))))
diff --git a/gcc/doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst b/gcc/doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst
index c2b23a6ee97ef2b7c74119f22c1d3e3d85385f4d..c91ff68984d7b38650afa57cd4edf511e41594c8 100644
--- a/gcc/doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst
+++ b/gcc/doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst
@@ -544,6 +544,9 @@ the following and their inverses no :samp:`{feature}` :
 :samp:`pauth`
   Enable the Pointer Authentication Extension.
 
+:samp:`cssc`
+  Enable the Common Short Sequence Compression feature.
+
 Feature ``crypto`` implies ``aes``, ``sha2``, and ``simd``,
 which implies ``fp``.
 Conversely, ``nofp`` implies ``nosimd``, which implies
diff --git a/gcc/testsuite/gcc.target/aarch64/cssc_1.c b/gcc/testsuite/gcc.target/aarch64/cssc_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..eecd00b25366dfbc760d6af8ffaf0a831fa07b54
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/cssc_1.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-additional-options "--save-temps -O1" } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+#include <stdint.h>
+
+#pragma GCC target "+cssc"
+
+/*
+** absw:
+**      abs	w0, w0
+**      ret
+*/
+
+int32_t
+absw (int32_t a)
+{
+  return __builtin_abs (a);
+}
+
+/*
+** absx:
+**      abs	x0, x0
+**      ret
+*/
+
+int64_t
+absx (int64_t a)
+{
+  return __builtin_labs (a);
+}
+
diff --git a/gcc/testsuite/gcc.target/aarch64/cssc_2.c b/gcc/testsuite/gcc.target/aarch64/cssc_2.c
new file mode 100644
index 0000000000000000000000000000000000000000..1637d29c6dc20c0eb55c5f3d05157b5b6c581fdc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/cssc_2.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-additional-options "--save-temps -O1" } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+#include <stdint.h>
+
+#pragma GCC target "+cssc"
+
+/*
+** cntw:
+**      cnt	w0, w0
+**      ret
+*/
+
+int32_t
+cntw (int32_t a)
+{
+  return __builtin_popcount (a);
+}
+
+/*
+** cntx:
+**      cnt	x0, x0
+**      ret
+*/
+
+int64_t
+cntx (int64_t a)
+{
+  return __builtin_popcountll (a);
+}
+
diff --git a/gcc/testsuite/gcc.target/aarch64/cssc_3.c b/gcc/testsuite/gcc.target/aarch64/cssc_3.c
new file mode 100644
index 0000000000000000000000000000000000000000..8965256c08e1e9a83c06a094e7ec533baae61619
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/cssc_3.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-additional-options "--save-temps -O1" } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+#include <stdint.h>
+
+#pragma GCC target "+cssc"
+
+/*
+** ctzw:
+**      ctz	w0, w0
+**      ret
+*/
+
+int32_t
+ctzw (int32_t a)
+{
+  return __builtin_ctz (a);
+}
+
+/*
+** ctzx:
+**      ctz	x0, x0
+**      ret
+*/
+
+int64_t
+ctzx (int64_t a)
+{
+  return __builtin_ctzll (a);
+}
+
diff --git a/gcc/testsuite/gcc.target/aarch64/cssc_4.c b/gcc/testsuite/gcc.target/aarch64/cssc_4.c
new file mode 100644
index 0000000000000000000000000000000000000000..34ccd0e7c3b03a214fa36a3d07955d787afb3ccd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/cssc_4.c
@@ -0,0 +1,107 @@
+/* { dg-do compile } */
+/* { dg-additional-options "--save-temps -O1" } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+#include <stdint.h>
+
+#pragma GCC target "+cssc"
+
+#define MIN(X, Y) ((X) > (Y) ? (Y) : (X))
+#define MAX(X, Y) ((X) > (Y) ? (X) : (Y))
+
+/*
+** uminw:
+**      umin	w0, w[01], w[01]
+**      ret
+*/
+
+uint32_t
+uminw (uint32_t a, uint32_t b)
+{
+  return MIN (a, b);
+}
+
+/*
+** uminx:
+**      umin	x0, x[01], x[01]
+**      ret
+*/
+
+uint64_t
+uminx (uint64_t a, uint64_t b)
+{
+  return MIN (a, b);
+}
+
+/*
+** sminw:
+**      smin	w0, w[01], w[01]
+**      ret
+*/
+
+int32_t
+sminw (int32_t a, int32_t b)
+{
+  return MIN (a, b);
+}
+
+/*
+** sminx:
+**      smin	x0, x[01], x[01]
+**      ret
+*/
+
+int64_t
+sminx (int64_t a, int64_t b)
+{
+  return MIN (a, b);
+}
+
+/*
+** umaxw:
+**      umax	w0, w[01], w[01]
+**      ret
+*/
+
+uint32_t
+umaxw (uint32_t a, uint32_t b)
+{
+  return MAX (a, b);
+}
+
+/*
+** umaxx:
+**      umax	x0, x[01], x[01]
+**      ret
+*/
+
+uint64_t
+umaxx (uint64_t a, uint64_t b)
+{
+  return MAX (a, b);
+}
+
+/*
+** smaxw:
+**      smax	w0, w[01], w[01]
+**      ret
+*/
+
+int32_t
+smaxw (int32_t a, int32_t b)
+{
+  return MAX (a, b);
+}
+
+/*
+** smaxx:
+**      smax	x0, x[01], x[01]
+**      ret
+*/
+
+int64_t
+smaxx (int64_t a, int64_t b)
+{
+  return MAX (a, b);
+}
+
diff --git a/gcc/testsuite/gcc.target/aarch64/cssc_5.c b/gcc/testsuite/gcc.target/aarch64/cssc_5.c
new file mode 100644
index 0000000000000000000000000000000000000000..51495195eaea745b62df00e47f590faaaba348ae
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/cssc_5.c
@@ -0,0 +1,154 @@
+/* { dg-do compile } */
+/* { dg-additional-options "--save-temps -O1" } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+#include <stdint.h>
+
+#pragma GCC target "+cssc"
+
+#define MIN(X, Y) ((X) > (Y) ? (Y) : (X))
+#define MAX(X, Y) ((X) > (Y) ? (X) : (Y))
+
+#define FUNC(T, OP, IMM)                \
+T                                       \
+T##_##OP##_##IMM (T a)                  \
+{                                       \
+  return OP (a, IMM);                   \
+}                                       \
+
+#define FUNCNEG(T, OP, IMM)             \
+T                                       \
+T##_##OP##_m##IMM (T a)                 \
+{                                       \
+  return OP (a, - (IMM));               \
+}                                       \
+
+/*
+** uint32_t_MIN_255:
+**      umin	w0, w0, 255
+**      ret
+*/
+
+FUNC (uint32_t, MIN, 255)
+
+/*
+** uint64_t_MIN_255:
+**      umin	x0, x0, 255
+**      ret
+*/
+
+FUNC (uint64_t, MIN, 255)
+
+/*
+** uint32_t_MAX_255:
+**      umax	w0, w0, 255
+**      ret
+*/
+
+FUNC (uint32_t, MAX, 255)
+
+
+/*
+** uint64_t_MAX_255:
+**      umax	x0, x0, 255
+**      ret
+*/
+
+FUNC (uint64_t, MAX, 255)
+
+/*
+** int32_t_MIN_m128:
+**      smin	w0, w0, -128
+**      ret
+*/
+
+FUNCNEG (int32_t, MIN, 128)
+
+/*
+** int32_t_MIN_127:
+**      smin	w0, w0, 127
+**      ret
+*/
+
+FUNC (int32_t, MIN, 127)
+
+/*
+** int64_t_MIN_m128:
+**      smin	x0, x0, -128
+**      ret
+*/
+
+FUNCNEG (int64_t, MIN, 128)
+
+/*
+** int64_t_MIN_127:
+**      smin	x0, x0, 127
+**      ret
+*/
+
+FUNC (int64_t, MIN, 127)
+
+/*
+** int32_t_MAX_m128:
+**      smax	w0, w0, -128
+**      ret
+*/
+
+FUNCNEG (int32_t, MAX, 128)
+
+/*
+** int32_t_MAX_127:
+**      smax	w0, w0, 127
+**      ret
+*/
+
+FUNC (int32_t, MAX, 127)
+
+/*
+** int64_t_MAX_m128:
+**      smax	x0, x0, -128
+**      ret
+*/
+
+FUNCNEG (int64_t, MAX, 128)
+
+/*
+** int64_t_MAX_127:
+**      smax	x0, x0, 127
+**      ret
+*/
+
+FUNC (int64_t, MAX, 127)
+
+/*
+** int32_t_MIN_0:
+**      smin	w0, w0, 0
+**      ret
+*/
+
+FUNC (int32_t, MIN, 0)
+
+/*
+** int64_t_MIN_0:
+**      smin	x0, x0, 0
+**      ret
+*/
+
+FUNC (int64_t, MIN, 0)
+
+/*
+** int32_t_MAX_0:
+**      smax	w0, w0, 0
+**      ret
+*/
+
+FUNC (int32_t, MAX, 0)
+
+/*
+** int64_t_MAX_0:
+**      smax	x0, x0, 0
+**      ret
+*/
+
+FUNC (int64_t, MAX, 0)
+


* Re: [PATCH] aarch64: Add support for +cssc
  2022-11-11 10:25 [PATCH] aarch64: Add support for +cssc Kyrylo Tkachov
@ 2022-11-11 20:10 ` Andrew Pinski
  0 siblings, 0 replies; 2+ messages in thread
From: Andrew Pinski @ 2022-11-11 20:10 UTC (permalink / raw)
  To: Kyrylo Tkachov; +Cc: gcc-patches

On Fri, Nov 11, 2022 at 2:26 AM Kyrylo Tkachov via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi all,
>
> This patch adds codegen for FEAT_CSSC from the 2022 Architecture extensions.
> It fits various existing optabs in GCC quite well.
> There are instructions for scalar signed/unsigned min/max, abs, ctz, popcount.
> We have expanders for these already, so they are wired up to emit single-insn
> patterns for the new TARGET_CSSC.
>
> These instructions are enabled by the +cssc command-line extension.
> Bootstrapped and tested on aarch64-none-linux-gnu.
>
> I'll push it once the Binutils patch from Andre for this gets committed



@@ -4976,8 +5020,14 @@ (define_expand "ffs<mode>2"
 (define_expand "popcount<mode>2"
   [(match_operand:GPI 0 "register_operand")
    (match_operand:GPI 1 "register_operand")]
-  "TARGET_SIMD"
+  "TARGET_CSSC || TARGET_SIMD"
 {
+  if (TARGET_CSSC)
+    {
+      emit_insn (gen_aarch64_popcount<mode>2_insn (operands[0], operands[1]));
+      DONE;
+    }
+
   rtx v = gen_reg_rtx (V8QImode);
   rtx v1 = gen_reg_rtx (V8QImode);
   rtx in = operands[1];

I think the easy way is to do this instead:
 (define_expand "popcount<mode>2"
   [(set (match_operand:GPI 0 "register_operand")
     (popcount:GPI  (match_operand:GPI 1 "register_operand")))]
  "TARGET_CSSC || TARGET_SIMD"
{
  if (!TARGET_CSSC)
    {
// Current code
     DONE;
    }
}

And then you don't need to name the aarch64_popcount pattern, or you can
prefix its name with a *.
Yes, it does mess up the diff, but the end result seems cleaner.
I suspect all of the expanders you are changing should be done in a
similar way too.

Thanks,
Andrew Pinski

>
> Thanks,
> Kyrill
>
> gcc/ChangeLog:
>
>         * config/aarch64/aarch64-option-extensions.def (cssc): Define.
>         * config/aarch64/aarch64.h (AARCH64_ISA_CSSC): Define.
>         (TARGET_CSSC): Likewise.
>         * config/aarch64/aarch64.md (aarch64_abs<mode>2_insn): New define_insn.
>         (abs<mode>2): Adjust for the above.
>         (aarch64_umax<mode>3_insn): New define_insn.
>         (umax<mode>3): Adjust for the above.
>         (aarch64_popcount<mode>2_insn): New define_insn.
>         (popcount<mode>2): Adjust for the above.
>         (<optab><mode>3): New define_insn.
>         * config/aarch64/constraints.md (Usm): Define.
>         (Uum): Likewise.
>         * doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst:
>         Document +cssc.
>         * config/aarch64/iterators.md (MAXMIN_NOUMAX): New code iterator.
>         * config/aarch64/predicates.md (aarch64_sminmax_immediate): Define.
>         (aarch64_sminmax_operand): Likewise.
>         (aarch64_uminmax_immediate): Likewise.
>         (aarch64_uminmax_operand): Likewise.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/aarch64/cssc_1.c: New test.
>         * gcc.target/aarch64/cssc_2.c: New test.
>         * gcc.target/aarch64/cssc_3.c: New test.
>         * gcc.target/aarch64/cssc_4.c: New test.
>         * gcc.target/aarch64/cssc_5.c: New test.
