* [PATCH] aarch64: Add support for +cssc
@ 2022-11-11 10:25 Kyrylo Tkachov
2022-11-11 20:10 ` Andrew Pinski
From: Kyrylo Tkachov @ 2022-11-11 10:25 UTC (permalink / raw)
To: gcc-patches
[-- Attachment #1: Type: text/plain, Size: 1717 bytes --]
Hi all,
This patch adds codegen for FEAT_CSSC from the 2022 Architecture extensions.
It fits various existing optabs in GCC quite well.
There are instructions for scalar signed/unsigned min/max, abs, ctz, popcount.
We have expanders for these already, so they are wired up to emit single-insn
patterns for the new TARGET_CSSC.
These instructions are enabled by the +cssc command-line extension.
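As a rough illustration (not part of the patch, and with made-up function names), this is the
kind of scalar code that should now compile down to single CSSC instructions when the
extension is enabled, for example through the target pragma used in the attached tests:

  #include <stdint.h>

  #pragma GCC target "+cssc"

  /* Each body below is expected to become a single CSSC instruction.  */
  int32_t  ex_abs  (int32_t a)              { return __builtin_abs (a); }      /* ABS  */
  int32_t  ex_cnt  (int32_t a)              { return __builtin_popcount (a); } /* CNT  */
  int32_t  ex_ctz  (int32_t a)              { return __builtin_ctz (a); }      /* CTZ  */
  int32_t  ex_smin (int32_t a, int32_t b)   { return a < b ? a : b; }          /* SMIN */
  uint32_t ex_umax (uint32_t a, uint32_t b) { return a > b ? a : b; }          /* UMAX */

The attached tests (cssc_1.c through cssc_5.c) check these patterns, including the immediate
forms of the min/max instructions.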
Bootstrapped and tested on aarch64-none-linux-gnu.
I'll push it once the Binutils patch from Andre for this feature is committed.
Thanks,
Kyrill
gcc/ChangeLog:
* config/aarch64/aarch64-option-extensions.def (cssc): Define.
* config/aarch64/aarch64.h (AARCH64_ISA_CSSC): Define.
(TARGET_CSSC): Likewise.
* config/aarch64/aarch64.md (aarch64_abs<mode>2_insn): New define_insn.
(abs<mode>2): Adjust for the above.
(aarch64_umax<mode>3_insn): New define_insn.
(umax<mode>3): Adjust for the above.
(aarch64_popcount<mode>2_insn): New define_insn.
(popcount<mode>2): Adjust for the above.
(<optab><mode>3): New define_insn.
* config/aarch64/constraints.md (Usm): Define.
(Uum): Likewise.
* doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst:
Document +cssc.
* config/aarch64/iterators.md (MAXMIN_NOUMAX): New code iterator.
* config/aarch64/predicates.md (aarch64_sminmax_immediate): Define.
(aarch64_sminmax_operand): Likewise.
(aarch64_uminmax_immediate): Likewise.
(aarch64_uminmax_operand): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/cssc_1.c: New test.
* gcc.target/aarch64/cssc_2.c: New test.
* gcc.target/aarch64/cssc_3.c: New test.
* gcc.target/aarch64/cssc_4.c: New test.
* gcc.target/aarch64/cssc_5.c: New test.
[-- Attachment #2: cssc.patch --]
[-- Type: application/octet-stream, Size: 16881 bytes --]
diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/config/aarch64/aarch64-option-extensions.def
index bdf4baf309c02a08f74eec7b9cb77bc1f3247de3..8d1a703fb8c9379c5eca82d69b8058d436932fac 100644
--- a/gcc/config/aarch64/aarch64-option-extensions.def
+++ b/gcc/config/aarch64/aarch64-option-extensions.def
@@ -149,4 +149,6 @@ AARCH64_OPT_EXTENSION("ls64", LS64, (), (), (), "")
AARCH64_OPT_EXTENSION("mops", MOPS, (), (), (), "")
+AARCH64_OPT_EXTENSION("cssc", CSSC, (), (), (), "")
+
#undef AARCH64_OPT_EXTENSION
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index e60f9bce023b2cd5e7233ee9b8c61fc93c1494c2..bbbb7e4213de40a17e1e371b2fefdc2083a38f17 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -221,6 +221,7 @@ enum class aarch64_feature : unsigned char {
#define AARCH64_ISA_V9_3A (aarch64_isa_flags & AARCH64_FL_V9_3A)
#define AARCH64_ISA_MOPS (aarch64_isa_flags & AARCH64_FL_MOPS)
#define AARCH64_ISA_LS64 (aarch64_isa_flags & AARCH64_FL_LS64)
+#define AARCH64_ISA_CSSC (aarch64_isa_flags & AARCH64_FL_CSSC)
/* Crypto is an optional extension to AdvSIMD. */
#define TARGET_CRYPTO (AARCH64_ISA_CRYPTO)
@@ -316,6 +317,9 @@ enum class aarch64_feature : unsigned char {
/* LS64 instructions are enabled through +ls64. */
#define TARGET_LS64 (AARCH64_ISA_LS64)
+/* CSSC instructions are enabled through +cssc. */
+#define TARGET_CSSC (AARCH64_ISA_CSSC)
+
/* Make sure this is always defined so we don't have to check for ifdefs
but rather use normal ifs. */
#ifndef TARGET_FIX_ERR_A53_835769_DEFAULT
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 8a5843cacbdd93aeb74809c96a458a0f53496ea6..ddf23561736ea4ab378a6b2ebe7cb4049d5e21ab 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -3580,11 +3580,24 @@ (define_insn "*sub_uxtsi_shift2_uxtw"
[(set_attr "type" "alu_ext")]
)
+(define_insn "aarch64_abs<mode>2_insn"
+ [(set (match_operand:GPI 0 "register_operand" "=r")
+ (abs:GPI (match_operand:GPI 1 "register_operand" "r")))]
+ "TARGET_CSSC"
+ "abs\\t%<w>0, %<w>1"
+ [(set_attr "type" "alu_sreg")]
+)
+
(define_expand "abs<mode>2"
[(match_operand:GPI 0 "register_operand")
(match_operand:GPI 1 "register_operand")]
""
{
+ if (TARGET_CSSC)
+ {
+ emit_insn (gen_aarch64_abs<mode>2_insn (operands[0], operands[1]));
+ DONE;
+ }
rtx ccreg = aarch64_gen_compare_reg (LT, operands[1], const0_rtx);
rtx x = gen_rtx_LT (VOIDmode, ccreg, const0_rtx);
emit_insn (gen_csneg3<mode>_insn (operands[0], x, operands[1], operands[1]));
@@ -4382,6 +4395,17 @@ (define_insn "*csinv3_uxtw_insn3"
[(set_attr "type" "csel")]
)
+(define_insn "aarch64_umax<mode>3_insn"
+ [(set (match_operand:GPI 0 "register_operand" "=r,r")
+ (umax:GPI (match_operand:GPI 1 "register_operand" "r,r")
+ (match_operand:GPI 2 "aarch64_uminmax_operand" "r,Uum")))]
+ "TARGET_CSSC"
+ "@
+ umax\\t%<w>0, %<w>1, %<w>2
+ umax\\t%<w>0, %<w>1, %2"
+ [(set_attr "type" "alu_sreg,alu_imm")]
+)
+
;; If X can be loaded by a single CNT[BHWD] instruction,
;;
;; A = UMAX (B, X)
@@ -4412,11 +4436,23 @@ (define_expand "umax<mode>3"
[(set (match_operand:GPI 0 "register_operand")
(umax:GPI (match_operand:GPI 1 "")
(match_operand:GPI 2 "")))]
- "TARGET_SVE"
+ "TARGET_SVE || TARGET_CSSC"
{
if (aarch64_sve_cnt_immediate (operands[1], <MODE>mode))
std::swap (operands[1], operands[2]);
- else if (!aarch64_sve_cnt_immediate (operands[2], <MODE>mode))
+ else if (!aarch64_sve_cnt_immediate (operands[2], <MODE>mode)
+ && TARGET_CSSC)
+ {
+ if (aarch64_uminmax_immediate (operands[1], <MODE>mode))
+ std::swap (operands[1], operands[2]);
+ operands[1] = force_reg (<MODE>mode, operands[1]);
+ if (!aarch64_uminmax_operand (operands[2], <MODE>mode))
+ operands[2] = force_reg (<MODE>mode, operands[2]);
+ emit_insn (gen_aarch64_umax<mode>3_insn (operands[0], operands[1],
+ operands[2]));
+ DONE;
+ }
+ else
FAIL;
rtx temp = gen_reg_rtx (<MODE>mode);
operands[1] = force_reg (<MODE>mode, operands[1]);
@@ -4966,8 +5002,16 @@ (define_expand "ffs<mode>2"
}
)
-;; Pop count be done via the "CNT" instruction in AdvSIMD.
-;;
+(define_insn "aarch64_popcount<mode>2_insn"
+ [(set (match_operand:GPI 0 "register_operand" "=r")
+ (popcount:GPI (match_operand:GPI 1 "register_operand" "r")))]
+ "TARGET_CSSC"
+ "cnt\\t%<w>0, %<w>1"
+ [(set_attr "type" "clz")]
+)
+;; The CSSC instructions can do popcount in the GP registers directly through
+;; CNT. If it is not available then we can use CNT on the Advanced SIMD side
+;; through:
;; MOV v.1d, x0
;; CNT v1.8b, v.8b
;; ADDV b2, v1.8b
@@ -4976,8 +5020,14 @@ (define_expand "ffs<mode>2"
(define_expand "popcount<mode>2"
[(match_operand:GPI 0 "register_operand")
(match_operand:GPI 1 "register_operand")]
- "TARGET_SIMD"
+ "TARGET_CSSC || TARGET_SIMD"
{
+ if (TARGET_CSSC)
+ {
+ emit_insn (gen_aarch64_popcount<mode>2_insn (operands[0], operands[1]));
+ DONE;
+ }
+
rtx v = gen_reg_rtx (V8QImode);
rtx v1 = gen_reg_rtx (V8QImode);
rtx in = operands[1];
@@ -5016,14 +5066,14 @@ (define_insn "@aarch64_rbit<mode>"
;; Split after reload into RBIT + CLZ. Since RBIT is represented as an UNSPEC
;; it is unlikely to fold with any other operation, so keep this as a CTZ
;; expression and split after reload to enable scheduling them apart if
-;; needed.
+;; needed. For TARGET_CSSC we have a single CTZ instruction that can do this.
(define_insn_and_split "ctz<mode>2"
[(set (match_operand:GPI 0 "register_operand" "=r")
(ctz:GPI (match_operand:GPI 1 "register_operand" "r")))]
""
- "#"
- "reload_completed"
+ { return TARGET_CSSC ? "ctz\\t%<w>0, %<w>1" : "#"; }
+ "reload_completed && !TARGET_CSSC"
[(const_int 0)]
"
emit_insn (gen_aarch64_rbit (<MODE>mode, operands[0], operands[1]));
@@ -6662,6 +6712,17 @@ (define_insn "abs<mode>2"
[(set_attr "type" "ffarith<stype>")]
)
+(define_insn "<optab><mode>3"
+ [(set (match_operand:GPI 0 "register_operand" "=r,r")
+ (MAXMIN_NOUMAX:GPI (match_operand:GPI 1 "register_operand" "r,r")
+ (match_operand:GPI 2 "aarch64_<su>minmax_operand" "r,U<su>m")))]
+ "TARGET_CSSC"
+ "@
+ <optab>\\t%<w>0, %<w>1, %<w>2
+ <optab>\\t%<w>0, %<w>1, %2"
+ [(set_attr "type" "alu_sreg,alu_imm")]
+)
+
;; Given that smax/smin do not specify the result when either input is NaN,
;; we could use either FMAXNM or FMAX for smax, and either FMINNM or FMIN
;; for smin.
diff --git a/gcc/config/aarch64/constraints.md b/gcc/config/aarch64/constraints.md
index ee7587cca1673208e2bfd6b503a21d0c8b69bf75..29efb6c0cff7574c9b239ef358acaca96dd75d03 100644
--- a/gcc/config/aarch64/constraints.md
+++ b/gcc/config/aarch64/constraints.md
@@ -152,6 +152,11 @@ (define_constraint "Usa"
(match_test "aarch64_symbolic_address_p (op)")
(match_test "aarch64_mov_operand_p (op, GET_MODE (op))")))
+(define_constraint "Usm"
+ "A constant that can be used with the S[MIN/MAX] CSSC instructions."
+ (and (match_code "const_int")
+ (match_test "aarch64_sminmax_immediate (op, VOIDmode)")))
+
;; const is needed here to support UNSPEC_SALT_ADDR.
(define_constraint "Usw"
"@internal
@@ -389,6 +394,11 @@ (define_constraint "Ufc"
(and (match_code "const_double,const_vector")
(match_test "aarch64_float_const_representable_p (op)")))
+(define_constraint "Uum"
+ "A constant that can be used with the U[MIN/MAX] CSSC instructions."
+ (and (match_code "const_int")
+ (match_test "aarch64_uminmax_immediate (op, VOIDmode)")))
+
(define_constraint "Uvi"
"A floating point constant which can be used with a\
MOVI immediate operation."
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 5e6ff595c0e2a084ecf88ac9b60158d9983039f8..0d4ee4474726b66eab62e8637efddc0a5c266ba8 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -2186,6 +2186,9 @@ (define_code_iterator FLOATUORS [float unsigned_float])
;; Code iterator for variants of vector max and min.
(define_code_iterator MAXMIN [smax smin umax umin])
+;; Code iterator for min/max ops but without UMAX.
+(define_code_iterator MAXMIN_NOUMAX [smax smin umin])
+
(define_code_iterator FMAXMIN [smax smin])
;; Signed and unsigned max operations.
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index c308015ac2c13d24cd6bcec71247ec45df8cf5e6..6175875fbc6be6b0bff77873f55e5e6d44378d0c 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -148,6 +148,22 @@ (define_predicate "aarch64_pluslong_immediate"
(and (match_code "const_int")
(match_test "(INTVAL (op) < 0xffffff && INTVAL (op) > -0xffffff)")))
+(define_predicate "aarch64_sminmax_immediate"
+ (and (match_code "const_int")
+ (match_test "IN_RANGE (INTVAL (op), -128, 127)")))
+
+(define_predicate "aarch64_sminmax_operand"
+ (ior (match_operand 0 "register_operand")
+ (match_operand 0 "aarch64_sminmax_immediate")))
+
+(define_predicate "aarch64_uminmax_immediate"
+ (and (match_code "const_int")
+ (match_test "IN_RANGE (INTVAL (op), 0, 255)")))
+
+(define_predicate "aarch64_uminmax_operand"
+ (ior (match_operand 0 "register_operand")
+ (match_operand 0 "aarch64_uminmax_immediate")))
+
(define_predicate "aarch64_pluslong_strict_immedate"
(and (match_operand 0 "aarch64_pluslong_immediate")
(not (match_operand 0 "aarch64_plus_immediate"))))
diff --git a/gcc/doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst b/gcc/doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst
index c2b23a6ee97ef2b7c74119f22c1d3e3d85385f4d..c91ff68984d7b38650afa57cd4edf511e41594c8 100644
--- a/gcc/doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst
+++ b/gcc/doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst
@@ -544,6 +544,9 @@ the following and their inverses no :samp:`{feature}` :
:samp:`pauth`
Enable the Pointer Authentication Extension.
+:samp:`cssc`
+ Enable the Common Short Sequence Compression feature.
+
Feature ``crypto`` implies ``aes``, ``sha2``, and ``simd``,
which implies ``fp``.
Conversely, ``nofp`` implies ``nosimd``, which implies
diff --git a/gcc/testsuite/gcc.target/aarch64/cssc_1.c b/gcc/testsuite/gcc.target/aarch64/cssc_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..eecd00b25366dfbc760d6af8ffaf0a831fa07b54
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/cssc_1.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-additional-options "--save-temps -O1" } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+#include <stdint.h>
+
+#pragma GCC target "+cssc"
+
+/*
+** absw:
+** abs w0, w0
+** ret
+*/
+
+int32_t
+absw (int32_t a)
+{
+ return __builtin_abs (a);
+}
+
+/*
+** absx:
+** abs x0, x0
+** ret
+*/
+
+int64_t
+absx (int64_t a)
+{
+ return __builtin_labs (a);
+}
+
diff --git a/gcc/testsuite/gcc.target/aarch64/cssc_2.c b/gcc/testsuite/gcc.target/aarch64/cssc_2.c
new file mode 100644
index 0000000000000000000000000000000000000000..1637d29c6dc20c0eb55c5f3d05157b5b6c581fdc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/cssc_2.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-additional-options "--save-temps -O1" } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+#include <stdint.h>
+
+#pragma GCC target "+cssc"
+
+/*
+** cntw:
+** cnt w0, w0
+** ret
+*/
+
+int32_t
+cntw (int32_t a)
+{
+ return __builtin_popcount (a);
+}
+
+/*
+** cntx:
+** cnt x0, x0
+** ret
+*/
+
+int64_t
+cntx (int64_t a)
+{
+ return __builtin_popcountll (a);
+}
+
diff --git a/gcc/testsuite/gcc.target/aarch64/cssc_3.c b/gcc/testsuite/gcc.target/aarch64/cssc_3.c
new file mode 100644
index 0000000000000000000000000000000000000000..8965256c08e1e9a83c06a094e7ec533baae61619
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/cssc_3.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-additional-options "--save-temps -O1" } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+#include <stdint.h>
+
+#pragma GCC target "+cssc"
+
+/*
+** ctzw:
+** ctz w0, w0
+** ret
+*/
+
+int32_t
+ctzw (int32_t a)
+{
+ return __builtin_ctz (a);
+}
+
+/*
+** ctzx:
+** ctz x0, x0
+** ret
+*/
+
+int64_t
+ctzx (int64_t a)
+{
+ return __builtin_ctzll (a);
+}
+
diff --git a/gcc/testsuite/gcc.target/aarch64/cssc_4.c b/gcc/testsuite/gcc.target/aarch64/cssc_4.c
new file mode 100644
index 0000000000000000000000000000000000000000..34ccd0e7c3b03a214fa36a3d07955d787afb3ccd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/cssc_4.c
@@ -0,0 +1,107 @@
+/* { dg-do compile } */
+/* { dg-additional-options "--save-temps -O1" } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+#include <stdint.h>
+
+#pragma GCC target "+cssc"
+
+#define MIN(X, Y) ((X) > (Y) ? (Y) : (X))
+#define MAX(X, Y) ((X) > (Y) ? (X) : (Y))
+
+/*
+** uminw:
+** umin w0, w[01], w[01]
+** ret
+*/
+
+uint32_t
+uminw (uint32_t a, uint32_t b)
+{
+ return MIN (a, b);
+}
+
+/*
+** uminx:
+** umin x0, x[01], x[01]
+** ret
+*/
+
+uint64_t
+uminx (uint64_t a, uint64_t b)
+{
+ return MIN (a, b);
+}
+
+/*
+** sminw:
+** smin w0, w[01], w[01]
+** ret
+*/
+
+int32_t
+sminw (int32_t a, int32_t b)
+{
+ return MIN (a, b);
+}
+
+/*
+** sminx:
+** smin x0, x[01], x[01]
+** ret
+*/
+
+int64_t
+sminx (int64_t a, int64_t b)
+{
+ return MIN (a, b);
+}
+
+/*
+** umaxw:
+** umax w0, w[01], w[01]
+** ret
+*/
+
+uint32_t
+umaxw (uint32_t a, uint32_t b)
+{
+ return MAX (a, b);
+}
+
+/*
+** umaxx:
+** umax x0, x[01], x[01]
+** ret
+*/
+
+uint64_t
+umaxx (uint64_t a, uint64_t b)
+{
+ return MAX (a, b);
+}
+
+/*
+** smaxw:
+** smax w0, w[01], w[01]
+** ret
+*/
+
+int32_t
+smaxw (int32_t a, int32_t b)
+{
+ return MAX (a, b);
+}
+
+/*
+** smaxx:
+** smax x0, x[01], x[01]
+** ret
+*/
+
+int64_t
+smaxx (int64_t a, int64_t b)
+{
+ return MAX (a, b);
+}
+
diff --git a/gcc/testsuite/gcc.target/aarch64/cssc_5.c b/gcc/testsuite/gcc.target/aarch64/cssc_5.c
new file mode 100644
index 0000000000000000000000000000000000000000..51495195eaea745b62df00e47f590faaaba348ae
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/cssc_5.c
@@ -0,0 +1,154 @@
+/* { dg-do compile } */
+/* { dg-additional-options "--save-temps -O1" } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+#include <stdint.h>
+
+#pragma GCC target "+cssc"
+
+#define MIN(X, Y) ((X) > (Y) ? (Y) : (X))
+#define MAX(X, Y) ((X) > (Y) ? (X) : (Y))
+
+#define FUNC(T, OP, IMM) \
+T \
+T##_##OP##_##IMM (T a) \
+{ \
+ return OP (a, IMM); \
+} \
+
+#define FUNCNEG(T, OP, IMM) \
+T \
+T##_##OP##_m##IMM (T a) \
+{ \
+ return OP (a, - (IMM)); \
+} \
+
+/*
+** uint32_t_MIN_255:
+** umin w0, w0, 255
+** ret
+*/
+
+FUNC (uint32_t, MIN, 255)
+
+/*
+** uint64_t_MIN_255:
+** umin x0, x0, 255
+** ret
+*/
+
+FUNC (uint64_t, MIN, 255)
+
+/*
+** uint32_t_MAX_255:
+** umax w0, w0, 255
+** ret
+*/
+
+FUNC (uint32_t, MAX, 255)
+
+
+/*
+** uint64_t_MAX_255:
+** umax x0, x0, 255
+** ret
+*/
+
+FUNC (uint64_t, MAX, 255)
+
+/*
+** int32_t_MIN_m128:
+** smin w0, w0, -128
+** ret
+*/
+
+FUNCNEG (int32_t, MIN, 128)
+
+/*
+** int32_t_MIN_127:
+** smin w0, w0, 127
+** ret
+*/
+
+FUNC (int32_t, MIN, 127)
+
+/*
+** int64_t_MIN_m128:
+** smin x0, x0, -128
+** ret
+*/
+
+FUNCNEG (int64_t, MIN, 128)
+
+/*
+** int64_t_MIN_127:
+** smin x0, x0, 127
+** ret
+*/
+
+FUNC (int64_t, MIN, 127)
+
+/*
+** int32_t_MAX_m128:
+** smax w0, w0, -128
+** ret
+*/
+
+FUNCNEG (int32_t, MAX, 128)
+
+/*
+** int32_t_MAX_127:
+** smax w0, w0, 127
+** ret
+*/
+
+FUNC (int32_t, MAX, 127)
+
+/*
+** int64_t_MAX_m128:
+** smax x0, x0, -128
+** ret
+*/
+
+FUNCNEG (int64_t, MAX, 128)
+
+/*
+** int64_t_MAX_127:
+** smax x0, x0, 127
+** ret
+*/
+
+FUNC (int64_t, MAX, 127)
+
+/*
+** int32_t_MIN_0:
+** smin w0, w0, 0
+** ret
+*/
+
+FUNC (int32_t, MIN, 0)
+
+/*
+** int64_t_MIN_0:
+** smin x0, x0, 0
+** ret
+*/
+
+FUNC (int64_t, MIN, 0)
+
+/*
+** int32_t_MAX_0:
+** smax w0, w0, 0
+** ret
+*/
+
+FUNC (int32_t, MAX, 0)
+
+/*
+** int64_t_MAX_0:
+** smax x0, x0, 0
+** ret
+*/
+
+FUNC (int64_t, MAX, 0)
+
* Re: [PATCH] aarch64: Add support for +cssc
2022-11-11 10:25 [PATCH] aarch64: Add support for +cssc Kyrylo Tkachov
@ 2022-11-11 20:10 ` Andrew Pinski
From: Andrew Pinski @ 2022-11-11 20:10 UTC (permalink / raw)
To: Kyrylo Tkachov; +Cc: gcc-patches
On Fri, Nov 11, 2022 at 2:26 AM Kyrylo Tkachov via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi all,
>
> This patch adds codegen for FEAT_CSSC from the 2022 Architecture extensions.
> It fits various existing optabs in GCC quite well.
> There are instructions for scalar signed/unsigned min/max, abs, ctz, popcount.
> We have expanders for these already, so they are wired up to emit single-insn
> patterns for the new TARGET_CSSC.
>
> These instructions are enabled by the +cssc command-line extension.
> Bootstrapped and tested on aarch64-none-linux-gnu.
>
> I'll push it once the Binutils patch from Andre for this feature is committed.
@@ -4976,8 +5020,14 @@ (define_expand "ffs<mode>2"
(define_expand "popcount<mode>2"
[(match_operand:GPI 0 "register_operand")
(match_operand:GPI 1 "register_operand")]
- "TARGET_SIMD"
+ "TARGET_CSSC || TARGET_SIMD"
{
+ if (TARGET_CSSC)
+ {
+ emit_insn (gen_aarch64_popcount<mode>2_insn (operands[0], operands[1]));
+ DONE;
+ }
+
rtx v = gen_reg_rtx (V8QImode);
rtx v1 = gen_reg_rtx (V8QImode);
rtx in = operands[1];
I think the easy way is to do this instead:
(define_expand "popcount<mode>2"
[(set (match_operand:GPI 0 "register_operand")
(popcount:GPI (match_operand:GPI 1 "register_operand")))]
"TARGET_CSSC || TARGET_SIMD"
{
if (!TARGET_CSSC)
{
// Current code
DONE;
}
}
And then you don't need to name the aarch64_popcount pattern (or you can just give it a "*" prefix).
Yes, it does mess up the diff, but the end result seems cleaner.
I suspect all of the expanders you are changing could be handled in this
same way too.
Thanks,
Andrew Pinski