public inbox for gcc-patches@gcc.gnu.org
* [PATCH][committed] aarch64: Improve representation of vpaddd intrinsics
@ 2023-06-06 10:06 Kyrylo Tkachov
From: Kyrylo Tkachov @ 2023-06-06 10:06 UTC
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1079 bytes --]

Hi all,

The aarch64_addpdi pattern is redundant: the reduc_plus_scal_<mode> pattern can already generate
the required form of the ADDP instruction, and its builtins are mostly folded to GIMPLE early on, so they can benefit from more optimisations.
It turns out, though, that we were missing the folding for the unsigned variants.
This patch adds that folding and wires up the vpaddd_u64 and vpaddd_s64 intrinsics through the reduc_plus_scal_<mode> pattern instead,
so that we can remove the redundant pattern and get more optimisation earlier.
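
For readers less familiar with these intrinsics, here is a minimal sketch of what vpaddd_s64 and vpaddd_u64 compute: the sum of the two 64-bit lanes of a vector, which is why a plain add reduction can represent them. The helper names sum_lanes_s64, sum_lanes_u64 and sum_lanes_selfcheck are hypothetical and not part of the patch. The non-AArch64 branch is only a scalar model, included so the sketch builds elsewhere, and it assumes the usual two's-complement wrap-around.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>

/* Load two scalars into a 128-bit vector and reduce with the intrinsic. */
static inline int64_t sum_lanes_s64(int64_t lo, int64_t hi)
{
    const int64_t lanes[2] = { lo, hi };
    return vpaddd_s64(vld1q_s64(lanes));
}

static inline uint64_t sum_lanes_u64(uint64_t lo, uint64_t hi)
{
    const uint64_t lanes[2] = { lo, hi };
    return vpaddd_u64(vld1q_u64(lanes));
}
#else
/* Scalar model (assumption): lane 0 + lane 1 with wrap-around. */
static inline int64_t sum_lanes_s64(int64_t lo, int64_t hi)
{
    return (int64_t)((uint64_t)lo + (uint64_t)hi);
}

static inline uint64_t sum_lanes_u64(uint64_t lo, uint64_t hi)
{
    return lo + hi;
}
#endif

/* Hypothetical self-check: signed and unsigned variants both reduce
   to a single addition of the two lanes.  */
static inline void sum_lanes_selfcheck(void)
{
    assert(sum_lanes_s64(40, 2) == 42);
    assert(sum_lanes_s64(-5, 3) == -2);
    assert(sum_lanes_u64(UINT64_MAX, 2) == 1); /* wraps modulo 2^64 */
}
```

On AArch64 at -O2 one would expect each helper to end in a single scalar-form ADDP (or to be constant-folded when the inputs are known), but the exact code generated depends on the compiler version and is not something this sketch guarantees.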

Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.
Pushing to trunk.
Thanks,
Kyrill

gcc/ChangeLog:

	* config/aarch64/aarch64-builtins.cc (aarch64_general_gimple_fold_builtin):
	Handle unsigned reduc_plus_scal_ builtins.
	* config/aarch64/aarch64-simd-builtins.def (addp): Delete DImode instances.
	* config/aarch64/aarch64-simd.md (aarch64_addpdi): Delete.
	* config/aarch64/arm_neon.h (vpaddd_s64): Reimplement with
	__builtin_aarch64_reduc_plus_scal_v2di.
	(vpaddd_u64): Reimplement with __builtin_aarch64_reduc_plus_scal_v2di_uu.

[-- Attachment #2: vpaddd.patch --]
[-- Type: application/octet-stream, Size: 2700 bytes --]

diff --git a/gcc/config/aarch64/aarch64-builtins.cc b/gcc/config/aarch64/aarch64-builtins.cc
index e0bb2128e02961435600bdfda2aac1bd5b329d85..50c20c8fa8696abac173ad5e28b749dc5fdaaf8c 100644
--- a/gcc/config/aarch64/aarch64-builtins.cc
+++ b/gcc/config/aarch64/aarch64-builtins.cc
@@ -3049,6 +3049,7 @@ aarch64_general_gimple_fold_builtin (unsigned int fcode, gcall *stmt,
   switch (fcode)
     {
       BUILTIN_VALL (UNOP, reduc_plus_scal_, 10, ALL)
+      BUILTIN_VDQ_I (UNOPU, reduc_plus_scal_, 10, NONE)
 	new_stmt = gimple_build_call_internal (IFN_REDUC_PLUS,
 					       1, args[0]);
 	gimple_call_set_lhs (new_stmt, gimple_call_lhs (stmt));
diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index 1beaa08c1e7c94bc13a64865ddb677345534699c..94ff3f1852f2849a644a57813257bb59dfd9581e 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -53,8 +53,6 @@
   BUILTIN_VHSDF_DF (UNOP, sqrt, 2, FP)
   BUILTIN_VDQ_I (BINOP, addp, 0, NONE)
   BUILTIN_VDQ_I (BINOPU, addp, 0, NONE)
-  VAR1 (UNOP, addp, 0, NONE, di)
-  VAR1 (UNOPU, addp, 0, NONE, di)
   BUILTIN_VDQ_BHSI (UNOP, clrsb, 2, NONE)
   BUILTIN_VDQ_BHSI (UNOP, clz, 2, NONE)
   BUILTIN_VS (UNOP, ctz, 2, NONE)
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 0c80a49051950283d27e70dfac3f8838c426ca65..c8e405c60dcd6267426e4dacecd410675f5c36f5 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -7120,17 +7120,6 @@ (define_expand "aarch64_addp<mode>"
   }
 )
 
-
-(define_insn "aarch64_addpdi"
-  [(set (match_operand:DI 0 "register_operand" "=w")
-        (unspec:DI
-          [(match_operand:V2DI 1 "register_operand" "w")]
-          UNSPEC_ADDP))]
-  "TARGET_SIMD"
-  "addp\t%d0, %1.2d"
-  [(set_attr "type" "neon_reduc_add")]
-)
-
 ;; sqrt
 
 (define_expand "sqrt<mode>2"
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index afe205cb83cde89ddeede4c2b370a9de8911b172..0bb98396b4c9ec5a5e24edf1beb21bad2f9c1f53 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -17588,14 +17588,14 @@ __extension__ extern __inline int64_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vpaddd_s64 (int64x2_t __a)
 {
-  return __builtin_aarch64_addpdi (__a);
+  return __builtin_aarch64_reduc_plus_scal_v2di (__a);
 }
 
 __extension__ extern __inline uint64_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vpaddd_u64 (uint64x2_t __a)
 {
-  return __builtin_aarch64_addpdi_uu (__a);
+  return __builtin_aarch64_reduc_plus_scal_v2di_uu (__a);
 }
 
 /* vqabs */
