public inbox for gcc-cvs@sourceware.org
* [gcc r14-1566] aarch64: Improve representation of vpaddd intrinsics
@ 2023-06-06 10:10 Kyrylo Tkachov
From: Kyrylo Tkachov @ 2023-06-06 10:10 UTC (permalink / raw)
  To: gcc-cvs

https://gcc.gnu.org/g:6be5d852216d36f5b0024cd581c2508c168647a6

commit r14-1566-g6be5d852216d36f5b0024cd581c2508c168647a6
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Tue Jun 6 11:09:12 2023 +0100

    aarch64: Improve representation of vpaddd intrinsics
    
    The aarch64_addpdi pattern is redundant: the reduc_plus_scal_<mode> pattern can already generate
    the required form of the ADDP instruction, and its builtins are mostly folded to GIMPLE early on, so they benefit from more optimisations.
    However, the folding for the unsigned variants was missing.
    This patch adds that folding and routes the vpaddd_u64 and vpaddd_s64 intrinsics through the reduc_plus_scal_<mode> pattern instead,
    so that we can remove the redundant pattern and optimise these intrinsics earlier.
    
    Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.
    
    gcc/ChangeLog:
    
            * config/aarch64/aarch64-builtins.cc (aarch64_general_gimple_fold_builtin):
            Handle unsigned reduc_plus_scal_ builtins.
            * config/aarch64/aarch64-simd-builtins.def (addp): Delete DImode instances.
            * config/aarch64/aarch64-simd.md (aarch64_addpdi): Delete.
            * config/aarch64/arm_neon.h (vpaddd_s64): Reimplement with
            __builtin_aarch64_reduc_plus_scal_v2di.
            (vpaddd_u64): Reimplement with __builtin_aarch64_reduc_plus_scal_v2di_uu.

Diff:
---
 gcc/config/aarch64/aarch64-builtins.cc       |  1 +
 gcc/config/aarch64/aarch64-simd-builtins.def |  2 --
 gcc/config/aarch64/aarch64-simd.md           | 10 ----------
 gcc/config/aarch64/arm_neon.h                |  4 ++--
 4 files changed, 3 insertions(+), 14 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-builtins.cc b/gcc/config/aarch64/aarch64-builtins.cc
index e0bb2128e02..50c20c8fa86 100644
--- a/gcc/config/aarch64/aarch64-builtins.cc
+++ b/gcc/config/aarch64/aarch64-builtins.cc
@@ -3049,6 +3049,7 @@ aarch64_general_gimple_fold_builtin (unsigned int fcode, gcall *stmt,
   switch (fcode)
     {
       BUILTIN_VALL (UNOP, reduc_plus_scal_, 10, ALL)
+      BUILTIN_VDQ_I (UNOPU, reduc_plus_scal_, 10, NONE)
 	new_stmt = gimple_build_call_internal (IFN_REDUC_PLUS,
 					       1, args[0]);
 	gimple_call_set_lhs (new_stmt, gimple_call_lhs (stmt));
diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index 1beaa08c1e7..94ff3f1852f 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -53,8 +53,6 @@
   BUILTIN_VHSDF_DF (UNOP, sqrt, 2, FP)
   BUILTIN_VDQ_I (BINOP, addp, 0, NONE)
   BUILTIN_VDQ_I (BINOPU, addp, 0, NONE)
-  VAR1 (UNOP, addp, 0, NONE, di)
-  VAR1 (UNOPU, addp, 0, NONE, di)
   BUILTIN_VDQ_BHSI (UNOP, clrsb, 2, NONE)
   BUILTIN_VDQ_BHSI (UNOP, clz, 2, NONE)
   BUILTIN_VS (UNOP, ctz, 2, NONE)
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index dd1b084f856..dbd6fc68914 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -7025,16 +7025,6 @@
   [(set_attr "type" "neon_reduc_add<q>")]
 )
 
-(define_insn "aarch64_addpdi"
-  [(set (match_operand:DI 0 "register_operand" "=w")
-        (unspec:DI
-          [(match_operand:V2DI 1 "register_operand" "w")]
-          UNSPEC_ADDP))]
-  "TARGET_SIMD"
-  "addp\t%d0, %1.2d"
-  [(set_attr "type" "neon_reduc_add")]
-)
-
 ;; sqrt
 
 (define_expand "sqrt<mode>2"
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index afe205cb83c..0bb98396b4c 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -17588,14 +17588,14 @@ __extension__ extern __inline int64_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vpaddd_s64 (int64x2_t __a)
 {
-  return __builtin_aarch64_addpdi (__a);
+  return __builtin_aarch64_reduc_plus_scal_v2di (__a);
 }
 
 __extension__ extern __inline uint64_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vpaddd_u64 (uint64x2_t __a)
 {
-  return __builtin_aarch64_addpdi_uu (__a);
+  return __builtin_aarch64_reduc_plus_scal_v2di_uu (__a);
 }
 
 /* vqabs */
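
For readers unfamiliar with these intrinsics, the following is a minimal sketch (not part of the commit) of what vpaddd_s64 and vpaddd_u64 compute: the sum of the two 64-bit lanes of their argument. This is why a plus-reduction over V2DI can stand in for the dedicated aarch64_addpdi pattern. The helper names pairwise_add_s64 and pairwise_add_u64 are hypothetical. On AArch64 they call the real intrinsics; elsewhere they fall back to a scalar model that assumes wrapping (modulo 2^64) addition.

```c
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>

/* AArch64 path: load two lanes and reduce them with the intrinsics
   that this patch reimplements.  */
static inline int64_t
pairwise_add_s64 (int64_t lo, int64_t hi)
{
  const int64_t lanes[2] = { lo, hi };
  return vpaddd_s64 (vld1q_s64 (lanes));
}

static inline uint64_t
pairwise_add_u64 (uint64_t lo, uint64_t hi)
{
  const uint64_t lanes[2] = { lo, hi };
  return vpaddd_u64 (vld1q_u64 (lanes));
}

#else

/* Portable model (an assumption for illustration): add the two lanes
   with wrap-around.  The signed version adds in unsigned arithmetic to
   avoid signed-overflow undefined behaviour; the conversion back to
   int64_t is implementation-defined and wraps on GCC.  */
static inline int64_t
pairwise_add_s64 (int64_t lo, int64_t hi)
{
  return (int64_t) ((uint64_t) lo + (uint64_t) hi);
}

static inline uint64_t
pairwise_add_u64 (uint64_t lo, uint64_t hi)
{
  return lo + hi;
}

#endif
```

For example, pairwise_add_u64 (10, 32) evaluates to 42 on either path. The patch is not meant to change this result, only how GCC represents the operation internally.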
