* [AARCH64] Adding constant folding for __builtin_fmulx* with scalar 32 and 64 bit arguments
@ 2015-11-09 11:40 Bilyan Borisov
2015-11-23 14:35 ` James Greenhalgh
0 siblings, 1 reply; 2+ messages in thread
From: Bilyan Borisov @ 2015-11-09 11:40 UTC (permalink / raw)
To: gcc-patches
This patch adds an extension to aarch64_gimple_fold_builtin () that does
constant folding on __builtin_fmulx* calls for 32 and 64 bit floating point
scalar modes. We fold when both arguments are constant, as well as when only one
is. The special cases of 0*inf, -0*inf, 0*-inf, and -0*-inf are also
handled. The case for vector constant arguments will be dealt with in a future
patch since the tests for that would be obscure and would unnecessarily
complicate this patch.
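As background (not part of the patch), the FMULX semantics being folded can be sketched in plain C. The special cases are exactly the four listed above: a signed zero times a signed infinity yields +/-2.0, with the sign given by the XOR of the operand signs; every other input behaves like an ordinary multiply.

```c
#include <assert.h>
#include <math.h>

/* Reference semantics of FMULX: identical to an ordinary multiply,
   except that (+/-0) * (+/-inf) yields +/-2.0 (sign = XOR of the
   operand signs) instead of the NaN an ordinary multiply produces.  */
static double
fmulx_ref (double a, double b)
{
  if ((a == 0.0 && isinf (b)) || (isinf (a) && b == 0.0))
    return (signbit (a) != signbit (b)) ? -2.0 : 2.0;
  return a * b;
}
```

Note that `a == 0.0` is true for both +0.0 and -0.0, so `signbit` is needed to distinguish the signed-zero cases.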
Added tests to check for proper handling of constant folding. Tested on targets
aarch64-none-elf and aarch64_be-none-elf.
---
gcc/
2015-XX-XX Bilyan Borisov <bilyan.borisov@arm.com>
* config/aarch64/aarch64-builtins.c (aarch64_gimple_fold_builtin): Added
constant folding.
gcc/testsuite/
2015-XX-XX Bilyan Borisov <bilyan.borisov@arm.com>
* gcc.target/aarch64/simd/vmulx.x: New.
* gcc.target/aarch64/simd/vmulx_f64_2.c: Likewise.
* gcc.target/aarch64/simd/vmulxd_f64_2.c: Likewise.
* gcc.target/aarch64/simd/vmulxs_f32_2.c: Likewise.
[-- Attachment #2: rb4724.patch --]
[-- Type: text/x-patch, Size: 11576 bytes --]
diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index a1998ed550ac801e4d80baae122bf58e394a563f..339054d344900c942d5ce7c047479de3bbb4e61b 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -1362,7 +1362,7 @@ aarch64_gimple_fold_builtin (gimple_stmt_iterator *gsi)
if (fndecl)
{
int fcode = DECL_FUNCTION_CODE (fndecl);
- int nargs = gimple_call_num_args (stmt);
+ unsigned nargs = gimple_call_num_args (stmt);
tree *args = (nargs > 0
? gimple_call_arg_ptr (stmt, 0)
: &error_mark_node);
@@ -1386,7 +1386,54 @@ aarch64_gimple_fold_builtin (gimple_stmt_iterator *gsi)
new_stmt = gimple_build_assign (gimple_call_lhs (stmt),
REDUC_MIN_EXPR, args[0]);
break;
-
+ BUILTIN_GPF (BINOP, fmulx, 0)
+ {
+ gcc_assert (nargs == 2);
+ bool a0_cst_p = TREE_CODE (args[0]) == REAL_CST;
+ bool a1_cst_p = TREE_CODE (args[1]) == REAL_CST;
+ if (a0_cst_p || a1_cst_p)
+ {
+ if (a0_cst_p && a1_cst_p)
+ {
+ tree t0 = TREE_TYPE (args[0]);
+ real_value a0 = (TREE_REAL_CST (args[0]));
+ real_value a1 = (TREE_REAL_CST (args[1]));
+ if (real_equal (&a1, &dconst0))
+ std::swap (a0, a1);
+ /* According to real_equal (), +0 equals -0. */
+ if (real_equal (&a0, &dconst0) && real_isinf (&a1))
+ {
+ real_value res = dconst2;
+ res.sign = a0.sign ^ a1.sign;
+ new_stmt =
+ gimple_build_assign (gimple_call_lhs (stmt),
+ REAL_CST,
+ build_real (t0, res));
+ }
+ else
+ new_stmt =
+ gimple_build_assign (gimple_call_lhs (stmt),
+ MULT_EXPR,
+ args[0], args[1]);
+ }
+ else /* a0_cst_p ^ a1_cst_p. */
+ {
+ real_value const_part = a0_cst_p
+ ? TREE_REAL_CST (args[0]) : TREE_REAL_CST (args[1]);
+ if (!real_equal (&const_part, &dconst0)
+ && !real_isinf (&const_part))
+ new_stmt =
+ gimple_build_assign (gimple_call_lhs (stmt),
+ MULT_EXPR, args[0], args[1]);
+ }
+ }
+ if (new_stmt)
+ {
+ gimple_set_vuse (new_stmt, gimple_vuse (stmt));
+ gimple_set_vdef (new_stmt, gimple_vdef (stmt));
+ }
+ break;
+ }
default:
break;
}
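Stepping outside the diff for a moment, the folding decision the new `BUILTIN_GPF (BINOP, fmulx, 0)` case makes can be summarized as a standalone sketch (the enum and function names here are hypothetical, for illustration only):

```c
#include <math.h>

/* Hypothetical sketch of the fold decision: when both operands are
   constant the call always folds (either to a literal +/-2.0 for the
   0 * inf cases, or to a constant multiply); when exactly one operand
   is constant, it is safe to fold to a plain multiply only if that
   constant is neither +/-0 nor +/-inf, since those are the only inputs
   where FMULX differs from FMUL.  */
enum fold_kind { FOLD_TO_CONST, FOLD_TO_MUL, NO_FOLD };

static enum fold_kind
fmulx_fold_decision (int a0_const, double a0, int a1_const, double a1)
{
  if (a0_const && a1_const)
    return FOLD_TO_CONST;
  if (a0_const || a1_const)
    {
      double c = a0_const ? a0 : a1;
      if (c != 0.0 && !isinf (c))
        return FOLD_TO_MUL;
    }
  return NO_FOLD;
}
```

This matches the structure of the patch: the both-constant branch handles the special cases itself, while the one-constant branch deliberately leaves the zero/infinity cases to the FMULX instruction at run time.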
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vmulx.x b/gcc/testsuite/gcc.target/aarch64/simd/vmulx.x
new file mode 100644
index 0000000000000000000000000000000000000000..8968a64a95cb40a466dd77fea4e9f9f63ad707dc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/vmulx.x
@@ -0,0 +1,46 @@
+#define PASS_ARRAY(...) {__VA_ARGS__}
+
+#define SETUP_TEST_CASE_VEC(I, INTRINSIC, BASE_TYPE, TYPE1, TYPE2, \
+ VALS1, VALS2, EXPS, LEN, FM, Q_LD, Q_ST, \
+ V1, V2) \
+ do \
+ { \
+ int i##I; \
+ BASE_TYPE vec##I##_1_data[] = VALS1; \
+ BASE_TYPE vec##I##_2_data[] = VALS2; \
+ V1 TYPE1 vec##I##_1 = vld1##Q_LD##_##FM (vec##I##_1_data); \
+ V2 TYPE2 vec##I##_2 = vld1##Q_LD##_##FM (vec##I##_2_data); \
+ TYPE1 actual##I##_v = INTRINSIC (vec##I##_1, vec##I##_2); \
+ volatile BASE_TYPE expected##I[] = EXPS; \
+ BASE_TYPE actual##I[LEN]; \
+ vst1##Q_ST##_##FM (actual##I, actual##I##_v); \
+ for (i##I = 0; i##I < LEN; ++i##I) \
+ if (actual##I[i##I] != expected##I[i##I]) \
+ abort (); \
+ } \
+ while (0) \
+
+#define SETUP_TEST_CASE_SCALAR(I, INTRINSIC, TYPE, VAL1, VAL2, EXP) \
+ do \
+ { \
+ TYPE vec_##I##_1 = VAL1; \
+ TYPE vec_##I##_2 = VAL2; \
+ TYPE expected_##I = EXP; \
+ volatile TYPE actual_##I = INTRINSIC (vec_##I##_1, vec_##I##_2); \
+ if (actual_##I != expected_##I) \
+ abort (); \
+ } \
+ while (0) \
+
+/* Functions used to return values that won't be optimised away. */
+float32_t __attribute__ ((noinline))
+foo32 ()
+{
+ return 1.0;
+}
+
+float64_t __attribute__ ((noinline))
+foo64 ()
+{
+ return 1.0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vmulx_f64_2.c b/gcc/testsuite/gcc.target/aarch64/simd/vmulx_f64_2.c
new file mode 100644
index 0000000000000000000000000000000000000000..2d11675ed0baa170c64c03669f2841faa73f7009
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/vmulx_f64_2.c
@@ -0,0 +1,59 @@
+/* Test the vmulx_f64 AArch64 SIMD intrinsic. */
+
+/* { dg-do run } */
+/* { dg-options "-save-temps -O3" } */
+
+#include "arm_neon.h"
+#include "vmulx.x"
+
+extern void abort (void);
+
+int
+main (void)
+{
+ float64_t v1 = 3.14159265359;
+ float64_t v2 = 1.383894;
+
+ /* Constant * constant, shouldn't generate fmulx or fmul, only fmov. */
+ SETUP_TEST_CASE_VEC (1, vmulx_f64, float64_t, float64x1_t, float64x1_t,
+ PASS_ARRAY (v1), PASS_ARRAY (v2), PASS_ARRAY (v1 * v2),
+ 1, f64, , , ,);
+ SETUP_TEST_CASE_VEC (2, vmulx_f64, float64_t, float64x1_t, float64x1_t,
+ PASS_ARRAY (0.0), PASS_ARRAY (__builtin_huge_val ()),
+ PASS_ARRAY (2.0), 1, f64, , , ,);
+ SETUP_TEST_CASE_VEC (3, vmulx_f64, float64_t, float64x1_t, float64x1_t,
+ PASS_ARRAY (0.0), PASS_ARRAY (-__builtin_huge_val ()),
+ PASS_ARRAY (-2.0), 1, f64, , , ,);
+ SETUP_TEST_CASE_VEC (4, vmulx_f64, float64_t, float64x1_t, float64x1_t,
+ PASS_ARRAY (-0.0), PASS_ARRAY (__builtin_huge_val ()),
+ PASS_ARRAY (-2.0), 1, f64, , , ,);
+ SETUP_TEST_CASE_VEC (5, vmulx_f64, float64_t, float64x1_t, float64x1_t,
+ PASS_ARRAY (-0.0), PASS_ARRAY (-__builtin_huge_val ()),
+ PASS_ARRAY (2.0), 1, f64, , , ,);
+ /* Constant +/- 0 or +/- inf * non-constant should generate fmulx. */
+ SETUP_TEST_CASE_VEC (6, vmulx_f64, float64_t, float64x1_t, float64x1_t,
+ PASS_ARRAY (/* volatile. */1.0),
+ PASS_ARRAY (-__builtin_huge_val ()),
+ PASS_ARRAY (-__builtin_huge_val ()), 1, f64, , , volatile
+ ,);
+ SETUP_TEST_CASE_VEC (7, vmulx_f64, float64_t, float64x1_t, float64x1_t,
+ PASS_ARRAY (/* volatile. */1.0),
+ PASS_ARRAY (__builtin_huge_val ()),
+ PASS_ARRAY (__builtin_huge_val ()), 1, f64, , , volatile
+ ,);
+ SETUP_TEST_CASE_VEC (8, vmulx_f64, float64_t, float64x1_t, float64x1_t,
+ PASS_ARRAY (/* volatile. */1.0), PASS_ARRAY (0.0),
+ PASS_ARRAY (0.0), 1, f64, , , volatile,);
+ SETUP_TEST_CASE_VEC (9, vmulx_f64, float64_t, float64x1_t, float64x1_t,
+ PASS_ARRAY (/* volatile. */1.0), PASS_ARRAY (-0.0),
+ PASS_ARRAY (-0.0), 1, f64, , , volatile,);
+ /* Constant non +/- 0 or non +/- inf * non-constant should generate fmul. */
+ SETUP_TEST_CASE_VEC (10, vmulx_f64, float64_t, float64x1_t, float64x1_t,
+ PASS_ARRAY (/* volatile. */1.0), PASS_ARRAY (v1),
+ PASS_ARRAY (v1), 1, f64, , , volatile,);
+ return 0;
+}
+/* { dg-final { scan-assembler-times "fmulx\[ \t\]+\[dD\]\[0-9\]+, ?\[dD\]\[0-9\]+, ?\[dD\]\[0-9\]+\n" 4 } } */
+/* { dg-final { scan-assembler-times "fmul\[ \t\]+\[dD\]\[0-9\]+, ?\[dD\]\[0-9\]+, ?\[dD\]\[0-9\]+\n" 1 } } */
+/* { dg-final { scan-assembler-times "fmov\[ \t\]+\[dD\]\[0-9\]+, ?2.0e\\+0\n" 1 } } */
+/* { dg-final { scan-assembler-times "fmov\[ \t\]+\[dD\]\[0-9\]+, ?-2.0e\\+0\n" 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vmulxd_f64_2.c b/gcc/testsuite/gcc.target/aarch64/simd/vmulxd_f64_2.c
new file mode 100644
index 0000000000000000000000000000000000000000..b1f4bcd33fb66d0fd85468a46b40bedf872bacc7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/vmulxd_f64_2.c
@@ -0,0 +1,45 @@
+/* Test the vmulxd_f64 AArch64 SIMD intrinsic. */
+
+/* { dg-do run } */
+/* { dg-options "-save-temps -O3" } */
+
+#include "arm_neon.h"
+#include "vmulx.x"
+
+extern void abort (void);
+
+int
+main (void)
+{
+ float64_t v1 = 3.14159265359;
+ float64_t v2 = 1.383894;
+
+ /* Constant * constant, shouldn't generate fmulx or fmul, only fmov. */
+ SETUP_TEST_CASE_SCALAR (1, vmulxd_f64, float64_t, v1, v2, v1 * v2);
+ SETUP_TEST_CASE_SCALAR (2, vmulxd_f64, float64_t, 0.0,
+ __builtin_huge_val (), 2.0);
+ SETUP_TEST_CASE_SCALAR (3, vmulxd_f64, float64_t, 0.0,
+ -__builtin_huge_val (), -2.0);
+ SETUP_TEST_CASE_SCALAR (4, vmulxd_f64, float64_t, -0.0,
+ __builtin_huge_val (), -2.0);
+ SETUP_TEST_CASE_SCALAR (5, vmulxd_f64, float64_t, -0.0,
+ -__builtin_huge_val (), 2.0);
+ /* Constant +/- 0 or +/- inf * non-constant should generate fmulx. */
+ SETUP_TEST_CASE_SCALAR (6, vmulxd_f64, float64_t, foo64 (),
+ -__builtin_huge_val (), -__builtin_huge_val ());
+ SETUP_TEST_CASE_SCALAR (7, vmulxd_f64, float64_t, foo64 (),
+ __builtin_huge_val (), __builtin_huge_val ());
+ SETUP_TEST_CASE_SCALAR (8, vmulxd_f64, float64_t, foo64 (),
+ 0, 0);
+ SETUP_TEST_CASE_SCALAR (9, vmulxd_f64, float64_t, foo64 (),
+ -0.0, -0.0);
+ /* Constant non +/- 0 or non +/- inf * non-constant should generate fmul. */
+ SETUP_TEST_CASE_SCALAR (10, vmulxd_f64, float64_t, foo64 (),
+ v1, v1);
+
+ return 0;
+}
+/* { dg-final { scan-assembler-times "fmulx\[ \t\]+\[dD\]\[0-9\]+, ?\[dD\]\[0-9\]+, ?\[dD\]\[0-9\]+\n" 4 } } */
+/* { dg-final { scan-assembler-times "fmul\[ \t\]+\[dD\]\[0-9\]+, ?\[dD\]\[0-9\]+, ?\[dD\]\[0-9\]+\n" 1 } } */
+/* { dg-final { scan-assembler-times "fmov\[ \t\]+\[dD\]\[0-9\]+, ?2.0e\\+0\n" 1 } } */
+/* { dg-final { scan-assembler-times "fmov\[ \t\]+\[dD\]\[0-9\]+, ?-2.0e\\+0\n" 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vmulxs_f32_2.c b/gcc/testsuite/gcc.target/aarch64/simd/vmulxs_f32_2.c
new file mode 100644
index 0000000000000000000000000000000000000000..3d9139859cea04c0b659ee14d142fc901dc736ba
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/vmulxs_f32_2.c
@@ -0,0 +1,44 @@
+/* Test the vmulxs_f32 AArch64 SIMD intrinsic. */
+
+/* { dg-do run } */
+/* { dg-options "-save-temps -O3" } */
+
+#include "arm_neon.h"
+#include "vmulx.x"
+
+extern void abort (void);
+
+int
+main (void)
+{
+ float32_t v1 = 3.14159265359;
+ float32_t v2 = 1.383894;
+
+ /* Constant * constant, shouldn't generate fmulx or fmul, only fmov. */
+ SETUP_TEST_CASE_SCALAR (1, vmulxs_f32, float32_t, v1, v2, v1 * v2);
+ SETUP_TEST_CASE_SCALAR (2, vmulxs_f32, float32_t, 0.0,
+ __builtin_huge_valf (), 2.0);
+ SETUP_TEST_CASE_SCALAR (3, vmulxs_f32, float32_t, 0.0,
+ -__builtin_huge_valf (), -2.0);
+ SETUP_TEST_CASE_SCALAR (4, vmulxs_f32, float32_t, -0.0,
+ __builtin_huge_valf (), -2.0);
+ SETUP_TEST_CASE_SCALAR (5, vmulxs_f32, float32_t, -0.0,
+ -__builtin_huge_valf (), 2.0);
+ /* Constant +/- 0 or +/- inf * non-constant should generate fmulx. */
+ SETUP_TEST_CASE_SCALAR (6, vmulxs_f32, float32_t, foo32 (),
+ -__builtin_huge_valf (), -__builtin_huge_valf ());
+ SETUP_TEST_CASE_SCALAR (7, vmulxs_f32, float32_t, foo32 (),
+ __builtin_huge_valf (), __builtin_huge_valf ());
+ SETUP_TEST_CASE_SCALAR (8, vmulxs_f32, float32_t, foo32 (),
+ 0, 0);
+ SETUP_TEST_CASE_SCALAR (9, vmulxs_f32, float32_t, foo32 (),
+ -0.0, -0.0);
+ /* Constant non +/- 0 or non +/- inf * non-constant should generate fmul. */
+ SETUP_TEST_CASE_SCALAR (10, vmulxs_f32, float32_t, foo32 (),
+ v1, v1);
+ return 0;
+}
+/* { dg-final { scan-assembler-times "fmulx\[ \t\]+\[sS\]\[0-9\]+, ?\[sS\]\[0-9\]+, ?\[sS\]\[0-9\]+\n" 4 } } */
+/* { dg-final { scan-assembler-times "fmul\[ \t\]+\[sS\]\[0-9\]+, ?\[sS\]\[0-9\]+, ?\[sS\]\[0-9\]+\n" 1 } } */
+/* { dg-final { scan-assembler-times "fmov\[ \t\]+\[sS\]\[0-9\]+, ?2.0e\\+0\n" 1 } } */
+/* { dg-final { scan-assembler-times "fmov\[ \t\]+\[sS\]\[0-9\]+, ?-2.0e\\+0\n" 1 } } */
* Re: [AARCH64] Adding constant folding for __builtin_fmulx* with scalar 32 and 64 bit arguments
From: James Greenhalgh @ 2015-11-23 14:35 UTC (permalink / raw)
To: Bilyan Borisov; +Cc: gcc-patches
On Mon, Nov 09, 2015 at 11:40:11AM +0000, Bilyan Borisov wrote:
> This patch adds an extension to aarch64_gimple_fold_builtin () that does
> constant folding on __builtin_fmulx* calls for 32 and 64 bit floating point
> scalar modes. We fold when both arguments are constant, as well as when only one
> is. The special cases of 0*inf, -0*inf, 0*-inf, and -0*-inf are also
> handled. The case for vector constant arguments will be dealt with in a future
> patch since the tests for that would be obscure and would unnecessarily
> complicate this patch.
>
> Added tests to check for proper handling of constant folding. Tested on targets
> aarch64-none-elf and aarch64_be-none-elf.
>
> ---
>
> gcc/
>
> 2015-XX-XX Bilyan Borisov <bilyan.borisov@arm.com>
>
> * config/aarch64/aarch64-builtins.c (aarch64_gimple_fold_builtin): Added
> constant folding.
>
> gcc/testsuite/
>
> 2015-XX-XX Bilyan Borisov <bilyan.borisov@arm.com>
>
> * gcc.target/aarch64/simd/vmulx.x: New.
> * gcc.target/aarch64/simd/vmulx_f64_2.c: Likewise.
> * gcc.target/aarch64/simd/vmulxd_f64_2.c: Likewise.
> * gcc.target/aarch64/simd/vmulxs_f32_2.c: Likewise.
>
OK, thanks.
I've committed this on your behalf as revision 230758 with a slight tweak to
the changelog to read:
* config/aarch64/aarch64-builtins.c
(aarch64_gimple_fold_builtin): Fold FMULX.
Thanks,
James