* Re: [AArch64] Use scvtf fbits option where appropriate
@ 2019-06-13 17:26 Wilco Dijkstra
  2019-06-18  9:11 ` Joel Hutton
  0 siblings, 1 reply; 11+ messages in thread
From: Wilco Dijkstra @ 2019-06-13 17:26 UTC
To: Joel Hutton; +Cc: nd, GCC Patches

Hi Joel,

A few comments below:

+/* If X is a positive CONST_DOUBLE with a value that is the reciprocal of a
+   power of 2 (i.e 1/2^n) return the number of float bits. e.g. for x==(1/2^n)
+   return log2 (n). Otherwise return 0.  */
+int
+aarch64_fpconst_pow2_recip (rtx x)
+{
+  REAL_VALUE_TYPE r0;
+
+  if (!CONST_DOUBLE_P (x))
+    return 0;
+
+  r0 = *CONST_DOUBLE_REAL_VALUE (x);
+  if (exact_real_inverse (DFmode, &r0)
+      && !REAL_VALUE_NEGATIVE (r0))
+    {
+      if (exact_real_truncate (DFmode, &r0))

Truncate to double?  That doesn't do anything...

+        {
+          HOST_WIDE_INT value = real_to_integer (&r0);
+          value = value & 0xffffffff;
+          if ((value != 0) && ( (value & (value - 1)) == 0))
+            {
+              int ret = exact_log2 (value);
+              gcc_assert (IN_RANGE (ret, 0, 31));
+              return ret;
+            }

Wouldn't it be easier to just do exact_log2 (real_to_integer (&r0))
and then check the range is in 1..31?

--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -6016,6 +6016,40 @@
   [(set_attr "type" "f_cvtf2i")]
 )
 
+(define_insn "*aarch64_<su_optab>cvtf_<fcvt_target>_<GPF:mode>2_mult"
+  [(set (match_operand:GPF 0 "register_operand" "=w,w")
+        (mult:GPF (FLOATUORS:GPF
+                   (match_operand:<FCVT_TARGET> 1 "register_operand" "w,?r"))
+                   (match_operand 2 "aarch64_fp_pow2_recip""Dt,Dt")))]

We should add a comment before both define_insn similar to the other
conversions, explaining what they do and why there are 2 separate patterns
(the default versions of the conversions appear to be missing a comment too).

Wilco

^ permalink raw reply	[flat|nested] 11+ messages in thread
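As a rough illustration of the check suggested above (find the exact power of two, then accept only 1..31), here is a stand-alone sketch on host doubles. It does not use GCC's REAL_VALUE_TYPE routines; the name pow2_recip_fbits and the doubling loop are assumptions made for this example, not code from the patch.

```c
/* Hypothetical stand-alone analogue of the proposed check, written against
   plain doubles rather than GCC's REAL_VALUE_TYPE.  Returns n if X is
   exactly 1/2^n with 1 <= n <= 31, otherwise -1.  */
static int
pow2_recip_fbits (double x)
{
  /* Rejects zero, negatives, NaN and anything >= 1 (so n == 0 is out).  */
  if (!(x > 0.0 && x < 1.0))
    return -1;

  for (int n = 1; n <= 31; n++)
    {
      x *= 2.0;  /* Exact: x stays below 2, so only the exponent changes.  */
      if (x == 1.0)
        return n;
      if (x > 1.0)
        return -1;  /* Overshot 1.0: not a reciprocal power of two.  */
    }
  return -1;  /* 1/2^n with n > 31.  */
}
```

With this shape, 0.0625 maps to 4, while 1.0, 0.3 and 1/2^32 are all rejected with -1, which matches the 1..31 range discussed in the review.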
* [AArch64] Use scvtf fbits option where appropriate
  2019-06-13 17:26 [AArch64] Use scvtf fbits option where appropriate Wilco Dijkstra
@ 2019-06-18  9:11 ` Joel Hutton
  2019-06-18 10:37   ` Richard Earnshaw (lists)
  0 siblings, 1 reply; 11+ messages in thread
From: Joel Hutton @ 2019-06-18 9:11 UTC
To: Wilco Dijkstra; +Cc: nd, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 927 bytes --]

Hi,

On 13/06/2019 18:26, Wilco Dijkstra wrote:
> Wouldn't it be easier to just do exact_log2 (real_to_integer (&r0))
> and then check the range is in 1..31?

I've revised this section.

> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -6016,6 +6016,40 @@
>    [(set_attr "type" "f_cvtf2i")]
>  )
>
> +(define_insn "*aarch64_<su_optab>cvtf_<fcvt_target>_<GPF:mode>2_mult"
> +  [(set (match_operand:GPF 0 "register_operand" "=w,w")
> +        (mult:GPF (FLOATUORS:GPF
> +                   (match_operand:<FCVT_TARGET> 1 "register_operand" "w,?r"))
> +                   (match_operand 2 "aarch64_fp_pow2_recip""Dt,Dt")))]
>
> We should add a comment before both define_insn similar to the other
> conversions, explaining what they do and why there are 2 separate patterns
> (the default versions of the conversions appear to be missing a comment too).

I've added comments to the new and existing patterns

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-SCVTF-fbits.patch --]
[-- Type: text/x-patch; name="0001-SCVTF-fbits.patch", Size: 10867 bytes --]

From 5a9dfa6c6eb1c5b9c8c464780b7098058989d472 Mon Sep 17 00:00:00 2001
From: Joel Hutton <Joel.Hutton@arm.com>
Date: Thu, 13 Jun 2019 11:08:56 +0100
Subject: [PATCH] SCVTF fbits

---
 gcc/config/aarch64/aarch64-protos.h           |   1 +
 gcc/config/aarch64/aarch64.c                  |  28 ++++
 gcc/config/aarch64/aarch64.md                 |  39 +++++
 gcc/config/aarch64/constraints.md             |   7 +
 gcc/config/aarch64/predicates.md              |   4 +
 gcc/testsuite/gcc.target/aarch64/fmul_scvtf.c | 140 ++++++++++++++++++
 6 files changed, 219 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/fmul_scvtf.c

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 1e3b1c91db1..ad1ba458a3f 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -494,6 +494,7 @@ enum aarch64_symbol_type aarch64_classify_tls_symbol (rtx);
 enum reg_class aarch64_regno_regclass (unsigned);
 int aarch64_asm_preferred_eh_data_format (int, int);
 int aarch64_fpconst_pow_of_2 (rtx);
+int aarch64_fpconst_pow2_recip (rtx);
 machine_mode aarch64_hard_regno_caller_save_mode (unsigned, unsigned,
                                                   machine_mode);
 int aarch64_uxt_size (int, HOST_WIDE_INT);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 9a035dd9ed8..424ca6c9932 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -18707,6 +18707,34 @@ aarch64_fpconst_pow_of_2 (rtx x)
   return exact_log2 (real_to_integer (r));
 }
 
+/* If X is a positive CONST_DOUBLE with a value that is the reciprocal of a
+   power of 2 (i.e 1/2^n) return the number of float bits. e.g. for x==(1/2^n)
+   return n. Otherwise return -1.  */
+int
+aarch64_fpconst_pow2_recip (rtx x)
+{
+  REAL_VALUE_TYPE r0;
+
+  if (!CONST_DOUBLE_P (x))
+    return -1;
+
+  r0 = *CONST_DOUBLE_REAL_VALUE (x);
+  if (exact_real_inverse (DFmode, &r0)
+      && !REAL_VALUE_NEGATIVE (r0))
+    {
+      int ret = exact_log2 (real_to_integer (&r0));
+      if (ret >= 1 && ret <= 31)
+        {
+          return ret;
+        }
+      else
+        {
+          return -1;
+        }
+    }
+  return -1;
+}
+
 /* If X is a vector of equal CONST_DOUBLE values and that value is
    Y, return the aarch64_fpconst_pow_of_2 of Y.  Otherwise return -1.  */
 
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 526c7fb0dab..d9812aa238e 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -6016,6 +6016,44 @@
   [(set_attr "type" "f_cvtf2i")]
 )
 
+;; equal width integer to fp combine
+(define_insn "*aarch64_<su_optab>cvtf_<fcvt_target>_<GPF:mode>2_mult"
+  [(set (match_operand:GPF 0 "register_operand" "=w,w")
+        (mult:GPF (FLOATUORS:GPF
+                   (match_operand:<FCVT_TARGET> 1 "register_operand" "w,?r"))
+                   (match_operand 2 "aarch64_fp_pow2_recip""Dt,Dt")))]
+  "TARGET_FLOAT"
+  {
+    operands[2] = GEN_INT (aarch64_fpconst_pow2_recip (operands[2]));
+    switch (which_alternative)
+    {
+      case 0:
+        return "<su_optab>cvtf\t%<GPF:s>0, %<s>1, #%2";
+      case 1:
+        return "<su_optab>cvtf\t%<GPF:s>0, %<w1>1, #%2";
+      default:
+        gcc_unreachable();
+    }
+  }
+  [(set_attr "type" "neon_int_to_fp_<Vetype>,f_cvti2f")
+   (set_attr "arch" "simd,fp")]
+)
+
+;; inequal width integer to fp combine
+(define_insn "*aarch64_<su_optab>cvtf_<fcvt_iesize>_<GPF:mode>2_mult"
+  [(set (match_operand:GPF 0 "register_operand" "=w")
+        (mult:GPF (FLOATUORS:GPF
+                   (match_operand:<FCVT_IESIZE> 1 "register_operand" "r"))
+                   (match_operand 2 "aarch64_fp_pow2_recip" "Dt")))]
+  "TARGET_FLOAT"
+  {
+    operands[2] = GEN_INT (aarch64_fpconst_pow2_recip (operands[2]));
+    return "<su_optab>cvtf\t%<GPF:s>0, %<w2>1, #%2";
+  }
+  [(set_attr "type" "f_cvti2f")]
+)
+
+;; equal width integer to fp conversion
 (define_insn "<optab><fcvt_target><GPF:mode>2"
   [(set (match_operand:GPF 0 "register_operand" "=w,w")
         (FLOATUORS:GPF (match_operand:<FCVT_TARGET> 1 "register_operand" "w,?r")))]
@@ -6027,6 +6065,7 @@
    (set_attr "arch" "simd,fp")]
 )
 
+;; inequal width integer to fp conversions
 (define_insn "<optab><fcvt_iesize><GPF:mode>2"
   [(set (match_operand:GPF 0 "register_operand" "=w")
         (FLOATUORS:GPF (match_operand:<FCVT_IESIZE> 1 "register_operand" "r")))]
diff --git a/gcc/config/aarch64/constraints.md b/gcc/config/aarch64/constraints.md
index 21f9549e660..a7731a033ea 100644
--- a/gcc/config/aarch64/constraints.md
+++ b/gcc/config/aarch64/constraints.md
@@ -329,6 +329,13 @@
    (match_test "aarch64_simd_scalar_immediate_valid_for_move (op,
                                                               QImode)")))
 
+(define_constraint "Dt"
+  "@internal
+ A const_double which is the reciprocal of an exact power of two, can be
+ used in an scvtf with fract bits operation"
+  (and (match_code "const_double")
+       (match_test "aarch64_fpconst_pow2_recip (op)")))
+
 (define_constraint "Dl"
   "@internal
  A constraint that matches vector of immediates for left shifts."
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index 10100ca830a..da295981286 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -98,6 +98,10 @@
   (and (match_code "const_double")
        (match_test "aarch64_fpconst_pow_of_2 (op) > 0")))
 
+(define_predicate "aarch64_fp_pow2_recip"
+  (and (match_code "const_double")
+       (match_test "aarch64_fpconst_pow2_recip (op) > 0")))
+
 (define_predicate "aarch64_fp_vec_pow2"
   (match_test "aarch64_vec_fpconst_pow_of_2 (op) > 0"))
 
diff --git a/gcc/testsuite/gcc.target/aarch64/fmul_scvtf.c b/gcc/testsuite/gcc.target/aarch64/fmul_scvtf.c
new file mode 100644
index 00000000000..e8d1de6279b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/fmul_scvtf.c
@@ -0,0 +1,140 @@
+/* { dg-do run } */
+/* { dg-options "-save-temps -O2 -fno-inline" } */
+
+#define FUNC_DEFS(__a)			\
+ float					\
+fsfoo##__a (int x)			\
+{					\
+  return ((float) x)/(1u << __a);	\
+}					\
+float					\
+fusfoo##__a (unsigned int x)		\
+{					\
+  return ((float) x)/(1u << __a);	\
+}					\
+float					\
+fslfoo##__a (long x)			\
+{					\
+  return ((float) x)/(1u << __a);	\
+}					\
+float					\
+fulfoo##__a (unsigned long x)		\
+{					\
+  return ((float) x)/(1u << __a);	\
+}					\
+
+#define FUNC_DEFD(__a)			\
+double					\
+dsfoo##__a (int x)			\
+{					\
+  return ((double) x)/(1u << __a);\
+}					\
+double					\
+dusfoo##__a (unsigned int x)		\
+{					\
+  return ((double) x)/(1u << __a);\
+}					\
+double					\
+dslfoo##__a (long x)			\
+{					\
+  return ((double) x)/(1u << __a);\
+}					\
+double					\
+dulfoo##__a (unsigned long x)		\
+{					\
+  return ((double) x)/(1u << __a);\
+}
+
+FUNC_DEFS (4)
+ /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], w\[0-9\]*.*#4" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], w\[0-9\]*.*#4" 1 } } */
+ /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], x\[0-9\]*.*#4" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], x\[0-9\]*.*#4" 1 } } */
+
+FUNC_DEFD (4)
+ /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], w\[0-9\]*.*#4" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], w\[0-9\]*.*#4" 1 } } */
+ /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], x\[0-9\]*.*#4" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], x\[0-9\]*.*#4" 1 } } */
+
+FUNC_DEFS (8)
+ /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], w\[0-9\]*.*#8" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], w\[0-9\]*.*#8" 1 } } */
+ /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], x\[0-9\]*.*#8" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], x\[0-9\]*.*#8" 1 } } */
+
+FUNC_DEFD (8)
+ /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], w\[0-9\]*.*#8" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], w\[0-9\]*.*#8" 1 } } */
+ /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], x\[0-9\]*.*#8" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], x\[0-9\]*.*#8" 1 } } */
+
+FUNC_DEFS (16)
+ /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], w\[0-9\]*.*#16" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], w\[0-9\]*.*#16" 1 } } */
+ /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], x\[0-9\]*.*#16" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], x\[0-9\]*.*#16" 1 } } */
+
+FUNC_DEFD (16)
+ /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], w\[0-9\]*.*#16" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], w\[0-9\]*.*#16" 1 } } */
+ /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], x\[0-9\]*.*#16" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], x\[0-9\]*.*#16" 1 } } */
+
+FUNC_DEFS (31)
+ /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], w\[0-9\]*.*#31" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], w\[0-9\]*.*#31" 1 } } */
+ /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], x\[0-9\]*.*#31" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], x\[0-9\]*.*#31" 1 } } */
+
+FUNC_DEFD (31)
+ /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], w\[0-9\]*.*#31" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], w\[0-9\]*.*#31" 1 } } */
+ /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], x\[0-9\]*.*#31" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], x\[0-9\]*.*#31" 1 } } */
+
+#define FUNC_TESTS(__a, __b)					\
+do								\
+{								\
+  if (fsfoo##__a (__b) != ((int) i) * (1.0f/(1u << __a)) )	\
+    __builtin_abort ();						\
+  if (fusfoo##__a (__b) != ((int) i) * (1.0f/(1u << __a)) )	\
+    __builtin_abort ();						\
+  if (fslfoo##__a (__b) != ((int) i) * (1.0f/(1u << __a)) )	\
+    __builtin_abort ();						\
+  if (fulfoo##__a (__b) != ((int) i) * (1.0f/(1u << __a)) )	\
+    __builtin_abort ();						\
+} while (0)
+
+#define FUNC_TESTD(__a, __b)					\
+do								\
+{								\
+  if (fsfoo##__a (__b) != ((int) i) * (1.0d/(1u << __a)) )	\
+    __builtin_abort ();						\
+  if (fusfoo##__a (__b) != ((int) i) * (1.0d/(1u << __a)) )	\
+    __builtin_abort ();						\
+  if (fslfoo##__a (__b) != ((int) i) * (1.0d/(1u << __a)) )	\
+    __builtin_abort ();						\
+  if (fulfoo##__a (__b) != ((int) i) * (1.0d/(1u << __a)) )	\
+    __builtin_abort ();						\
+} while (0)
+
+ int
+main (void)
+{
+  int i;
+
+  for (i = 0; i < 32; i ++)
+    {
+      FUNC_TESTS (4, i);
+      FUNC_TESTS (8, i);
+      FUNC_TESTS (16, i);
+      FUNC_TESTS (31, i);
+
+      FUNC_TESTD (4, i);
+      FUNC_TESTD (8, i);
+      FUNC_TESTD (16, i);
+      FUNC_TESTD (31, i);
+    }
+  return 0;
+}
-- 
2.17.1

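The test functions in the patch divide by `1u << n`, whereas the new patterns match a multiplication by a reciprocal constant. A minimal sketch of why the two forms agree, assuming IEEE binary32 `float` and n in 1..31 (so nothing underflows); the helper names are hypothetical and not part of the patch:

```c
/* Hypothetical helpers: two ways of scaling an integer by 2^-n.  */

/* Division form, as written in the test's FUNC_DEFS macro.  */
static float
scale_by_div (int x, unsigned n)
{
  return ((float) x) / (1u << n);
}

/* Multiplication form, as matched by the mult-by-reciprocal pattern.
   1.0f / 2^n is exactly representable, so the multiply only adjusts
   the exponent of (float) x and introduces no extra rounding.  */
static float
scale_by_mul (int x, unsigned n)
{
  return ((float) x) * (1.0f / (1u << n));
}
```

Under those assumptions both helpers return bit-identical results, which is presumably what allows the division in the test source to reach the backend as a multiply by 1/2^n.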
* Re: [AArch64] Use scvtf fbits option where appropriate
  2019-06-18  9:11 ` Joel Hutton
@ 2019-06-18 10:37   ` Richard Earnshaw (lists)
  2019-06-18 11:12     ` Wilco Dijkstra
  0 siblings, 1 reply; 11+ messages in thread
From: Richard Earnshaw (lists) @ 2019-06-18 10:37 UTC
To: Joel Hutton, Wilco Dijkstra; +Cc: nd, GCC Patches

On 18/06/2019 10:11, Joel Hutton wrote:
> Hi,
>
> On 13/06/2019 18:26, Wilco Dijkstra wrote:
>> Wouldn't it be easier to just do exact_log2 (real_to_integer (&r0))
>> and then check the range is in 1..31?
> I've revised this section.
>> --- a/gcc/config/aarch64/aarch64.md
>> +++ b/gcc/config/aarch64/aarch64.md
>> @@ -6016,6 +6016,40 @@
>>    [(set_attr "type" "f_cvtf2i")]
>>  )
>>
>> +(define_insn "*aarch64_<su_optab>cvtf_<fcvt_target>_<GPF:mode>2_mult"
>> +  [(set (match_operand:GPF 0 "register_operand" "=w,w")
>> +        (mult:GPF (FLOATUORS:GPF
>> +                   (match_operand:<FCVT_TARGET> 1 "register_operand" "w,?r"))
>> +                   (match_operand 2 "aarch64_fp_pow2_recip""Dt,Dt")))]
>>
>> We should add a comment before both define_insn similar to the other
>> conversions, explaining what they do and why there are 2 separate patterns
>> (the default versions of the conversions appear to be missing a comment too).
> I've added comments to the new and existing patterns
>
>
> 0001-SCVTF-fbits.patch
>
> From 5a9dfa6c6eb1c5b9c8c464780b7098058989d472 Mon Sep 17 00:00:00 2001
> From: Joel Hutton <Joel.Hutton@arm.com>
> Date: Thu, 13 Jun 2019 11:08:56 +0100
> Subject: [PATCH] SCVTF fbits
>
> ---
>  gcc/config/aarch64/aarch64-protos.h           |   1 +
>  gcc/config/aarch64/aarch64.c                  |  28 ++++
>  gcc/config/aarch64/aarch64.md                 |  39 +++++
>  gcc/config/aarch64/constraints.md             |   7 +
>  gcc/config/aarch64/predicates.md              |   4 +
>  gcc/testsuite/gcc.target/aarch64/fmul_scvtf.c | 140 ++++++++++++++++++
>  6 files changed, 219 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/fmul_scvtf.c
>
> diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
> index 1e3b1c91db1..ad1ba458a3f 100644
> --- a/gcc/config/aarch64/aarch64-protos.h
> +++ b/gcc/config/aarch64/aarch64-protos.h
> @@ -494,6 +494,7 @@ enum aarch64_symbol_type aarch64_classify_tls_symbol (rtx);
>  enum reg_class aarch64_regno_regclass (unsigned);
>  int aarch64_asm_preferred_eh_data_format (int, int);
>  int aarch64_fpconst_pow_of_2 (rtx);
> +int aarch64_fpconst_pow2_recip (rtx);
>  machine_mode aarch64_hard_regno_caller_save_mode (unsigned, unsigned,
>                                                    machine_mode);
>  int aarch64_uxt_size (int, HOST_WIDE_INT);
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 9a035dd9ed8..424ca6c9932 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -18707,6 +18707,34 @@ aarch64_fpconst_pow_of_2 (rtx x)
>    return exact_log2 (real_to_integer (r));
>  }
>
> +/* If X is a positive CONST_DOUBLE with a value that is the reciprocal of a
> +   power of 2 (i.e 1/2^n) return the number of float bits. e.g. for x==(1/2^n)
> +   return n. Otherwise return -1.  */
> +int
> +aarch64_fpconst_pow2_recip (rtx x)
> +{
> +  REAL_VALUE_TYPE r0;
> +
> +  if (!CONST_DOUBLE_P (x))
> +    return -1;

CONST_DOUBLE can be used for things other than floating point.  You
should really check that the mode on the double is in class MODE_FLOAT.

> +
> +  r0 = *CONST_DOUBLE_REAL_VALUE (x);
> +  if (exact_real_inverse (DFmode, &r0)
> +      && !REAL_VALUE_NEGATIVE (r0))
> +    {
> +      int ret = exact_log2 (real_to_integer (&r0));
> +      if (ret >= 1 && ret <= 31)
> +        {
> +          return ret;
> +        }
> +      else
> +        {
> +          return -1;
> +        }
> +    }
> +  return -1;
> +}
> +
>  /* If X is a vector of equal CONST_DOUBLE values and that value is
>     Y, return the aarch64_fpconst_pow_of_2 of Y.  Otherwise return -1.  */
>
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 526c7fb0dab..d9812aa238e 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -6016,6 +6016,44 @@
>    [(set_attr "type" "f_cvtf2i")]
>  )
>
> +;; equal width integer to fp combine
> +(define_insn "*aarch64_<su_optab>cvtf_<fcvt_target>_<GPF:mode>2_mult"
> +  [(set (match_operand:GPF 0 "register_operand" "=w,w")
> +        (mult:GPF (FLOATUORS:GPF
> +                   (match_operand:<FCVT_TARGET> 1 "register_operand" "w,?r"))
> +                   (match_operand 2 "aarch64_fp_pow2_recip""Dt,Dt")))]

Missing mode on operand 2.  Missing white space between constraint and
predicate.

> +  "TARGET_FLOAT"
> +  {
> +    operands[2] = GEN_INT (aarch64_fpconst_pow2_recip (operands[2]));
> +    switch (which_alternative)
> +    {
> +      case 0:
> +        return "<su_optab>cvtf\t%<GPF:s>0, %<s>1, #%2";
> +      case 1:
> +        return "<su_optab>cvtf\t%<GPF:s>0, %<w1>1, #%2";
> +      default:
> +        gcc_unreachable();
> +    }
> +  }
> +  [(set_attr "type" "neon_int_to_fp_<Vetype>,f_cvti2f")
> +   (set_attr "arch" "simd,fp")]
> +)
> +
> +;; inequal width integer to fp combine
> +(define_insn "*aarch64_<su_optab>cvtf_<fcvt_iesize>_<GPF:mode>2_mult"
> +  [(set (match_operand:GPF 0 "register_operand" "=w")
> +        (mult:GPF (FLOATUORS:GPF
> +                   (match_operand:<FCVT_IESIZE> 1 "register_operand" "r"))
> +                   (match_operand 2 "aarch64_fp_pow2_recip" "Dt")))]

Likewise.

> +  "TARGET_FLOAT"
> +  {
> +    operands[2] = GEN_INT (aarch64_fpconst_pow2_recip (operands[2]));
> +    return "<su_optab>cvtf\t%<GPF:s>0, %<w2>1, #%2";
> +  }
> +  [(set_attr "type" "f_cvti2f")]
> +)
> +
> +;; equal width integer to fp conversion
>  (define_insn "<optab><fcvt_target><GPF:mode>2"
>    [(set (match_operand:GPF 0 "register_operand" "=w,w")
>          (FLOATUORS:GPF (match_operand:<FCVT_TARGET> 1 "register_operand" "w,?r")))]
> @@ -6027,6 +6065,7 @@
>     (set_attr "arch" "simd,fp")]
>  )
>
> +;; inequal width integer to fp conversions

Start sentences with a capital letter.  End them with a full stop.
"inequal" isn't a word: you probably mean "unequal".

>  (define_insn "<optab><fcvt_iesize><GPF:mode>2"
>    [(set (match_operand:GPF 0 "register_operand" "=w")
>          (FLOATUORS:GPF (match_operand:<FCVT_IESIZE> 1 "register_operand" "r")))]
> diff --git a/gcc/config/aarch64/constraints.md b/gcc/config/aarch64/constraints.md
> index 21f9549e660..a7731a033ea 100644
> --- a/gcc/config/aarch64/constraints.md
> +++ b/gcc/config/aarch64/constraints.md
> @@ -329,6 +329,13 @@
>     (match_test "aarch64_simd_scalar_immediate_valid_for_move (op,
>                                                                QImode)")))
>
> +(define_constraint "Dt"
> +  "@internal
> + A const_double which is the reciprocal of an exact power of two, can be
> + used in an scvtf with fract bits operation"
> +  (and (match_code "const_double")
> +       (match_test "aarch64_fpconst_pow2_recip (op)")))

The test returns -1 on failure, but you're using this as a boolean
predicate (i.e. != 0).

R.

> +
>  (define_constraint "Dl"
>    "@internal
>   A constraint that matches vector of immediates for left shifts."
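The remark about the "Dt" constraint can be illustrated with a small stand-alone sketch: a helper that signals failure with -1 counts as "true" when its result is used directly as a condition. The functions below are hypothetical stand-ins written for this example, not the GCC routines.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a helper that returns a bit count on
   success and -1 on failure.  */
static int
fbits_or_minus_one (int is_valid, int n)
{
  return is_valid ? n : -1;
}

/* Buggy: -1 is non-zero, so a failed lookup is accepted as well.  */
static bool
accepts_buggy (int is_valid, int n)
{
  return fbits_or_minus_one (is_valid, n);
}

/* Fixed: compare explicitly against zero, in the same way the
   aarch64_fp_pow2_recip predicate in the patch uses "> 0".  */
static bool
accepts_fixed (int is_valid, int n)
{
  return fbits_or_minus_one (is_valid, n) > 0;
}
```

In this sketch accepts_buggy returns true even for the failure case, while accepts_fixed rejects it.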
* Re: [AArch64] Use scvtf fbits option where appropriate
  2019-06-18 10:37   ` Richard Earnshaw (lists)
@ 2019-06-18 11:12     ` Wilco Dijkstra
  2019-06-18 12:30       ` Richard Sandiford
  0 siblings, 1 reply; 11+ messages in thread
From: Wilco Dijkstra @ 2019-06-18 11:12 UTC
To: Richard Earnshaw, Joel Hutton; +Cc: nd, GCC Patches

Hi,

And a few more comments:

> +/* If X is a positive CONST_DOUBLE with a value that is the reciprocal of a
> +   power of 2 (i.e 1/2^n) return the number of float bits. e.g. for x==(1/2^n)
> +   return n. Otherwise return -1.  */
> +int
> +aarch64_fpconst_pow2_recip (rtx x)
> +{
> +  REAL_VALUE_TYPE r0;
> +
> +  if (!CONST_DOUBLE_P (x))
> +    return -1;
>
> CONST_DOUBLE can be used for things other than floating point.  You
> should really check that the mode on the double is in class MODE_FLOAT.

Several other functions (e.g. aarch64_fpconst_pow_of_2) do the same since
this function is only called with HF/SF/DF mode. We could add an assert for
SCALAR_FLOAT_MODE_P (but then aarch64_fpconst_pow_of_2 should do
the same).

> +
> +  r0 = *CONST_DOUBLE_REAL_VALUE (x);
> +  if (exact_real_inverse (DFmode, &r0)
> +      && !REAL_VALUE_NEGATIVE (r0))
> +    {
> +      int ret = exact_log2 (real_to_integer (&r0));
> +      if (ret >= 1 && ret <= 31)
> +        {
> +          return ret;
> +        }

Redundant braces

> +      else
> +        {
> +          return -1;
> +        }

The else is redundant because...

> +    }
> +  return -1;

... of this.

> +}
> +
>  /* If X is a vector of equal CONST_DOUBLE values and that value is
>     Y, return the aarch64_fpconst_pow_of_2 of Y.  Otherwise return -1.  */
>
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 526c7fb0dab..d9812aa238e 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -6016,6 +6016,44 @@
>    [(set_attr "type" "f_cvtf2i")]
>  )
>
> +;; equal width integer to fp combine
> +(define_insn "*aarch64_<su_optab>cvtf_<fcvt_target>_<GPF:mode>2_mult"
> +  [(set (match_operand:GPF 0 "register_operand" "=w,w")
> +        (mult:GPF (FLOATUORS:GPF
> +                   (match_operand:<FCVT_TARGET> 1 "register_operand" "w,?r"))
> +                   (match_operand 2 "aarch64_fp_pow2_recip""Dt,Dt")))]
>
> Missing mode on operand 2.  Missing white space between constraint and
> predicate.

Yes, operand 2 should use GPF as well (odd that this doesn't give a warning at
least). Also the indentation is off: the two multiply operands should be
indented to the same level, and match_operand 1 should be indented further to
the right.

Wilco

* Re: [AArch64] Use scvtf fbits option where appropriate
  2019-06-18 11:12     ` Wilco Dijkstra
@ 2019-06-18 12:30       ` Richard Sandiford
  2019-06-18 15:34         ` Joel Hutton
  0 siblings, 1 reply; 11+ messages in thread
From: Richard Sandiford @ 2019-06-18 12:30 UTC
To: Wilco Dijkstra; +Cc: Richard Earnshaw, Joel Hutton, nd, GCC Patches

Wilco Dijkstra <Wilco.Dijkstra@arm.com> writes:
> > +/* If X is a positive CONST_DOUBLE with a value that is the reciprocal of a
> > +   power of 2 (i.e 1/2^n) return the number of float bits. e.g. for x==(1/2^n)
> > +   return n. Otherwise return -1.  */
> > +int
> > +aarch64_fpconst_pow2_recip (rtx x)
> > +{
> > +  REAL_VALUE_TYPE r0;
> > +
> > +  if (!CONST_DOUBLE_P (x))
> > +    return -1;
> >
>> CONST_DOUBLE can be used for things other than floating point.  You
>> should really check that the mode on the double is in class MODE_FLOAT.
>
> Several other functions (eg aarch64_fpconst_pow_of_2) do the same since
> this function is only called with HF/SF/DF mode. We could add an assert for
> SCALAR_FLOAT_MODE_P (but then aarch64_fpconst_pow_of_2 should do
> the same).

IMO we should leave it as-is.  aarch64.h has:

    #define TARGET_SUPPORTS_WIDE_INT 1

which makes it invalid to use CONST_DOUBLE for anything other than
floating-point constants.  The handling of CONST_DOUBLEs with
integer modes is effectively compiled out in key places, so it
would be very hard to create one accidentally.  And even if we somehow
did, it would fail noisily in other ways.

So I think it would be redundant to assert that CONST_DOUBLE has
a float mode here, much like we (rightly) don't assert that
CONST_VECTORs have vector modes.

Thanks,
Richard

* [AArch64] Use scvtf fbits option where appropriate
  2019-06-18 12:30       ` Richard Sandiford
@ 2019-06-18 15:34         ` Joel Hutton
  2019-06-26  9:35           ` [PING][AArch64] " Joel Hutton
  0 siblings, 1 reply; 11+ messages in thread
From: Joel Hutton @ 2019-06-18 15:34 UTC
To: Richard Sandiford; +Cc: Wilco Dijkstra, Richard Earnshaw, nd, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 1431 bytes --]

On 18/06/2019 11:37, Richard Earnshaw (lists) wrote:
> Start sentences with a capital letter.  End them with a full stop.
> "inequal" isn't a word: you probably mean "unequal".

I've fixed this; the iterator is, however, defined as 'fcvt_iesize' and
described in the adjacent comment in iterators.md as 'inequal'.

I've addressed your other comments.

On 18/06/2019 13:30, Richard Sandiford wrote:
> Wilco Dijkstra <Wilco.Dijkstra@arm.com> writes:
>> > +/* If X is a positive CONST_DOUBLE with a value that is the reciprocal of a
>> > +   power of 2 (i.e 1/2^n) return the number of float bits. e.g. for x==(1/2^n)
>> > +   return n. Otherwise return -1.  */
>> > +int
>> > +aarch64_fpconst_pow2_recip (rtx x)
>> > +{
>> > +  REAL_VALUE_TYPE r0;
>> > +
>> > +  if (!CONST_DOUBLE_P (x))
>> > +    return -1;
>>> CONST_DOUBLE can be used for things other than floating point.  You
>>> should really check that the mode on the double is in class
>>> MODE_FLOAT.
>> Several other functions (eg aarch64_fpconst_pow_of_2) do the same since
>> this function is only called with HF/SF/DF mode. We could add an
>> assert for SCALAR_FLOAT_MODE_P (but then aarch64_fpconst_pow_of_2
>> should do the same).
> IMO we should leave it as-is.  aarch64.h has:

I've gone with the majority and left it as-is, but I don't have strong
feelings on it.

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-SCVTF-fbits.patch --]
[-- Type: text/x-patch; name="0001-SCVTF-fbits.patch", Size: 10833 bytes --]

From 1e44ef7e999527a0b03316cf0ea002f8d4437052 Mon Sep 17 00:00:00 2001
From: Joel Hutton <Joel.Hutton@arm.com>
Date: Thu, 13 Jun 2019 11:08:56 +0100
Subject: [PATCH] SCVTF fbits

---
 gcc/config/aarch64/aarch64-protos.h           |   1 +
 gcc/config/aarch64/aarch64.c                  |  23 +++
 gcc/config/aarch64/aarch64.md                 |  39 +++++
 gcc/config/aarch64/constraints.md             |   7 +
 gcc/config/aarch64/predicates.md              |   4 +
 gcc/testsuite/gcc.target/aarch64/fmul_scvtf.c | 140 ++++++++++++++++++
 6 files changed, 214 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/fmul_scvtf.c

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 1e3b1c91db1..ad1ba458a3f 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -494,6 +494,7 @@ enum aarch64_symbol_type aarch64_classify_tls_symbol (rtx);
 enum reg_class aarch64_regno_regclass (unsigned);
 int aarch64_asm_preferred_eh_data_format (int, int);
 int aarch64_fpconst_pow_of_2 (rtx);
+int aarch64_fpconst_pow2_recip (rtx);
 machine_mode aarch64_hard_regno_caller_save_mode (unsigned, unsigned,
                                                   machine_mode);
 int aarch64_uxt_size (int, HOST_WIDE_INT);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 9a035dd9ed8..028da32174d 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -18707,6 +18707,29 @@ aarch64_fpconst_pow_of_2 (rtx x)
   return exact_log2 (real_to_integer (r));
 }
 
+/* If X is a positive CONST_DOUBLE with a value that is the reciprocal of a
+   power of 2 (i.e 1/2^n) return the number of float bits. e.g. for x==(1/2^n)
+   return n. Otherwise return -1.  */
+
+int
+aarch64_fpconst_pow2_recip (rtx x)
+{
+  REAL_VALUE_TYPE r0;
+
+  if (!CONST_DOUBLE_P (x))
+    return -1;
+
+  r0 = *CONST_DOUBLE_REAL_VALUE (x);
+  if (exact_real_inverse (DFmode, &r0)
+      && !REAL_VALUE_NEGATIVE (r0))
+    {
+      int ret = exact_log2 (real_to_integer (&r0));
+      if (ret >= 1 && ret <= 31)
+        return ret;
+    }
+  return -1;
+}
+
 /* If X is a vector of equal CONST_DOUBLE values and that value is
    Y, return the aarch64_fpconst_pow_of_2 of Y.  Otherwise return -1.  */
 
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 526c7fb0dab..c7c6a18b0ff 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -6016,6 +6016,44 @@
   [(set_attr "type" "f_cvtf2i")]
 )
 
+;; Equal width integer to fp combine.
+(define_insn "*aarch64_<su_optab>cvtf_<fcvt_target>_<GPF:mode>2_mult"
+  [(set (match_operand:GPF 0 "register_operand" "=w,w")
+        (mult:GPF (FLOATUORS:GPF
+                   (match_operand:<FCVT_TARGET> 1 "register_operand" "w,?r"))
+                  (match_operand:GPF 2 "aarch64_fp_pow2_recip" "Dt,Dt")))]
+  "TARGET_FLOAT"
+  {
+    operands[2] = GEN_INT (aarch64_fpconst_pow2_recip (operands[2]));
+    switch (which_alternative)
+    {
+      case 0:
+        return "<su_optab>cvtf\t%<GPF:s>0, %<s>1, #%2";
+      case 1:
+        return "<su_optab>cvtf\t%<GPF:s>0, %<w1>1, #%2";
+      default:
+        gcc_unreachable ();
+    }
+  }
+  [(set_attr "type" "neon_int_to_fp_<Vetype>,f_cvti2f")
+   (set_attr "arch" "simd,fp")]
+)
+
+;; Unequal width integer to fp combine.
+(define_insn "*aarch64_<su_optab>cvtf_<fcvt_iesize>_<GPF:mode>2_mult"
+  [(set (match_operand:GPF 0 "register_operand" "=w")
+        (mult:GPF (FLOATUORS:GPF
+                   (match_operand:<FCVT_IESIZE> 1 "register_operand" "r"))
+                  (match_operand:GPF 2 "aarch64_fp_pow2_recip" "Dt")))]
+  "TARGET_FLOAT"
+  {
+    operands[2] = GEN_INT (aarch64_fpconst_pow2_recip (operands[2]));
+    return "<su_optab>cvtf\t%<GPF:s>0, %<w2>1, #%2";
+  }
+  [(set_attr "type" "f_cvti2f")]
+)
+
+;; Equal width integer to fp conversion.
 (define_insn "<optab><fcvt_target><GPF:mode>2"
   [(set (match_operand:GPF 0 "register_operand" "=w,w")
         (FLOATUORS:GPF (match_operand:<FCVT_TARGET> 1 "register_operand" "w,?r")))]
@@ -6027,6 +6065,7 @@
    (set_attr "arch" "simd,fp")]
 )
 
+;; Unequal width integer to fp conversions.
 (define_insn "<optab><fcvt_iesize><GPF:mode>2"
   [(set (match_operand:GPF 0 "register_operand" "=w")
         (FLOATUORS:GPF (match_operand:<FCVT_IESIZE> 1 "register_operand" "r")))]
diff --git a/gcc/config/aarch64/constraints.md b/gcc/config/aarch64/constraints.md
index 21f9549e660..b0caa13b435 100644
--- a/gcc/config/aarch64/constraints.md
+++ b/gcc/config/aarch64/constraints.md
@@ -329,6 +329,13 @@
    (match_test "aarch64_simd_scalar_immediate_valid_for_move (op,
                                                               QImode)")))
 
+(define_constraint "Dt"
+  "@internal
+ A const_double which is the reciprocal of an exact power of two, can be
+ used in an scvtf with fract bits operation"
+  (and (match_code "const_double")
+       (match_test "aarch64_fpconst_pow2_recip (op) > 0")))
+
 (define_constraint "Dl"
   "@internal
  A constraint that matches vector of immediates for left shifts."
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md index 10100ca830a..da295981286 100644 --- a/gcc/config/aarch64/predicates.md +++ b/gcc/config/aarch64/predicates.md @@ -98,6 +98,10 @@ (and (match_code "const_double") (match_test "aarch64_fpconst_pow_of_2 (op) > 0"))) +(define_predicate "aarch64_fp_pow2_recip" + (and (match_code "const_double") + (match_test "aarch64_fpconst_pow2_recip (op) > 0"))) + (define_predicate "aarch64_fp_vec_pow2" (match_test "aarch64_vec_fpconst_pow_of_2 (op) > 0")) diff --git a/gcc/testsuite/gcc.target/aarch64/fmul_scvtf.c b/gcc/testsuite/gcc.target/aarch64/fmul_scvtf.c new file mode 100644 index 00000000000..e8d1de6279b --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/fmul_scvtf.c @@ -0,0 +1,140 @@ +/* { dg-do run } */ +/* { dg-options "-save-temps -O2 -fno-inline" } */ + +#define FUNC_DEFS(__a) \ + float \ +fsfoo##__a (int x) \ +{ \ + return ((float) x)/(1u << __a); \ +} \ +float \ +fusfoo##__a (unsigned int x) \ +{ \ + return ((float) x)/(1u << __a); \ +} \ +float \ +fslfoo##__a (long x) \ +{ \ + return ((float) x)/(1u << __a); \ +} \ +float \ +fulfoo##__a (unsigned long x) \ +{ \ + return ((float) x)/(1u << __a); \ +} \ + +#define FUNC_DEFD(__a) \ +double \ +dsfoo##__a (int x) \ +{ \ + return ((double) x)/(1u << __a);\ +} \ +double \ +dusfoo##__a (unsigned int x) \ +{ \ + return ((double) x)/(1u << __a);\ +} \ +double \ +dslfoo##__a (long x) \ +{ \ + return ((double) x)/(1u << __a);\ +} \ +double \ +dulfoo##__a (unsigned long x) \ +{ \ + return ((double) x)/(1u << __a);\ +} + +FUNC_DEFS (4) + /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], w\[0-9\]*.*#4" 1 } } */ + /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], w\[0-9\]*.*#4" 1 } } */ + /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], x\[0-9\]*.*#4" 1 } } */ + /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], x\[0-9\]*.*#4" 1 } } */ + +FUNC_DEFD (4) + /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], w\[0-9\]*.*#4" 1 } 
} */ + /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], w\[0-9\]*.*#4" 1 } } */ + /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], x\[0-9\]*.*#4" 1 } } */ + /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], x\[0-9\]*.*#4" 1 } } */ + +FUNC_DEFS (8) + /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], w\[0-9\]*.*#8" 1 } } */ + /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], w\[0-9\]*.*#8" 1 } } */ + /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], x\[0-9\]*.*#8" 1 } } */ + /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], x\[0-9\]*.*#8" 1 } } */ + +FUNC_DEFD (8) + /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], w\[0-9\]*.*#8" 1 } } */ + /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], w\[0-9\]*.*#8" 1 } } */ + /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], x\[0-9\]*.*#8" 1 } } */ + /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], x\[0-9\]*.*#8" 1 } } */ + +FUNC_DEFS (16) + /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], w\[0-9\]*.*#16" 1 } } */ + /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], w\[0-9\]*.*#16" 1 } } */ + /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], x\[0-9\]*.*#16" 1 } } */ + /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], x\[0-9\]*.*#16" 1 } } */ + +FUNC_DEFD (16) + /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], w\[0-9\]*.*#16" 1 } } */ + /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], w\[0-9\]*.*#16" 1 } } */ + /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], x\[0-9\]*.*#16" 1 } } */ + /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], x\[0-9\]*.*#16" 1 } } */ + +FUNC_DEFS (31) + /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], w\[0-9\]*.*#31" 1 } } */ + /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], w\[0-9\]*.*#31" 1 } } */ + /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], x\[0-9\]*.*#31" 1 } } */ + /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], x\[0-9\]*.*#31" 1 } } */ + +FUNC_DEFD (31) + /* { 
dg-final { scan-assembler-times "scvtf\td\[0-9\], w\[0-9\]*.*#31" 1 } } */ + /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], w\[0-9\]*.*#31" 1 } } */ + /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], x\[0-9\]*.*#31" 1 } } */ + /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], x\[0-9\]*.*#31" 1 } } */ + +#define FUNC_TESTS(__a, __b) \ +do \ +{ \ + if (fsfoo##__a (__b) != ((int) i) * (1.0f/(1u << __a)) ) \ + __builtin_abort (); \ + if (fusfoo##__a (__b) != ((int) i) * (1.0f/(1u << __a)) ) \ + __builtin_abort (); \ + if (fslfoo##__a (__b) != ((int) i) * (1.0f/(1u << __a)) ) \ + __builtin_abort (); \ + if (fulfoo##__a (__b) != ((int) i) * (1.0f/(1u << __a)) ) \ + __builtin_abort (); \ +} while (0) + +#define FUNC_TESTD(__a, __b) \ +do \ +{ \ + if (fsfoo##__a (__b) != ((int) i) * (1.0d/(1u << __a)) ) \ + __builtin_abort (); \ + if (fusfoo##__a (__b) != ((int) i) * (1.0d/(1u << __a)) ) \ + __builtin_abort (); \ + if (fslfoo##__a (__b) != ((int) i) * (1.0d/(1u << __a)) ) \ + __builtin_abort (); \ + if (fulfoo##__a (__b) != ((int) i) * (1.0d/(1u << __a)) ) \ + __builtin_abort (); \ +} while (0) + + int +main (void) +{ + int i; + + for (i = 0; i < 32; i ++) + { + FUNC_TESTS (4, i); + FUNC_TESTS (8, i); + FUNC_TESTS (16, i); + FUNC_TESTS (31, i); + + FUNC_TESTD (4, i); + FUNC_TESTD (8, i); + FUNC_TESTD (16, i); + FUNC_TESTD (31, i); + } + return 0; +} -- 2.17.1 ^ permalink raw reply [flat|nested] 11+ messages in thread
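The check that aarch64_fpconst_pow2_recip performs in the patch above can be illustrated outside GCC. What follows is only a minimal sketch in plain C, not the patch's code: the name pow2_recip_fbits and the max_fbits parameter are hypothetical, and it works on a host double rather than on an rtx through exact_real_inverse and exact_log2. Passing 31 as max_fbits mirrors the 1..31 range accepted by this revision of the patch.

```c
/* Hypothetical standalone analogue of the patch's check (not GCC code):
   if X is exactly 1/2^n with 1 <= n <= MAX_FBITS, return n; otherwise
   return -1.  */
int
pow2_recip_fbits (double x, int max_fbits)
{
  /* Only values strictly between 0 and 1 can be 1/2^n with n >= 1.
     Written this way, the test also rejects NaN.  */
  if (!(x > 0.0) || !(x < 1.0))
    return -1;

  for (int n = 1; n <= max_fbits; n++)
    {
      x *= 2.0;		/* Exact: doubling only changes the exponent.  */
      if (x == 1.0)
	return n;	/* The original value was exactly 1/2^n.  */
      if (x > 1.0)
	return -1;	/* Overshot 1.0, so not a reciprocal power of 2.  */
    }

  return -1;		/* Reciprocal of a power of 2 larger than allowed.  */
}
```

With these assumptions, 0.0625 (1/2^4) yields 4, while 1.0, 0.3 and negative inputs all yield -1, matching the behaviour the function comment in the patch describes.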
* [PING][AArch64] Use scvtf fbits option where appropriate 2019-06-18 15:34 ` Joel Hutton @ 2019-06-26 9:35 ` Joel Hutton 2019-07-01 12:14 ` Wilco Dijkstra 2019-07-01 17:03 ` James Greenhalgh 0 siblings, 2 replies; 11+ messages in thread From: Joel Hutton @ 2019-06-26 9:35 UTC (permalink / raw) To: GCC Patches; +Cc: Richard Sandiford, Wilco Dijkstra, Richard Earnshaw, nd

[-- Attachment #1: Type: text/plain, Size: 834 bytes --]

Ping, plus minor rework (mostly non-functional changes)

gcc/ChangeLog:

2019-06-12  Joel Hutton  <Joel.Hutton@arm.com>

	* config/aarch64/aarch64-protos.h (aarch64_fpconst_pow2_recip): New prototype
	* config/aarch64/aarch64.c (aarch64_fpconst_pow2_recip): New function
	* config/aarch64/aarch64.md
	(*aarch64_<su_optab>cvtf<fcvt_target><GPF:mode>2_mult): New pattern
	(*aarch64_<su_optab>cvtf<fcvt_iesize><GPF:mode>2_mult): New pattern
	* config/aarch64/constraints.md (Dt): New constraint
	* config/aarch64/predicates.md (aarch64_fp_pow2_recip): New predicate

gcc/testsuite/ChangeLog:

2019-06-12  Joel Hutton  <Joel.Hutton@arm.com>

	* gcc.target/aarch64/fmul_scvtf_1.c: New test.

Bootstrapped and regression tested on aarch64-linux-none target.
[-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: 0001-SCVTF.patch --] [-- Type: text/x-patch; name="0001-SCVTF.patch", Size: 11300 bytes --] From e866ce55c9febd92ab8e6314bf79b067085b2d1b Mon Sep 17 00:00:00 2001 From: Joel Hutton <Joel.Hutton@arm.com> Date: Wed, 19 Jun 2019 17:24:38 +0100 Subject: [PATCH] SCVTF --- gcc/config/aarch64/aarch64-protos.h | 1 + gcc/config/aarch64/aarch64.c | 23 +++ gcc/config/aarch64/aarch64.md | 39 +++++ gcc/config/aarch64/constraints.md | 7 + gcc/config/aarch64/predicates.md | 4 + .../gcc.target/aarch64/fmul_scvtf_1.c | 140 ++++++++++++++++++ 6 files changed, 214 insertions(+) create mode 100644 gcc/testsuite/gcc.target/aarch64/fmul_scvtf_1.c diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h index 1e3b1c91db1026a44f32b144a6e97398c0659feb..ad1ba458a3fa081d83acf806776e911aa789b5d0 100644 --- a/gcc/config/aarch64/aarch64-protos.h +++ b/gcc/config/aarch64/aarch64-protos.h @@ -494,6 +494,7 @@ enum aarch64_symbol_type aarch64_classify_tls_symbol (rtx); enum reg_class aarch64_regno_regclass (unsigned); int aarch64_asm_preferred_eh_data_format (int, int); int aarch64_fpconst_pow_of_2 (rtx); +int aarch64_fpconst_pow2_recip (rtx); machine_mode aarch64_hard_regno_caller_save_mode (unsigned, unsigned, machine_mode); int aarch64_uxt_size (int, HOST_WIDE_INT); diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 9a035dd9ed8665274249581f8c404d18ae72e873..d88716576850eedd1070de108da152838c127c36 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -18707,6 +18707,29 @@ aarch64_fpconst_pow_of_2 (rtx x) return exact_log2 (real_to_integer (r)); } +/* If X is a positive CONST_DOUBLE with a value that is the reciprocal of a + power of 2 (i.e 1/2^n) return the number of float bits. e.g. for x==(1/2^n) + return n. Otherwise return -1. 
*/ + +int +aarch64_fpconst_pow2_recip (rtx x) +{ + REAL_VALUE_TYPE r0; + + if (!CONST_DOUBLE_P (x)) + return -1; + + r0 = *CONST_DOUBLE_REAL_VALUE (x); + if (exact_real_inverse (DFmode, &r0) + && !REAL_VALUE_NEGATIVE (r0)) + { + int ret = exact_log2 (real_to_integer (&r0)); + if (ret >= 1 && ret <= 32) + return ret; + } + return -1; +} + /* If X is a vector of equal CONST_DOUBLE values and that value is Y, return the aarch64_fpconst_pow_of_2 of Y. Otherwise return -1. */ diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 526c7fb0dabc540065d77d4a7922aeca16a402aa..0ccd5de3d807f079614b0076ac439c1cb8e56ab8 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -6016,6 +6016,44 @@ [(set_attr "type" "f_cvtf2i")] ) +;; Equal width integer to fp and multiply combine. +(define_insn "*aarch64_<su_optab>cvtf<fcvt_target><GPF:mode>2_mult" + [(set (match_operand:GPF 0 "register_operand" "=w,w") + (mult:GPF (FLOATUORS:GPF + (match_operand:<FCVT_TARGET> 1 "register_operand" "w,?r")) + (match_operand:GPF 2 "aarch64_fp_pow2_recip" "Dt,Dt")))] + "TARGET_FLOAT" + { + operands[2] = GEN_INT (aarch64_fpconst_pow2_recip (operands[2])); + switch (which_alternative) + { + case 0: + return "<su_optab>cvtf\t%<GPF:s>0, %<s>1, #%2"; + case 1: + return "<su_optab>cvtf\t%<GPF:s>0, %<w1>1, #%2"; + default: + gcc_unreachable (); + } + } + [(set_attr "type" "neon_int_to_fp_<Vetype>,f_cvti2f") + (set_attr "arch" "simd,fp")] +) + +;; Unequal width integer to fp and multiply combine. 
+(define_insn "*aarch64_<su_optab>cvtf<fcvt_iesize><GPF:mode>2_mult" + [(set (match_operand:GPF 0 "register_operand" "=w") + (mult:GPF (FLOATUORS:GPF + (match_operand:<FCVT_IESIZE> 1 "register_operand" "r")) + (match_operand:GPF 2 "aarch64_fp_pow2_recip" "Dt")))] + "TARGET_FLOAT" + { + operands[2] = GEN_INT (aarch64_fpconst_pow2_recip (operands[2])); + return "<su_optab>cvtf\t%<GPF:s>0, %<w2>1, #%2"; + } + [(set_attr "type" "f_cvti2f")] +) + +;; Equal width integer to fp conversion. (define_insn "<optab><fcvt_target><GPF:mode>2" [(set (match_operand:GPF 0 "register_operand" "=w,w") (FLOATUORS:GPF (match_operand:<FCVT_TARGET> 1 "register_operand" "w,?r")))] @@ -6027,6 +6065,7 @@ (set_attr "arch" "simd,fp")] ) +;; Unequal width integer to fp conversions. (define_insn "<optab><fcvt_iesize><GPF:mode>2" [(set (match_operand:GPF 0 "register_operand" "=w") (FLOATUORS:GPF (match_operand:<FCVT_IESIZE> 1 "register_operand" "r")))] diff --git a/gcc/config/aarch64/constraints.md b/gcc/config/aarch64/constraints.md index 21f9549e660868900256157ea2f7154164ddd607..b0caa13b4358e89281cd5c0a75f459ceee2040f1 100644 --- a/gcc/config/aarch64/constraints.md +++ b/gcc/config/aarch64/constraints.md @@ -329,6 +329,13 @@ (match_test "aarch64_simd_scalar_immediate_valid_for_move (op, QImode)"))) +(define_constraint "Dt" + "@internal + A const_double which is the reciprocal of an exact power of two, can be + used in an scvtf with fract bits operation" + (and (match_code "const_double") + (match_test "aarch64_fpconst_pow2_recip (op) > 0"))) + (define_constraint "Dl" "@internal A constraint that matches vector of immediates for left shifts." 
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md index 10100ca830a0cd753ef5759e3ce09914b1046d26..da295981286fb782c153037a7ee94203500e6f2a 100644 --- a/gcc/config/aarch64/predicates.md +++ b/gcc/config/aarch64/predicates.md @@ -98,6 +98,10 @@ (and (match_code "const_double") (match_test "aarch64_fpconst_pow_of_2 (op) > 0"))) +(define_predicate "aarch64_fp_pow2_recip" + (and (match_code "const_double") + (match_test "aarch64_fpconst_pow2_recip (op) > 0"))) + (define_predicate "aarch64_fp_vec_pow2" (match_test "aarch64_vec_fpconst_pow_of_2 (op) > 0")) diff --git a/gcc/testsuite/gcc.target/aarch64/fmul_scvtf_1.c b/gcc/testsuite/gcc.target/aarch64/fmul_scvtf_1.c new file mode 100644 index 0000000000000000000000000000000000000000..c4f271083dda212b3e78953356656ea97fe583db --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/fmul_scvtf_1.c @@ -0,0 +1,140 @@ +/* { dg-do run } */ +/* { dg-options "-save-temps -O2 -fno-inline" } */ + +#define FUNC_DEFS(__a) \ +float \ +fsfoo##__a (int x) \ +{ \ + return ((float) x)/(1lu << __a); \ +} \ +float \ +fusfoo##__a (unsigned int x) \ +{ \ + return ((float) x)/(1lu << __a); \ +} \ +float \ +fslfoo##__a (long x) \ +{ \ + return ((float) x)/(1lu << __a); \ +} \ +float \ +fulfoo##__a (unsigned long x) \ +{ \ + return ((float) x)/(1lu << __a); \ +} \ + +#define FUNC_DEFD(__a) \ +double \ +dsfoo##__a (int x) \ +{ \ + return ((double) x)/(1lu << __a); \ +} \ +double \ +dusfoo##__a (unsigned int x) \ +{ \ + return ((double) x)/(1lu << __a); \ +} \ +double \ +dslfoo##__a (long x) \ +{ \ + return ((double) x)/(1lu << __a); \ +} \ +double \ +dulfoo##__a (unsigned long x) \ +{ \ + return ((double) x)/(1lu << __a); \ +} + +FUNC_DEFS (4) + /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], w\[0-9\]*.*#4" 1 } } */ + /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], w\[0-9\]*.*#4" 1 } } */ + /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], x\[0-9\]*.*#4" 1 } } */ + /* { dg-final { scan-assembler-times 
"ucvtf\ts\[0-9\], x\[0-9\]*.*#4" 1 } } */ + +FUNC_DEFD (4) + /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], w\[0-9\]*.*#4" 1 } } */ + /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], w\[0-9\]*.*#4" 1 } } */ + /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], x\[0-9\]*.*#4" 1 } } */ + /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], x\[0-9\]*.*#4" 1 } } */ + +FUNC_DEFS (8) + /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], w\[0-9\]*.*#8" 1 } } */ + /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], w\[0-9\]*.*#8" 1 } } */ + /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], x\[0-9\]*.*#8" 1 } } */ + /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], x\[0-9\]*.*#8" 1 } } */ + +FUNC_DEFD (8) + /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], w\[0-9\]*.*#8" 1 } } */ + /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], w\[0-9\]*.*#8" 1 } } */ + /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], x\[0-9\]*.*#8" 1 } } */ + /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], x\[0-9\]*.*#8" 1 } } */ + +FUNC_DEFS (16) + /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], w\[0-9\]*.*#16" 1 } } */ + /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], w\[0-9\]*.*#16" 1 } } */ + /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], x\[0-9\]*.*#16" 1 } } */ + /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], x\[0-9\]*.*#16" 1 } } */ + +FUNC_DEFD (16) + /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], w\[0-9\]*.*#16" 1 } } */ + /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], w\[0-9\]*.*#16" 1 } } */ + /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], x\[0-9\]*.*#16" 1 } } */ + /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], x\[0-9\]*.*#16" 1 } } */ + +FUNC_DEFS (32) + /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], w\[0-9\]*.*#32" 1 } } */ + /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], w\[0-9\]*.*#32" 1 } } */ + /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], 
x\[0-9\]*.*#32" 1 } } */ + /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], x\[0-9\]*.*#32" 1 } } */ + +FUNC_DEFD (32) + /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], w\[0-9\]*.*#32" 1 } } */ + /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], w\[0-9\]*.*#32" 1 } } */ + /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], x\[0-9\]*.*#32" 1 } } */ + /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], x\[0-9\]*.*#32" 1 } } */ + +#define FUNC_TESTS(__a, __b) \ +do \ +{ \ + if (fsfoo##__a (__b) != ((int) i) * (1.0f/(1lu << __a)) ) \ + __builtin_abort (); \ + if (fusfoo##__a (__b) != ((int) i) * (1.0f/(1lu << __a)) ) \ + __builtin_abort (); \ + if (fslfoo##__a (__b) != ((int) i) * (1.0f/(1lu << __a)) ) \ + __builtin_abort (); \ + if (fulfoo##__a (__b) != ((int) i) * (1.0f/(1lu << __a)) ) \ + __builtin_abort (); \ +} while (0) + +#define FUNC_TESTD(__a, __b) \ +do \ +{ \ + if (dsfoo##__a (__b) != ((int) i) * (1.0d/(1lu << __a)) ) \ + __builtin_abort (); \ + if (dusfoo##__a (__b) != ((int) i) * (1.0d/(1lu << __a)) ) \ + __builtin_abort (); \ + if (dslfoo##__a (__b) != ((int) i) * (1.0d/(1lu << __a)) ) \ + __builtin_abort (); \ + if (dulfoo##__a (__b) != ((int) i) * (1.0d/(1lu << __a)) ) \ + __builtin_abort (); \ +} while (0) + +int +main (void) +{ + int i; + + for (i = 0; i < 32; i ++) + { + FUNC_TESTS (4, i); + FUNC_TESTS (8, i); + FUNC_TESTS (16, i); + FUNC_TESTS (32, i); + + FUNC_TESTD (4, i); + FUNC_TESTD (8, i); + FUNC_TESTD (16, i); + FUNC_TESTD (32, i); + } + return 0; +} -- 2.17.1 ^ permalink raw reply [flat|nested] 11+ messages in thread
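For readers unfamiliar with the fbits form of scvtf, here is a rough reference model of what the test above compares. It is a sketch under stated assumptions, not GCC or architecture reference code: the helper names fixed_to_double and divide_form are hypothetical, and the model assumes the instruction treats its integer source as a signed fixed-point value with #fbits fractional bits, so the result is the integer scaled by 2^-fbits.

```c
#include <stdint.h>

/* Hypothetical reference model: interpret X as signed fixed-point with
   FBITS fractional bits and convert it to double by scaling with
   2^-FBITS.  */
double
fixed_to_double (int32_t x, unsigned fbits)
{
  double scale = 1.0;

  for (unsigned i = 0; i < fbits; i++)
    scale *= 0.5;	/* Exact: builds 2^-fbits one halving at a time.  */

  return (double) x * scale;
}

/* The source-level shape used by the test functions: convert, then
   divide by a power of 2.  FBITS must be below 64 for the shift to be
   defined.  */
double
divide_form (int32_t x, unsigned fbits)
{
  return (double) x / (double) (1ull << fbits);
}
```

For 32-bit inputs and fbits up to 32, both forms are exact in double, so they agree bit for bit; that equality is what makes folding the multiply into the conversion safe for the cases the test covers.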
* Re: [PING][AArch64] Use scvtf fbits option where appropriate 2019-06-26 9:35 ` [PING][AArch64] " Joel Hutton @ 2019-07-01 12:14 ` Wilco Dijkstra 2019-07-01 17:03 ` James Greenhalgh 1 sibling, 0 replies; 11+ messages in thread From: Wilco Dijkstra @ 2019-07-01 12:14 UTC (permalink / raw) To: Joel Hutton, GCC Patches; +Cc: Richard Sandiford, Richard Earnshaw, nd

Hi Joel,

This looks good. One more thing: the patterns need to check
flag_trapping_math, since the division can underflow and reassociating
it would remove that.

Other than that I think this is ready, but I can't approve.

Wilco
* Re: [PING][AArch64] Use scvtf fbits option where appropriate 2019-06-26 9:35 ` [PING][AArch64] " Joel Hutton 2019-07-01 12:14 ` Wilco Dijkstra @ 2019-07-01 17:03 ` James Greenhalgh 2019-07-08 16:05 ` Joel Hutton 1 sibling, 1 reply; 11+ messages in thread From: James Greenhalgh @ 2019-07-01 17:03 UTC (permalink / raw) To: Joel Hutton Cc: GCC Patches, Richard Sandiford, Wilco Dijkstra, Richard Earnshaw, nd

On Wed, Jun 26, 2019 at 10:35:00AM +0100, Joel Hutton wrote:
> Ping, plus minor rework (mostly non-functional changes)
>
> gcc/ChangeLog:
>
> 2019-06-12  Joel Hutton  <Joel.Hutton@arm.com>
>
> 	* config/aarch64/aarch64-protos.h (aarch64_fpconst_pow2_recip): New prototype
> 	* config/aarch64/aarch64.c (aarch64_fpconst_pow2_recip): New function
> 	* config/aarch64/aarch64.md (*aarch64_<su_optab>cvtf<fcvt_target><GPF:mode>2_mult): New pattern

Cool; I learned a new instruction!

> 	(*aarch64_<su_optab>cvtf<fcvt_iesize><GPF:mode>2_mult): New pattern
> 	* config/aarch64/constraints.md (Dt): New constraint
> 	* config/aarch64/predicates.md (aarch64_fpconst_pow2_recip): New predicate
>
> gcc/testsuite/ChangeLog:
>
> 2019-06-12  Joel Hutton  <Joel.Hutton@arm.com>
>
> 	* gcc.target/aarch64/fmul_scvtf_1.c: New test.

This testcase will fail on ILP32 targets where unsigned long will still
live in a 'w' register.

Thanks,
James
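The ILP32 concern comes down to the width of long on the target. The following is a minimal sketch, not part of the patch or its testsuite, with hypothetical helper names; it only reports type widths so the mismatch between the test's 'x' register patterns and a 32-bit long can be seen directly.

```c
#include <limits.h>

/* Width in bits of 'long' on the current target.  */
int
long_bits (void)
{
  return (int) (sizeof (long) * CHAR_BIT);
}

/* Width in bits of 'long long'; the C standard guarantees at least 64.  */
int
long_long_bits (void)
{
  return (int) (sizeof (long long) * CHAR_BIT);
}

/* Nonzero when 'long' is narrower than 64 bits, as on ILP32.  In that
   case a 'long' argument fits in a 32-bit 'w' register, so assembler
   scans that expect an 'x' register source would not match.  */
int
long_fits_w_register (void)
{
  return long_bits () < 64;
}
```

On an LP64 target long_fits_w_register returns 0; on ILP32 it returns 1, whereas long_long_bits is at least 64 on both, which is why a 64-bit variant of a test is more portable when written with long long.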
* Re: [PING][AArch64] Use scvtf fbits option where appropriate 2019-07-01 17:03 ` James Greenhalgh @ 2019-07-08 16:05 ` Joel Hutton 2019-08-19 17:47 ` James Greenhalgh 0 siblings, 1 reply; 11+ messages in thread From: Joel Hutton @ 2019-07-08 16:05 UTC (permalink / raw) To: James Greenhalgh Cc: GCC Patches, Richard Sandiford, Wilco Dijkstra, Richard Earnshaw, nd [-- Attachment #1: Type: text/plain, Size: 359 bytes --] On 01/07/2019 18:03, James Greenhalgh wrote: >> gcc/testsuite/ChangeLog: >> >> 2019-06-12 Joel Hutton <Joel.Hutton@arm.com> >> >> * gcc.target/aarch64/fmul_scvtf_1.c: New test. > This testcase will fail on ILP32 targets where unsigned long will still > live in a 'w' register. Updated to use long long and unsigned long long. Joel [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: 0001-SCVTF.patch --] [-- Type: text/x-patch; name="0001-SCVTF.patch", Size: 11314 bytes --] From e10d5fdb9430799cd2050b8a2f567d1b4e43cde1 Mon Sep 17 00:00:00 2001 From: Joel Hutton <Joel.Hutton@arm.com> Date: Mon, 8 Jul 2019 11:59:50 +0100 Subject: [PATCH] SCVTF --- gcc/config/aarch64/aarch64-protos.h | 1 + gcc/config/aarch64/aarch64.c | 23 +++ gcc/config/aarch64/aarch64.md | 39 +++++ gcc/config/aarch64/constraints.md | 7 + gcc/config/aarch64/predicates.md | 4 + .../gcc.target/aarch64/fmul_scvtf_1.c | 140 ++++++++++++++++++ 6 files changed, 214 insertions(+) create mode 100644 gcc/testsuite/gcc.target/aarch64/fmul_scvtf_1.c diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h index e2f4cc19e68a79368f939cb8a83cf1f6d0412264..568c2d5846c6501c60de85cfd2fa07e0a9e5831a 100644 --- a/gcc/config/aarch64/aarch64-protos.h +++ b/gcc/config/aarch64/aarch64-protos.h @@ -494,6 +494,7 @@ enum aarch64_symbol_type aarch64_classify_tls_symbol (rtx); enum reg_class aarch64_regno_regclass (unsigned); int aarch64_asm_preferred_eh_data_format (int, int); int aarch64_fpconst_pow_of_2 (rtx); +int aarch64_fpconst_pow2_recip (rtx); 
machine_mode aarch64_hard_regno_caller_save_mode (unsigned, unsigned, machine_mode); int aarch64_uxt_size (int, HOST_WIDE_INT); diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index a18fbd0f0aa8acc000fd57af5d060961ef0a4e13..0dfcef454a1594497a6bc493d92f7b2b7335a244 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -18750,6 +18750,29 @@ aarch64_fpconst_pow_of_2 (rtx x) return exact_log2 (real_to_integer (r)); } +/* If X is a positive CONST_DOUBLE with a value that is the reciprocal of a + power of 2 (i.e 1/2^n) return the number of float bits. e.g. for x==(1/2^n) + return n. Otherwise return -1. */ + +int +aarch64_fpconst_pow2_recip (rtx x) +{ + REAL_VALUE_TYPE r0; + + if (!CONST_DOUBLE_P (x)) + return -1; + + r0 = *CONST_DOUBLE_REAL_VALUE (x); + if (exact_real_inverse (DFmode, &r0) + && !REAL_VALUE_NEGATIVE (r0)) + { + int ret = exact_log2 (real_to_integer (&r0)); + if (ret >= 1 && ret <= 32) + return ret; + } + return -1; +} + /* If X is a vector of equal CONST_DOUBLE values and that value is Y, return the aarch64_fpconst_pow_of_2 of Y. Otherwise return -1. */ diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 4d559c4c928e5949d0494bf384a9ea044cf6fc7c..1b03c1fe71630a72fd00221eb1bbde7f0ba2ac1a 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -6021,6 +6021,44 @@ [(set_attr "type" "f_cvtf2i")] ) +;; Equal width integer to fp and multiply combine. 
+(define_insn "*aarch64_<su_optab>cvtf<fcvt_target><GPF:mode>2_mult" + [(set (match_operand:GPF 0 "register_operand" "=w,w") + (mult:GPF (FLOATUORS:GPF + (match_operand:<FCVT_TARGET> 1 "register_operand" "w,?r")) + (match_operand:GPF 2 "aarch64_fp_pow2_recip" "Dt,Dt")))] + "TARGET_FLOAT" + { + operands[2] = GEN_INT (aarch64_fpconst_pow2_recip (operands[2])); + switch (which_alternative) + { + case 0: + return "<su_optab>cvtf\t%<GPF:s>0, %<s>1, #%2"; + case 1: + return "<su_optab>cvtf\t%<GPF:s>0, %<w1>1, #%2"; + default: + gcc_unreachable (); + } + } + [(set_attr "type" "neon_int_to_fp_<Vetype>,f_cvti2f") + (set_attr "arch" "simd,fp")] +) + +;; Unequal width integer to fp and multiply combine. +(define_insn "*aarch64_<su_optab>cvtf<fcvt_iesize><GPF:mode>2_mult" + [(set (match_operand:GPF 0 "register_operand" "=w") + (mult:GPF (FLOATUORS:GPF + (match_operand:<FCVT_IESIZE> 1 "register_operand" "r")) + (match_operand:GPF 2 "aarch64_fp_pow2_recip" "Dt")))] + "TARGET_FLOAT" + { + operands[2] = GEN_INT (aarch64_fpconst_pow2_recip (operands[2])); + return "<su_optab>cvtf\t%<GPF:s>0, %<w2>1, #%2"; + } + [(set_attr "type" "f_cvti2f")] +) + +;; Equal width integer to fp conversion. (define_insn "<optab><fcvt_target><GPF:mode>2" [(set (match_operand:GPF 0 "register_operand" "=w,w") (FLOATUORS:GPF (match_operand:<FCVT_TARGET> 1 "register_operand" "w,?r")))] @@ -6032,6 +6070,7 @@ (set_attr "arch" "simd,fp")] ) +;; Unequal width integer to fp conversions. 
(define_insn "<optab><fcvt_iesize><GPF:mode>2" [(set (match_operand:GPF 0 "register_operand" "=w") (FLOATUORS:GPF (match_operand:<FCVT_IESIZE> 1 "register_operand" "r")))] diff --git a/gcc/config/aarch64/constraints.md b/gcc/config/aarch64/constraints.md index 21f9549e660868900256157ea2f7154164ddd607..b0caa13b4358e89281cd5c0a75f459ceee2040f1 100644 --- a/gcc/config/aarch64/constraints.md +++ b/gcc/config/aarch64/constraints.md @@ -329,6 +329,13 @@ (match_test "aarch64_simd_scalar_immediate_valid_for_move (op, QImode)"))) +(define_constraint "Dt" + "@internal + A const_double which is the reciprocal of an exact power of two, can be + used in an scvtf with fract bits operation" + (and (match_code "const_double") + (match_test "aarch64_fpconst_pow2_recip (op) > 0"))) + (define_constraint "Dl" "@internal A constraint that matches vector of immediates for left shifts." diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md index 10100ca830a0cd753ef5759e3ce09914b1046d26..da295981286fb782c153037a7ee94203500e6f2a 100644 --- a/gcc/config/aarch64/predicates.md +++ b/gcc/config/aarch64/predicates.md @@ -98,6 +98,10 @@ (and (match_code "const_double") (match_test "aarch64_fpconst_pow_of_2 (op) > 0"))) +(define_predicate "aarch64_fp_pow2_recip" + (and (match_code "const_double") + (match_test "aarch64_fpconst_pow2_recip (op) > 0"))) + (define_predicate "aarch64_fp_vec_pow2" (match_test "aarch64_vec_fpconst_pow_of_2 (op) > 0")) diff --git a/gcc/testsuite/gcc.target/aarch64/fmul_scvtf_1.c b/gcc/testsuite/gcc.target/aarch64/fmul_scvtf_1.c new file mode 100644 index 0000000000000000000000000000000000000000..8bfe06ac3e611823afb19ddef7cb8db95f173bc8 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/fmul_scvtf_1.c @@ -0,0 +1,140 @@ +/* { dg-do run } */ +/* { dg-options "-save-temps -O2 -fno-inline" } */ + +#define FUNC_DEFS(__a) \ +float \ +fsfoo##__a (int x) \ +{ \ + return ((float) x)/(1lu << __a); \ +} \ +float \ +fusfoo##__a (unsigned int x) \ +{ \ + 
return ((float) x)/(1lu << __a); \
+} \
+float \
+fslfoo##__a (long long x) \
+{ \
+  return ((float) x)/(1lu << __a); \
+} \
+float \
+fulfoo##__a (unsigned long long x) \
+{ \
+  return ((float) x)/(1lu << __a); \
+} \
+
+#define FUNC_DEFD(__a) \
+double \
+dsfoo##__a (int x) \
+{ \
+  return ((double) x)/(1lu << __a); \
+} \
+double \
+dusfoo##__a (unsigned int x) \
+{ \
+  return ((double) x)/(1lu << __a); \
+} \
+double \
+dslfoo##__a (long long x) \
+{ \
+  return ((double) x)/(1lu << __a); \
+} \
+double \
+dulfoo##__a (unsigned long long x) \
+{ \
+  return ((double) x)/(1lu << __a); \
+}
+
+FUNC_DEFS (4)
+ /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], w\[0-9\]*.*#4" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], w\[0-9\]*.*#4" 1 } } */
+ /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], x\[0-9\]*.*#4" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], x\[0-9\]*.*#4" 1 } } */
+
+FUNC_DEFD (4)
+ /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], w\[0-9\]*.*#4" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], w\[0-9\]*.*#4" 1 } } */
+ /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], x\[0-9\]*.*#4" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], x\[0-9\]*.*#4" 1 } } */
+
+FUNC_DEFS (8)
+ /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], w\[0-9\]*.*#8" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], w\[0-9\]*.*#8" 1 } } */
+ /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], x\[0-9\]*.*#8" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], x\[0-9\]*.*#8" 1 } } */
+
+FUNC_DEFD (8)
+ /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], w\[0-9\]*.*#8" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], w\[0-9\]*.*#8" 1 } } */
+ /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], x\[0-9\]*.*#8" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], x\[0-9\]*.*#8" 1 } } */
+
+FUNC_DEFS (16)
+ /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], w\[0-9\]*.*#16" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], w\[0-9\]*.*#16" 1 } } */
+ /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], x\[0-9\]*.*#16" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], x\[0-9\]*.*#16" 1 } } */
+
+FUNC_DEFD (16)
+ /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], w\[0-9\]*.*#16" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], w\[0-9\]*.*#16" 1 } } */
+ /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], x\[0-9\]*.*#16" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], x\[0-9\]*.*#16" 1 } } */
+
+FUNC_DEFS (32)
+ /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], w\[0-9\]*.*#32" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], w\[0-9\]*.*#32" 1 } } */
+ /* { dg-final { scan-assembler-times "scvtf\ts\[0-9\], x\[0-9\]*.*#32" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\ts\[0-9\], x\[0-9\]*.*#32" 1 } } */
+
+FUNC_DEFD (32)
+ /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], w\[0-9\]*.*#32" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], w\[0-9\]*.*#32" 1 } } */
+ /* { dg-final { scan-assembler-times "scvtf\td\[0-9\], x\[0-9\]*.*#32" 1 } } */
+ /* { dg-final { scan-assembler-times "ucvtf\td\[0-9\], x\[0-9\]*.*#32" 1 } } */
+
+#define FUNC_TESTS(__a, __b) \
+do \
+{ \
+  if (fsfoo##__a (__b) != ((int) i) * (1.0f/(1lu << __a)) ) \
+    __builtin_abort (); \
+  if (fusfoo##__a (__b) != ((int) i) * (1.0f/(1lu << __a)) ) \
+    __builtin_abort (); \
+  if (fslfoo##__a (__b) != ((int) i) * (1.0f/(1lu << __a)) ) \
+    __builtin_abort (); \
+  if (fulfoo##__a (__b) != ((int) i) * (1.0f/(1lu << __a)) ) \
+    __builtin_abort (); \
+} while (0)
+
+#define FUNC_TESTD(__a, __b) \
+do \
+{ \
+  if (dsfoo##__a (__b) != ((int) i) * (1.0d/(1lu << __a)) ) \
+    __builtin_abort (); \
+  if (dusfoo##__a (__b) != ((int) i) * (1.0d/(1lu << __a)) ) \
+    __builtin_abort (); \
+  if (dslfoo##__a (__b) != ((int) i) * (1.0d/(1lu << __a)) ) \
+    __builtin_abort (); \
+  if (dulfoo##__a (__b) != ((int) i) * (1.0d/(1lu << __a)) ) \
+    __builtin_abort (); \
+} while (0)
+
+int
+main (void)
+{
+  int i;
+
+  for (i = 0; i < 32; i ++)
+    {
+      FUNC_TESTS (4, i);
+      FUNC_TESTS (8, i);
+      FUNC_TESTS (16, i);
+      FUNC_TESTS (32, i);
+
+      FUNC_TESTD (4, i);
+      FUNC_TESTD (8, i);
+      FUNC_TESTD (16, i);
+      FUNC_TESTD (32, i);
+    }
+  return 0;
+}
-- 
2.17.1
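A note on why the test above can be written with divisions even though the new insn pattern quoted earlier in the thread matches a multiplication by a reciprocal power of two: for a power-of-two divisor the two forms give the same exact result, so, as far as I understand, the compiler is free to treat the division as the multiplication. The following is only a minimal standalone sketch of that equivalence and is not part of the patch; the names scale_by_div, scale_by_mul, scaling_forms_agree and self_check are made up for illustration.

```c
#include <assert.h>

/* Hypothetical helper (not in the patch): int -> float conversion followed
   by a division by 2^fbits, the form used by the test functions.  */
float
scale_by_div (int x, int fbits)
{
  return ((float) x) / (float) (1ull << fbits);
}

/* Hypothetical helper: the same scaling written as a multiplication by the
   exact reciprocal 1/2^fbits.  */
float
scale_by_mul (int x, int fbits)
{
  return ((float) x) * (1.0f / (float) (1ull << fbits));
}

/* Return 1 if both forms agree for the fbits values used by the test
   (4, 8, 16, 32) over a small range of inputs, 0 otherwise.  */
int
scaling_forms_agree (void)
{
  static const int fbits[] = { 4, 8, 16, 32 };
  unsigned int j;
  int i;

  for (j = 0; j < sizeof fbits / sizeof fbits[0]; j++)
    for (i = -64; i <= 64; i++)
      if (scale_by_div (i, fbits[j]) != scale_by_mul (i, fbits[j]))
        return 0;
  return 1;
}

/* Small self-check: 32 / 2^4 is exactly 2.  */
void
self_check (void)
{
  assert (scaling_forms_agree ());
  assert (scale_by_div (32, 4) == 2.0f);
}
```

Calling self_check () from a main function is enough to exercise it. The inputs are kept small so that the int to float conversion is itself exact, which keeps the comparison about the scaling step only.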
* Re: [PING][AArch64] Use scvtf fbits option where appropriate
  2019-07-08 16:05 ` Joel Hutton
@ 2019-08-19 17:47 ` James Greenhalgh
  0 siblings, 0 replies; 11+ messages in thread
From: James Greenhalgh @ 2019-08-19 17:47 UTC
To: Joel Hutton
Cc: GCC Patches, Richard Sandiford, Wilco Dijkstra, Richard Earnshaw, nd

On Mon, Jul 08, 2019 at 04:41:06PM +0100, Joel Hutton wrote:
> On 01/07/2019 18:03, James Greenhalgh wrote:
>
> >> gcc/testsuite/ChangeLog:
> >>
> >> 2019-06-12 Joel Hutton <Joel.Hutton@arm.com>
> >>
> >> * gcc.target/aarch64/fmul_scvtf_1.c: New test.
> > This testcase will fail on ILP32 targets where unsigned long will still
> > live in a 'w' register.
> Updated to use long long and unsigned long long.

Sorry, this slipped through the cracks.

OK for trunk.

Thanks,
James

>
> Joel
>
> From e10d5fdb9430799cd2050b8a2f567d1b4e43cde1 Mon Sep 17 00:00:00 2001
> From: Joel Hutton <Joel.Hutton@arm.com>
> Date: Mon, 8 Jul 2019 11:59:50 +0100
> Subject: [PATCH] SCVTF
>
> ---
>  gcc/config/aarch64/aarch64-protos.h | 1 +
>  gcc/config/aarch64/aarch64.c | 23 +++
>  gcc/config/aarch64/aarch64.md | 39 +++++
>  gcc/config/aarch64/constraints.md | 7 +
>  gcc/config/aarch64/predicates.md | 4 +
>  .../gcc.target/aarch64/fmul_scvtf_1.c | 140 ++++++++++++++++++
>  6 files changed, 214 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/fmul_scvtf_1.c
>
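For context on the ILP32 remark quoted above: under LP64 a 'long' is 64 bits and so ends up in an x register, while under ILP32 (for example with -mabi=ilp32) it is 32 bits and stays in a w register; 'long long' is at least 64 bits under both, which is why switching the test to it makes the x-register patterns match everywhere. Below is a rough sketch of how one could check the widths on a given target. It is not part of the patch, and TYPE_BITS, data_model and check_widths are names invented for this illustration; it relies on the __LP64__ and __ILP32__ macros that GCC predefines.

```c
#include <assert.h>
#include <limits.h>

/* Width in bits of a type on the target being compiled for.  */
#define TYPE_BITS(type) ((int) (sizeof (type) * CHAR_BIT))

/* Hypothetical helper: name the data model using GCC's predefined macros.  */
const char *
data_model (void)
{
#if defined (__LP64__)
  return "LP64";
#elif defined (__ILP32__)
  return "ILP32";
#else
  return "other";
#endif
}

/* Check the width assumptions discussed in the review.  */
void
check_widths (void)
{
  /* ISO C guarantees at least 64 bits for long long on every target.  */
  assert (TYPE_BITS (long long) >= 64);
  assert (TYPE_BITS (unsigned long long) == TYPE_BITS (long long));
#if defined (__LP64__)
  /* LP64: long is as wide as long long.  */
  assert (TYPE_BITS (long) == 64);
#elif defined (__ILP32__)
  /* ILP32: long is only 32 bits wide.  */
  assert (TYPE_BITS (long) == 32);
#endif
}
```

Building this once with the default ABI and once with -mabi=ilp32 (where the toolchain supports it) and calling check_widths () should show the difference that made the original 'unsigned long' variants miss the x-register scan patterns.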
end of thread, other threads:[~2019-08-19 16:19 UTC | newest]

Thread overview: 11+ messages
2019-06-13 17:26 [AArch64] Use scvtf fbits option where appropriate Wilco Dijkstra
2019-06-18 9:11 ` Joel Hutton
2019-06-18 10:37 ` Richard Earnshaw (lists)
2019-06-18 11:12 ` Wilco Dijkstra
2019-06-18 12:30 ` Richard Sandiford
2019-06-18 15:34 ` Joel Hutton
2019-06-26 9:35 ` [PING][AArch64] " Joel Hutton
2019-07-01 12:14 ` Wilco Dijkstra
2019-07-01 17:03 ` James Greenhalgh
2019-07-08 16:05 ` Joel Hutton
2019-08-19 17:47 ` James Greenhalgh