public inbox for gcc-patches@gcc.gnu.org
* [Patch AArch64] Use software sqrt expansion always for -mlow-precision-recip-sqrt
@ 2016-01-11 11:53 James Greenhalgh
From: James Greenhalgh @ 2016-01-11 11:53 UTC (permalink / raw)
  To: gcc-patches
  Cc: nd, marcus.shawcroft, richard.earnshaw, Venkataramanan.Kumar,
	philipp.tomsich, pinskia, Kyrylo.Tkachov, e.menezes

[-- Attachment #1: Type: text/plain, Size: 2453 bytes --]


Hi,

I'd like to switch the logic around in aarch64.c such that
-mlow-precision-recip-sqrt causes us to always emit the low-precision
software expansion for reciprocal square root. I have two reasons for
this: the first is consistency across -mcpu targets, and the second is
enabling more -mcpu targets to use the flag for peak tuning.

I don't much like that the precision we use for -mlow-precision-recip-sqrt
differs between cores (and possibly compiler revisions). Yes, we're
under -ffast-math, but I take this flag to mean the user explicitly wants the
low-precision expansion, and we should not diverge from that based on an
internal decision as to what is optimal for performance in the
high-precision case. I'd prefer to keep things as predictable as possible,
and here that means always emitting the low-precision expansion when asked.
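
To illustrate what "low precision" trades away, here is a minimal sketch,
assuming the usual scheme of a coarse initial estimate refined by
Newton-Raphson steps, where a low-precision variant simply runs fewer
steps. The function name and the starting estimate are made up for the
example; this is not the expansion GCC emits.

```c
/* Illustrative only: refine an estimate y of 1/sqrt(x) with
   Newton-Raphson steps, y' = y * (1.5 - 0.5 * x * y * y).
   rsqrt_refine is a hypothetical name, not a GCC function.  */
static double
rsqrt_refine (double x, double estimate, int steps)
{
  double y = estimate;

  /* Each step roughly doubles the number of correct bits, so dropping
     a step is cheaper but noticeably less accurate.  */
  for (int i = 0; i < steps; i++)
    y = y * (1.5 - 0.5 * x * y * y);

  return y;
}
```

For x = 4.0 and a deliberately rough estimate of 0.45, one step lands
near 0.4928 and two steps near 0.4998, against an exact value of 0.5.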

Judging by the comments in the thread proposing the reciprocal square
root optimisation, this will benefit all cores currently supported by GCC.
To be clear, we would still not expand in the high-precision case for any
cores which do not explicitly ask for it. Currently the only cores which
ask for it are Cortex-A57 and xgene, though I will be proposing a patch to
remove Cortex-A57 from that list shortly.

That brings me to my second motivation for this patch.
-mlow-precision-recip-sqrt is intended as a tuning flag for situations where
performance is more important than precision, but the current logic requires
setting an internal flag which also changes the performance characteristics
where high precision is needed. This conflates two decisions the target might
want to make, and reduces the applicability of an option targets might
want to enable for performance. In particular, I'd still like to see
-mlow-precision-recip-sqrt continue to emit the cheaper, low-precision
sequence for floats under Cortex-A57.

Based on that reasoning, this patch makes the appropriate change to the
logic. I've checked with the current -mcpu values to ensure that behaviour
without -mlow-precision-recip-sqrt does not change, and that behaviour
with -mlow-precision-recip-sqrt is to emit the low-precision sequences.
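
The intended behaviour can be modelled outside the compiler. This is a
sketch only: struct rsqrt_options and both function names are
hypothetical stand-ins for the GCC globals that use_rsqrt_p reads, with
each field commented with the flag it is meant to represent.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the state read by use_rsqrt_p.  */
struct rsqrt_options
{
  bool trapping_math;            /* models flag_trapping_math */
  bool unsafe_math;              /* models flag_unsafe_math_optimizations */
  bool tune_recip_sqrt;          /* models AARCH64_EXTRA_TUNE_RECIP_SQRT */
  bool low_precision_recip_sqrt; /* models flag_mrecip_low_precision_sqrt */
};

/* Current logic: only the tuning flag enables the expansion.  */
static bool
use_rsqrt_before (const struct rsqrt_options *o)
{
  return !o->trapping_math && o->unsafe_math && o->tune_recip_sqrt;
}

/* Proposed logic: the user-visible flag enables it as well.  */
static bool
use_rsqrt_after (const struct rsqrt_options *o)
{
  return (!o->trapping_math
	  && o->unsafe_math
	  && (o->tune_recip_sqrt || o->low_precision_recip_sqrt));
}
```

With low_precision_recip_sqrt false the two functions agree for every
combination of the other fields; they differ only when the flag is set
on a target whose tuning does not request the expansion.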

I've also put this through bootstrap and test on aarch64-none-linux-gnu
with no issues.

OK?

Thanks,
James

---
2015-12-10  James Greenhalgh  <james.greenhalgh@arm.com>

	* config/aarch64/aarch64.c (use_rsqrt_p): Always use software
	reciprocal sqrt for -mlow-precision-recip-sqrt.


[-- Attachment #2: 0001-Patch-AArch64-Use-software-sqrt-expansion-always-for.patch --]
[-- Type: text/x-patch;  name=0001-Patch-AArch64-Use-software-sqrt-expansion-always-for.patch, Size: 567 bytes --]

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 9142ac0..1d5d898 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -7485,8 +7485,9 @@ use_rsqrt_p (void)
 {
   return (!flag_trapping_math
 	  && flag_unsafe_math_optimizations
-	  && (aarch64_tune_params.extra_tuning_flags
-	      & AARCH64_EXTRA_TUNE_RECIP_SQRT));
+	  && ((aarch64_tune_params.extra_tuning_flags
+	       & AARCH64_EXTRA_TUNE_RECIP_SQRT)
+	      || flag_mrecip_low_precision_sqrt));
 }
 
 /* Function to decide when to use


