From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-419930-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 6391 invoked by alias); 25 Jan 2016 11:21:33 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 6380 invoked by uid 89); 25 Jan 2016 11:21:33 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.4 required=5.0 tests=AWL,BAYES_00,RP_MATCHES_RCVD,SPF_PASS autolearn=ham version=3.3.2 spammy=peak, judging, predictable, characteristics
X-HELO: cam-smtp0.cambridge.arm.com
Received: from fw-tnat.cambridge.arm.com (HELO cam-smtp0.cambridge.arm.com) (217.140.96.140) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-SHA encrypted) ESMTPS; Mon, 25 Jan 2016 11:21:32 +0000
Received: from arm.com (e107456-lin.cambridge.arm.com [10.2.206.78])	by cam-smtp0.cambridge.arm.com (8.13.8/8.13.8) with ESMTP id u0PBLPrT032518;	Mon, 25 Jan 2016 11:21:25 GMT
Date: Mon, 25 Jan 2016 11:21:00 -0000
From: James Greenhalgh <james.greenhalgh@arm.com>
To: gcc-patches@gcc.gnu.org
Cc: nd@arm.com, marcus.shawcroft@arm.com, richard.earnshaw@arm.com,        Venkataramanan.Kumar@amd.com, philipp.tomsich@theobroma-systems.com,        pinskia@gmail.com, Kyrylo.Tkachov@arm.com, e.menezes@samsung.com
Subject: Re: [Patch AArch64] Use software sqrt expansion always for -mlow-precision-recip-sqrt
Message-ID: <20160125112124.GB8599@arm.com>
References: <1452513219-25168-1-git-send-email-james.greenhalgh@arm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1452513219-25168-1-git-send-email-james.greenhalgh@arm.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-IsSubscribed: yes
X-SW-Source: 2016-01/txt/msg01860.txt.bz2

On Mon, Jan 11, 2016 at 11:53:39AM +0000, James Greenhalgh wrote:
> 
> Hi,
> 
> I'd like to switch the logic around in aarch64.c such that
> -mlow-precision-recip-sqrt causes us to always emit the low-precision
> software expansion for reciprocal square root. I have two reasons to do
> this: the first is consistency across -mcpu targets; the second is
> enabling more -mcpu targets to use the flag for peak tuning.
> 
> I don't much like that the precision we use for -mlow-precision-recip-sqrt
> differs between cores (and possibly compiler revisions). Yes, we're
> under -ffast-math, but I take this flag to mean the user explicitly wants the
> low-precision expansion, and we should not diverge from that based on an
> internal decision as to what is optimal for performance in the
> high-precision case. I'd prefer to keep things as predictable as possible,
> and here that means always emitting the low-precision expansion when asked.
> 
> Judging by the comments in the thread proposing the reciprocal square
> root optimisation, this will benefit all cores currently supported by GCC.
> To be clear, we would still not expand in the high-precision case for any
> cores which do not explicitly ask for it. Currently that is Cortex-A57
> and xgene, though I will be proposing a patch to remove Cortex-A57 from
> that list shortly.
> 
> That brings me to my second motivation for this patch.
> -mlow-precision-recip-sqrt is intended as a tuning flag for situations
> where performance is more important than precision, but the current
> logic requires setting an internal flag which also changes the
> performance characteristics where high precision is needed. This
> conflates two decisions the target might
> want to make, and reduces the applicability of an option targets might
> want to enable for performance. In particular, I'd still like to see
> -mlow-precision-recip-sqrt continue to emit the cheaper, low-precision
> sequence for floats under Cortex-A57.
> 
> Based on that reasoning, this patch makes the appropriate change to the
> logic. I've checked with the current -mcpu values to ensure that behaviour
> without -mlow-precision-recip-sqrt does not change, and that behaviour
> with -mlow-precision-recip-sqrt is to emit the low-precision sequences.
> 
> I've also put this through bootstrap and test on aarch64-none-linux-gnu
> with no issues.
> 
> OK?

*Ping*

Thanks,
James

> 2015-12-10  James Greenhalgh  <james.greenhalgh@arm.com>
> 
> 	* config/aarch64/aarch64.c (use_rsqrt_p): Always use software
> 	reciprocal sqrt for -mlow-precision-recip-sqrt.
> 
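
For anyone who wants to see the change in behaviour without reading the
diff, here is a minimal standalone sketch of the predicate before and after
the patch. It is only a model: struct rsqrt_state and its fields are
hypothetical stand-ins for flag_trapping_math,
flag_unsafe_math_optimizations, the AARCH64_EXTRA_TUNE_RECIP_SQRT tuning bit
and the -mlow-precision-recip-sqrt option, not the real GCC declarations.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state use_rsqrt_p consults.  These fields are
   illustrative stand-ins, not GCC's actual variables.  */
struct rsqrt_state
{
  bool trapping_math;            /* models flag_trapping_math */
  bool unsafe_math;              /* models flag_unsafe_math_optimizations */
  bool tune_recip_sqrt;          /* models the AARCH64_EXTRA_TUNE_RECIP_SQRT bit */
  bool low_precision_requested;  /* models -mlow-precision-recip-sqrt */
};

/* Old logic: only the per-core tuning bit enables the software expansion.  */
static bool
use_rsqrt_before (const struct rsqrt_state *s)
{
  return !s->trapping_math && s->unsafe_math && s->tune_recip_sqrt;
}

/* New logic: the tuning bit or an explicit user request enables it.  */
static bool
use_rsqrt_after (const struct rsqrt_state *s)
{
  return (!s->trapping_math
	  && s->unsafe_math
	  && (s->tune_recip_sqrt || s->low_precision_requested));
}
```

In this model, a core without the tuning bit that is compiled with the
low-precision option goes from "no expansion" to "expansion", while the
result without the option is the same under both versions.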

> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 9142ac0..1d5d898 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -7485,8 +7485,9 @@ use_rsqrt_p (void)
>  {
>    return (!flag_trapping_math
>  	  && flag_unsafe_math_optimizations
> -	  && (aarch64_tune_params.extra_tuning_flags
> -	      & AARCH64_EXTRA_TUNE_RECIP_SQRT));
> +	  && ((aarch64_tune_params.extra_tuning_flags
> +	       & AARCH64_EXTRA_TUNE_RECIP_SQRT)
> +	      || flag_mrecip_low_precision_sqrt));
>  }
>  
>  /* Function to decide when to use