From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 52023 invoked by alias); 9 Dec 2015 17:16:24 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 52012 invoked by uid 89); 9 Dec 2015 17:16:24 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-0.3 required=5.0 tests=AWL,BAYES_50,KAM_MANYTO,SPF_PASS autolearn=no version=3.3.2 X-HELO: eu-smtp-delivery-143.mimecast.com Received: from eu-smtp-delivery-143.mimecast.com (HELO eu-smtp-delivery-143.mimecast.com) (146.101.78.143) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 09 Dec 2015 17:16:22 +0000 Received: from cam-owa1.Emea.Arm.com (fw-tnat.cambridge.arm.com [217.140.96.140]) by eu-smtp-1.mimecast.com with ESMTP id uk-mta-31-ILyVZecZQu2Gl6JLrluIsg-1; Wed, 09 Dec 2015 17:16:16 +0000 Received: from [10.2.206.200] ([10.1.2.79]) by cam-owa1.Emea.Arm.com with Microsoft SMTPSVC(6.0.3790.3959); Wed, 9 Dec 2015 17:16:15 +0000 Message-ID: <566861DF.6050904@arm.com> Date: Wed, 09 Dec 2015 17:16:00 -0000 From: Kyrill Tkachov User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0 MIME-Version: 1.0 To: Evandro Menezes , GCC Patches , Marcus Shawcroft , James Greenhalgh , Andrew Pinski , Benedikt Huber , philipp.tomsich@theobroma-systems.com Subject: Re: [AArch64] Emit square root using the Newton series References: <56674D34.80806@samsung.com> <56685C62.9070705@arm.com> <56685DF2.4040306@samsung.com> <56685EBD.5040401@arm.com> In-Reply-To: <56685EBD.5040401@arm.com> X-MC-Unique: ILyVZecZQu2Gl6JLrluIsg-1 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: quoted-printable X-IsSubscribed: yes X-SW-Source: 2015-12/txt/msg01035.txt.bz2 On 09/12/15 17:02, Kyrill Tkachov wrote: > > On 09/12/15 16:59, Evandro Menezes wrote: >> On 12/09/2015 10:52 AM, Kyrill Tkachov wrote: >>> Hi Evandro, >>> >>> On 08/12/15 21:35, Evandro Menezes wrote: >>>> Emit square root using the Newton series >>>> >>>> 2015-12-03 Evandro Menezes >>>> >>>> gcc/ >>>> * config/aarch64/aarch64-protos.h (aarch64_emit_swsqrt): >>>> Declare new >>>> function. >>>> * config/aarch64/aarch64-simd.md (sqrt2): New >>>> expansion and >>>> insn definitions. >>>> * config/aarch64/aarch64-tuning-flags.def >>>> (AARCH64_EXTRA_TUNE_FAST_SQRT): New tuning macro. >>>> * config/aarch64/aarch64.c (aarch64_emit_swsqrt): Define >>>> new function. >>>> * config/aarch64/aarch64.md (sqrt2): New expansion >>>> and insn >>>> definitions. >>>> * config/aarch64/aarch64.opt (mlow-precision-recip-sqrt): >>>> Expand option >>>> description. >>>> * doc/invoke.texi (mlow-precision-recip-sqrt): Likewise. >>>> >>>> This patch extends the patch that added support for implementing x^-1/= 2 using the Newton series by adding support for x^1/2 as well. >>>> >>>> Is it OK at this point of stage 3? >>>> >>> >>> A comment on the patch itself from me... >>> >>> diff --git a/gcc/config/aarch64/aarch64-tuning-flags.def b/gcc/config/a= arch64/aarch64-tuning-flags.def >>> index 6f7dbce..11c6c9a 100644 >>> --- a/gcc/config/aarch64/aarch64-tuning-flags.def >>> +++ b/gcc/config/aarch64/aarch64-tuning-flags.def >>> @@ -30,4 +30,4 @@ >>> >>> AARCH64_EXTRA_TUNING_OPTION ("rename_fma_regs", RENAME_FMA_REGS) >>> AARCH64_EXTRA_TUNING_OPTION ("recip_sqrt", RECIP_SQRT) >>> - >>> +AARCH64_EXTRA_TUNING_OPTION ("fast_sqrt", FAST_SQRT) >>> >>> That seems like a misleading name to me. >>> If we're doing this, that means that the sqrt instruction is not faster >>> than doing the inverse sqrt estimation followed by a multiply. >>> I think a name like "synth_sqrt" or "estimate_sqrt" or something along = those lines >>> is more appropriate. >> >> Unfortunately, this is the case on Exynos M1: the series is faster than = the instruction. :-( So, other targets when this is also true, using the "= fast_sqrt" option might make sense. >> > > Sure, but the way your patch is written, we will emit the series when "fa= st_sqrt" is set, rather > than the other way around, unless I'm misreading the logic in: > Sorry, what I meant to say is it would be clearer, IMO, to describe the com= piler action that is being taken (e.g. the rename_fma_regs tuning flag), in this case it's estimating sqrt u= sing a series. Kyrill > diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarc= h64-simd.md > index 030a101..f6d2da4 100644 > --- a/gcc/config/aarch64/aarch64-simd.md > +++ b/gcc/config/aarch64/aarch64-simd.md > @@ -4280,7 +4280,23 @@ > > ;; sqrt > > -(define_insn "sqrt2" > +(define_expand "sqrt2" > + [(set (match_operand:VDQF 0 "register_operand") > + (sqrt:VDQF (match_operand:VDQF 1 "register_operand")))] > + "TARGET_SIMD" > +{ > + if ((AARCH64_EXTRA_TUNE_FAST_SQRT & aarch64_tune_params.extra_tuning_f= lags) > + && !optimize_function_for_size_p (cfun) > + && flag_finite_math_only > + && !flag_trapping_math > + && flag_unsafe_math_optimizations) > + { > + aarch64_emit_swsqrt (operands[0], operands[1]); > + DONE; > + } > +}) > > > Thanks, > Kyrill > >