From: Evandro Menezes
To: GCC Patches, Marcus Shawcroft, James Greenhalgh, Andrew Pinski, Benedikt Huber, philipp.tomsich@theobroma-systems.com, Kyrill Tkachov
Subject: Re: [AArch64] Emit square root using the Newton series
Date: Wed, 16 Mar 2016 19:45:00 -0000
Message-id: <56E9B7E1.4010601@samsung.com>
In-reply-to: <56DF4D50.4060804@samsung.com>
References: <56674D34.80806@samsung.com> <56C38D00.9000403@samsung.com> <56D8D553.6060902@samsung.com> <56DF4D50.4060804@samsung.com>

On 03/08/16 16:08, Evandro Menezes wrote:
> On 02/16/16 14:56, Evandro Menezes wrote:
>> On 12/08/15 15:35, Evandro Menezes wrote:
>>> Emit square root using the Newton series
>>>
>>> 2015-12-03  Evandro Menezes
>>>
>>> gcc/
>>>     * config/aarch64/aarch64-protos.h (aarch64_emit_swsqrt):
>>>     Declare new function.
>>>     * config/aarch64/aarch64-simd.md (sqrt2): New expansion and
>>>     insn definitions.
>>>     * config/aarch64/aarch64-tuning-flags.def
>>>     (AARCH64_EXTRA_TUNE_FAST_SQRT): New tuning macro.
>>>     * config/aarch64/aarch64.c (aarch64_emit_swsqrt): Define
>>>     new function.
>>>     * config/aarch64/aarch64.md (sqrt2): New expansion and insn
>>>     definitions.
>>>     * config/aarch64/aarch64.opt (mlow-precision-recip-sqrt):
>>>     Expand option description.
>>>     * doc/invoke.texi (mlow-precision-recip-sqrt): Likewise.
>>>
>>> This patch extends the patch that added support for implementing
>>> x^-1/2 using the Newton series by adding support for x^1/2 as well.
>>>
>>> Is it OK at this point of stage 3?
>>>
>>> Thank you,
>>>
>>
>> James,
>>
>> As I was saying, this patch results in some validation errors in
>> CPU2000 benchmarks using DF.  Although the algorithm proved to be
>> pretty solid with a vast set of random values, I'm confused as to why
>> some benchmarks fail to validate with this implementation of the
>> Newton series for square root, when they pass with the Newton series
>> for reciprocal square root.
>>
>> Since I had no problems with the same algorithm on x86-64, I wonder
>> whether the initial estimate on AArch64, which offers just 8 bits,
>> whereas x86-64 offers 11 bits, has something to do with it.
>> Then again, the algorithm iterated one fewer time on x86-64 than
>> on AArch64.
>>
>> Since it seems that the initial estimate is sufficient for CPU2000 to
>> validate when using SF, I'm leaning towards restricting the Newton
>> series for square root to SF only.
>>
>> Your thoughts on the matter are appreciated,
>
> Add choices for the reciprocal square root approximation
>
> Allow a target to prefer such operation depending on the FP
> precision.
>
> gcc/
>     * config/aarch64/aarch64-protos.h
>     (AARCH64_EXTRA_TUNE_APPROX_RSQRT): New macro.
>     * config/aarch64/aarch64-tuning-flags.def
>     (AARCH64_EXTRA_TUNE_APPROX_RSQRT_DF): New mask.
>     (AARCH64_EXTRA_TUNE_APPROX_RSQRT_SF): Likewise.
>     * config/aarch64/aarch64.c
>     (use_rsqrt_p): New argument for the mode.
>     (aarch64_builtin_reciprocal): Devise mode from builtin.
>     (aarch64_optab_supported_p): New argument for the mode.
>
> Now that the patch is attached, feedback is appreciated.

Ping.

-- 
Evandro Menezes
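
The relationship between the width of the initial estimate and the number of Newton steps discussed in the quoted messages can be explored numerically. The sketch below is not the code from the patch and does not model any hardware instruction: `estimate_rsqrt` is a hypothetical stand-in that simply rounds the exact reciprocal square root to a given number of significant bits, and all function names are made up for illustration. The 8-bit and 11-bit widths are the figures quoted in the thread. It uses the standard Newton step for 1/sqrt(x), y' = y * (1.5 - 0.5 * x * y * y), and recovers sqrt(x) as x * y, everything computed in Python's double precision.

```python
import math


def estimate_rsqrt(x, bits):
    """Hypothetical initial estimate: 1/sqrt(x) rounded to `bits` significant bits."""
    exact = 1.0 / math.sqrt(x)
    mantissa, exponent = math.frexp(exact)  # exact == mantissa * 2**exponent
    scale = 1 << bits
    return math.ldexp(round(mantissa * scale) / scale, exponent)


def newton_rsqrt(x, bits, steps):
    """Refine the estimate of 1/sqrt(x) with `steps` Newton iterations."""
    y = estimate_rsqrt(x, bits)
    for _ in range(steps):
        # Each step roughly squares the relative error of y.
        y = y * (1.5 - 0.5 * x * y * y)
    return y


def newton_sqrt(x, bits, steps):
    """Approximate sqrt(x) as x * (1/sqrt(x)); valid for x > 0 only."""
    return x * newton_rsqrt(x, bits, steps)


def rel_error(approx, exact):
    """Relative error of `approx` with respect to `exact`."""
    return abs(approx - exact) / exact


if __name__ == "__main__":
    x = 3.0
    for bits in (8, 11):
        for steps in range(4):
            err = rel_error(newton_sqrt(x, bits, steps), math.sqrt(x))
            print(f"bits={bits:2d} steps={steps} rel_error={err:.3e}")
```

Because the error is roughly squared at each step, a narrower initial estimate may need an extra iteration to reach the same target precision, and a wider target format such as DF needs more steps than SF. This sketch ignores special cases (zero, infinities, denormals) and rounding behaviour in narrower formats, so it cannot by itself explain the benchmark validation failures described above.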