From: Evandro Menezes
To: 'Benedikt Huber', "'Dr. Philipp Tomsich'"
Cc: gcc-patches@gcc.gnu.org
Subject: RE: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt) estimation in -ffast-math
Date: Wed, 24 Jun 2015 18:37:00 -0000
Message-id: <02a401d0aeaa$5a5e7ec0$0f1b7c40$@samsung.com>
References: <1434629045-24650-1-git-send-email-benedikt.huber@theobroma-systems.com>
 <027701d0ae9c$b8f3eff0$2adbcfd0$@samsung.com>
 <56A9A836-05BF-409C-A8D4-91B7ABEC5EE9@theobroma-systems.com>

Benedikt,

Are you developing the reciprocal approximation just for 1/x proper or for
any division, as in x/y = x * 1/y?

Thank you,

--
Evandro Menezes Austin, TX

> -----Original Message-----
> From: Benedikt Huber [mailto:benedikt.huber@theobroma-systems.com]
> Sent: Wednesday, June 24, 2015 12:11
> To: Dr. Philipp Tomsich
> Cc: Evandro Menezes; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt)
> estimation in -ffast-math
>
> Evandro,
>
> Yes, we also have the 1/x approximation.
> However we do not have the test cases yet, and it also would need some
> clean up.
> I am going to provide a patch for that soon (say next week).
> Also, for this optimization we have *not* yet found a benchmark with
> significant improvements.
>
> Best Regards,
> Benedikt
>
>
> > On 24 Jun 2015, at 18:52, Dr. Philipp Tomsich wrote:
> >
> > Evandro,
> >
> > We’ve seen a 28% speed-up on gromacs in SPECfp for the (scalar)
> > reciprocal sqrt.
> >
> > Also, the “reciprocal divide” patches are floating around in various
> > of our git-tree, but aren’t ready for public consumption, yet… I’ll
> > leave Benedikt to comment on potential timelines for getting that
> > pushed out.
> >
> > Best,
> > Philipp.
> >
> >> On 24 Jun 2015, at 18:42, Evandro Menezes wrote:
> >>
> >> Benedikt,
> >>
> >> You beat me to it! :-) Do you have the implementation for dividing
> >> using the Newton series as well?
> >>
> >> I'm not sure that the series is always for all data types and on all
> >> processors. It would be useful to allow each AArch64 processor to
> >> enable this or not depending on the data type. BTW, do you have some
> >> tests showing the speed up?
> >>
> >> Thank you,
> >>
> >> --
> >> Evandro Menezes Austin, TX
> >>
> >>> -----Original Message-----
> >>> From: gcc-patches-owner@gcc.gnu.org
> >>> [mailto:gcc-patches-owner@gcc.gnu.org] On Behalf Of Benedikt Huber
> >>> Sent: Thursday, June 18, 2015 7:04
> >>> To: gcc-patches@gcc.gnu.org
> >>> Cc: benedikt.huber@theobroma-systems.com;
> >>> philipp.tomsich@theobroma-systems.com
> >>> Subject: [PATCH] [aarch64] Implemented reciprocal square root
> >>> (rsqrt) estimation in -ffast-math
> >>>
> >>> AArch64 offers the instructions frsqrte and frsqrts, for rsqrt
> >>> estimation and a Newton-Raphson step, respectively.
> >>> There are ARMv8 implementations where this is faster than using fdiv
> >>> and rsqrt.
> >>> It runs three steps for double and two steps for float to achieve
> >>> the needed precision.
> >>>
> >>> There is one caveat and open question.
> >>> Since -ffast-math enables flush to zero, intermediate values between
> >>> approximation steps will be flushed to zero if they are denormal.
> >>> E.g., this happens in the case of rsqrt (DBL_MAX) and rsqrtf (FLT_MAX).
> >>> The test cases pass, but it is unclear to me whether this is
> >>> expected behavior with -ffast-math.
> >>>
> >>> The patch applies to commit:
> >>> svn+ssh://gcc.gnu.org/svn/gcc/trunk@224470
> >>>
> >>> Please consider including this patch.
> >>> Thank you and best regards,
> >>> Benedikt Huber
> >>>
> >>> Benedikt Huber (1):
> >>>   2015-06-15 Benedikt Huber
> >>>
> >>>  gcc/ChangeLog                            |   9 +++
> >>>  gcc/config/aarch64/aarch64-builtins.c    |  60 ++++++++++++++++
> >>>  gcc/config/aarch64/aarch64-protos.h      |   2 +
> >>>  gcc/config/aarch64/aarch64-simd.md       |  27 ++++++++
> >>>  gcc/config/aarch64/aarch64.c             |  63 +++++++++++++++++
> >>>  gcc/config/aarch64/aarch64.md            |   3 +
> >>>  gcc/testsuite/gcc.target/aarch64/rsqrt.c | 113 +++++++++++++++++++++++++++++++
> >>>  7 files changed, 277 insertions(+)
> >>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/rsqrt.c
> >>>
> >>> --
> >>> 1.9.1
> >>
> >
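
The iteration described in the quoted patch (an initial estimate followed by
Newton-Raphson steps, three for double and two for float) can be sketched in
portable C. This is only an illustration under stated assumptions, not the
code from the patch. step_d/step_f model the arithmetic that frsqrts
performs, (3 - a*b) / 2. estimate_d/estimate_f are hypothetical stand-ins for
frsqrte, built from a well-known bit-level guess refined once, on the
assumption that this is roughly comparable to a hardware estimate. All
function names are made up for this example, and inputs are assumed to be
positive, normal, and finite.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of the frsqrts arithmetic: (3 - a*b) / 2. */
static double step_d(double a, double b) { return (3.0 - a * b) / 2.0; }
static float  step_f(float a, float b)   { return (3.0f - a * b) / 2.0f; }

/* Hypothetical stand-in for frsqrte (double): a bit-level initial guess,
   refined once so it is accurate to within a fraction of a percent. */
static double estimate_d(double x)
{
    uint64_t bits;
    double y;
    memcpy(&bits, &x, sizeof bits);
    bits = UINT64_C(0x5FE6EB50C7B537A9) - (bits >> 1);
    memcpy(&y, &bits, sizeof y);
    return y * step_d(x, y * y);
}

/* Hypothetical stand-in for frsqrte (float), same idea as above. */
static float estimate_f(float x)
{
    uint32_t bits;
    float y;
    memcpy(&bits, &x, sizeof bits);
    bits = UINT32_C(0x5F3759DF) - (bits >> 1);
    memcpy(&y, &bits, sizeof y);
    return y * step_f(x, y * y);
}

/* 1/sqrt(x) for double: estimate, then three Newton-Raphson steps. */
double rsqrt_nr(double x)
{
    assert(x > 0.0);
    double y = estimate_d(x);
    for (int i = 0; i < 3; i++)
        y = y * step_d(x, y * y);   /* y <- y * (3 - x*y*y) / 2 */
    return y;
}

/* 1/sqrt(x) for float: estimate, then two Newton-Raphson steps. */
float rsqrtf_nr(float x)
{
    assert(x > 0.0f);
    float y = estimate_f(x);
    for (int i = 0; i < 2; i++)
        y = y * step_f(x, y * y);
    return y;
}
```

Each step roughly doubles the number of correct bits, which is consistent
with double needing one more step than float. As a quick check,
rsqrt_nr(4.0) and rsqrtf_nr(4.0f) should both come out very close to 0.5.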