From: Evandro Menezes
To: "'Dr. Philipp Tomsich'"
Cc: 'Benedikt Huber' , gcc-patches@gcc.gnu.org
References: <1434629045-24650-1-git-send-email-benedikt.huber@theobroma-systems.com>
 <027701d0ae9c$b8f3eff0$2adbcfd0$@samsung.com>
 <56A9A836-05BF-409C-A8D4-91B7ABEC5EE9@theobroma-systems.com>
 <02a401d0aeaa$5a5e7ec0$0f1b7c40$@samsung.com>
 <3F0CA634-AF3D-4AD9-8702-9B5D19821889@theobroma-systems.com>
In-reply-to: <3F0CA634-AF3D-4AD9-8702-9B5D19821889@theobroma-systems.com>
Subject: RE: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt) estimation in -ffast-math
Date: Wed, 24 Jun 2015 20:54:00 -0000
Message-id: <02c701d0aebd$de09f1b0$9a1dd510$@samsung.com>

Philipp,

I think that execute_cse_reciprocals_1() applies only when the denominator
is known at compile-time, otherwise the division stays.  It doesn't seem to
know whether the target supports the approximate reciprocal or not.

Cheers,

-- 
Evandro Menezes              Austin, TX

> -----Original Message-----
> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-owner@gcc.gnu.org] On
> Behalf Of Dr. Philipp Tomsich
> Sent: Wednesday, June 24, 2015 15:08
> To: Evandro Menezes
> Cc: Benedikt Huber; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt)
> estimation in -ffast-math
>
> Evandro,
>
> Shouldn't ‘execute_cse_reciprocals_1’ take care of this, once the
> reciprocal-division is implemented?
> Do you think there’s additional work needed to catch all cases/opportunities?
>
> Best,
> Philipp.
>
> > On 24 Jun 2015, at 20:19, Evandro Menezes wrote:
> >
> > Benedikt,
> >
> > Are you developing the reciprocal approximation just for 1/x proper or for
> > any division, as in x/y = x * 1/y?
> >
> > Thank you,
> >
> > --
> > Evandro Menezes              Austin, TX
> >
> >
> >> -----Original Message-----
> >> From: Benedikt Huber [mailto:benedikt.huber@theobroma-systems.com]
> >> Sent: Wednesday, June 24, 2015 12:11
> >> To: Dr. Philipp Tomsich
> >> Cc: Evandro Menezes; gcc-patches@gcc.gnu.org
> >> Subject: Re: [PATCH] [aarch64] Implemented reciprocal square root
> >> (rsqrt) estimation in -ffast-math
> >>
> >> Evandro,
> >>
> >> Yes, we also have the 1/x approximation.
> >> However we do not have the test cases yet, and it also would need
> >> some clean up.
> >> I am going to provide a patch for that soon (say next week).
> >> Also, for this optimization we have *not* yet found a benchmark with
> >> significant improvements.
> >>
> >> Best Regards,
> >> Benedikt
> >>
> >>
> >>> On 24 Jun 2015, at 18:52, Dr. Philipp Tomsich wrote:
> >>>
> >>> Evandro,
> >>>
> >>> We’ve seen a 28% speed-up on gromacs in SPECfp for the (scalar)
> >>> reciprocal sqrt.
> >>>
> >>> Also, the “reciprocal divide” patches are floating around in various
> >>> of our git-tree, but aren’t ready for public consumption, yet… I’ll
> >>> leave Benedikt to comment on potential timelines for getting that
> >>> pushed out.
> >>>
> >>> Best,
> >>> Philipp.
> >>>
> >>>> On 24 Jun 2015, at 18:42, Evandro Menezes wrote:
> >>>>
> >>>> Benedikt,
> >>>>
> >>>> You beat me to it! :-)  Do you have the implementation for dividing
> >>>> using the Newton series as well?
> >>>>
> >>>> I'm not sure that the series is always for all data types and on
> >>>> all processors.  It would be useful to allow each AArch64 processor
> >>>> to enable this or not depending on the data type.  BTW, do you have
> >>>> some tests showing the speed up?
> >>>>
> >>>> Thank you,
> >>>>
> >>>> --
> >>>> Evandro Menezes              Austin, TX
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: gcc-patches-owner@gcc.gnu.org
> >>>>> [mailto:gcc-patches-owner@gcc.gnu.org] On Behalf Of Benedikt Huber
> >>>>> Sent: Thursday, June 18, 2015 7:04
> >>>>> To: gcc-patches@gcc.gnu.org
> >>>>> Cc: benedikt.huber@theobroma-systems.com;
> >>>>> philipp.tomsich@theobroma-systems.com
> >>>>> Subject: [PATCH] [aarch64] Implemented reciprocal square root
> >>>>> (rsqrt) estimation in -ffast-math
> >>>>>
> >>>>> aarch64 offers the instructions frsqrte and frsqrts, for rsqrt
> >>>>> estimation and a Newton-Raphson step, respectively.
> >>>>> There are ARMv8 implementations where this is faster than using
> >>>>> fdiv and rsqrt.
> >>>>> It runs three steps for double and two steps for float to achieve
> >>>>> the needed precision.
> >>>>>
> >>>>> There is one caveat and open question.
> >>>>> Since -ffast-math enables flush to zero, intermediate values
> >>>>> between approximation steps will be flushed to zero if they are
> >>>>> denormal.
> >>>>> E.g. this happens in the case of rsqrt (DBL_MAX) and rsqrtf (FLT_MAX).
> >>>>> The test cases pass, but it is unclear to me whether this is
> >>>>> expected behavior with -ffast-math.
> >>>>>
> >>>>> The patch applies to commit:
> >>>>> svn+ssh://gcc.gnu.org/svn/gcc/trunk@224470
> >>>>>
> >>>>> Please consider including this patch.
> >>>>> Thank you and best regards,
> >>>>> Benedikt Huber
> >>>>>
> >>>>> Benedikt Huber (1):
> >>>>>   2015-06-15  Benedikt Huber
> >>>>>
> >>>>>  gcc/ChangeLog                            |   9 +++
> >>>>>  gcc/config/aarch64/aarch64-builtins.c    |  60 ++++++++++++++++
> >>>>>  gcc/config/aarch64/aarch64-protos.h      |   2 +
> >>>>>  gcc/config/aarch64/aarch64-simd.md       |  27 ++++++++
> >>>>>  gcc/config/aarch64/aarch64.c             |  63 +++++++++++++++++
> >>>>>  gcc/config/aarch64/aarch64.md            |   3 +
> >>>>>  gcc/testsuite/gcc.target/aarch64/rsqrt.c | 113 +++++++++++++++++++++++++++++++
> >>>>>  7 files changed, 277 insertions(+)
> >>>>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/rsqrt.c
> >>>>>
> >>>>> --
> >>>>> 1.9.1