public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Evandro Menezes <e.menezes@samsung.com>
To: GCC Patches <gcc-patches@gcc.gnu.org>,
	Marcus Shawcroft <Marcus.Shawcroft@arm.com>,
	James Greenhalgh <james.greenhalgh@arm.com>,
	Andrew Pinski <pinskia@gmail.com>,
	Benedikt Huber <benedikt.huber@theobroma-systems.com>,
	philipp.tomsich@theobroma-systems.com
Subject: Re: [AArch64] Emit square root using the Newton series
Date: Tue, 16 Feb 2016 20:56:00 -0000	[thread overview]
Message-ID: <56C38D00.9000403@samsung.com> (raw)
In-Reply-To: <56674D34.80806@samsung.com>

On 12/08/15 15:35, Evandro Menezes wrote:
> Emit square root using the Newton series
>
>    2015-12-03  Evandro Menezes  <e.menezes@samsung.com>
>
>    gcc/
>             * config/aarch64/aarch64-protos.h (aarch64_emit_swsqrt):
>    Declare new
>             function.
>             * config/aarch64/aarch64-simd.md (sqrt<mode>2): New
>    expansion and
>             insn definitions.
>             * config/aarch64/aarch64-tuning-flags.def
>             (AARCH64_EXTRA_TUNE_FAST_SQRT): New tuning macro.
>             * config/aarch64/aarch64.c (aarch64_emit_swsqrt): Define
>    new function.
>             * config/aarch64/aarch64.md (sqrt<mode>2): New expansion
>    and insn
>             definitions.
>             * config/aarch64/aarch64.opt (mlow-precision-recip-sqrt):
>    Expand option
>             description.
>             * doc/invoke.texi (mlow-precision-recip-sqrt): Likewise.
>
> This patch extends the patch that added support for implementing 
> x^-1/2 using the Newton series by adding support for x^1/2 as well.
>
> Is it OK at this point of stage 3?
>
> Thank you,
>

James,

As I was saying, this patch results in some validation errors in CPU2000 
benchmarks using DF.  Although proving the algorithm to be pretty solid 
with a vast set of random values, I'm confused why some benchmarks fail 
to validate with this implementation of the Newton series for square 
root too, when they pass with the Newton series for reciprocal square root.

Since I had no problems with the same algorithm on x86-64, I wonder if 
the initial estimate on AArch64, which offers just 8 bits, whereas 
x86-64 offers 11 bits, has to do with it.  Then again, the algorithm 
iterated 1 less time on x86-64 than on AArch64.

Since it seems that the initial estimate is sufficient for CPU2000 to 
validate when using SF, I'm leaning towards restricting the Newton 
series for square root only for SF.

Your thoughts on the matter are appreciated,

-- 
Evandro Menezes

  parent reply	other threads:[~2016-02-16 20:56 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-08 21:35 Evandro Menezes
2015-12-09 14:05 ` Marcus Shawcroft
2015-12-09 16:31   ` Evandro Menezes
2015-12-09 16:52 ` Kyrill Tkachov
2015-12-09 16:59   ` Evandro Menezes
2015-12-09 17:03     ` Kyrill Tkachov
2015-12-09 17:16       ` Kyrill Tkachov
2015-12-09 18:50         ` Evandro Menezes
2015-12-10 10:30           ` Kyrill Tkachov
2016-02-23  0:50             ` Evandro Menezes
2016-02-26 15:00               ` James Greenhalgh
2016-02-26 23:42                 ` Evandro Menezes
2016-02-26 23:46                   ` Evandro Menezes
2016-02-16 20:56 ` Evandro Menezes [this message]
2016-03-04  0:22   ` Evandro Menezes
2016-03-08 22:08     ` Evandro Menezes
2016-03-08 22:18       ` Evandro Menezes
2016-03-08 22:20         ` Evandro Menezes
2016-03-16 19:45       ` Evandro Menezes
2016-03-17 14:55         ` James Greenhalgh
2016-03-17 16:25           ` Evandro Menezes
     [not found] <AM3PR08MB00886499882773F3C8B9F71D83B30@AM3PR08MB0088.eurprd08.prod.outlook.com>
     [not found] ` <011d01d17a26$31b3ade0$951b09a0$@samsung.com>
2016-03-10 16:52   ` Wilco Dijkstra
2016-03-10 16:58     ` Evandro Menezes
2016-03-10 19:10       ` Wilco Dijkstra
2016-03-10 22:15         ` Evandro Menezes
2016-03-11  1:06           ` Wilco Dijkstra
2016-03-14 16:39             ` Evandro Menezes
2016-03-14 19:13               ` Wilco Dijkstra
2016-03-16 21:44             ` Evandro Menezes
2016-03-17 22:50 Evandro Menezes
2016-03-24 20:30 ` [AArch64] " Evandro Menezes
2016-04-01 22:45   ` Evandro Menezes
2016-04-04 16:32     ` Evandro Menezes
     [not found]       ` <DB3PR08MB008902F0F0AFA3B1F1C91511839E0@DB3PR08MB0089.eurprd08.prod.outlook.com>
2016-04-05 22:30         ` Evandro Menezes
2016-04-12 18:15           ` Evandro Menezes
2016-04-21 18:44             ` Evandro Menezes
2016-04-27 14:24             ` James Greenhalgh
2016-04-27 15:45               ` Evandro Menezes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=56C38D00.9000403@samsung.com \
    --to=e.menezes@samsung.com \
    --cc=Marcus.Shawcroft@arm.com \
    --cc=benedikt.huber@theobroma-systems.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=james.greenhalgh@arm.com \
    --cc=philipp.tomsich@theobroma-systems.com \
    --cc=pinskia@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).