From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <libc-alpha-return-95117-listarch-libc-alpha=sources.redhat.com@sourceware.org>
Received: (qmail 52969 invoked by alias); 7 Aug 2018 20:11:17 -0000
Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <libc-alpha.sourceware.org>
List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/libc-alpha/>
List-Post: <mailto:libc-alpha@sourceware.org>
List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: libc-alpha-owner@sourceware.org
Received: (qmail 52625 invoked by uid 89); 7 Aug 2018 20:11:13 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_NONE,SPF_PASS,URIBL_RED autolearn=ham version=3.3.2 spammy=Inf
X-HELO: relay1.mentorg.com
Date: Tue, 07 Aug 2018 20:11:00 -0000
From: Joseph Myers <joseph@codesourcery.com>
To: Szabolcs Nagy <szabolcs.nagy@arm.com>
CC: GNU C Library <libc-alpha@sourceware.org>, <nd@arm.com>, Wilco Dijkstra
	<Wilco.Dijkstra@arm.com>
Subject: Re: [PATCH 02/10] Improve performance of sincosf
In-Reply-To: <e9ca9a7b-05df-715e-0e4b-b5c80be344b2@arm.com>
Message-ID: <alpine.DEB.2.20.1808071952350.20640@digraph.polyomino.org.uk>
References: <d4963b4e-0013-adaf-df2f-698cf421a487@arm.com> <725dd4a7-f002-65da-4e5c-370cb78c3e77@arm.com> <e9ca9a7b-05df-715e-0e4b-b5c80be344b2@arm.com>
User-Agent: Alpine 2.20 (DEB 67 2015-01-07)
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
X-SW-Source: 2018-08/txt/msg00148.txt.bz2

On Wed, 11 Jul 2018, Szabolcs Nagy wrote:

> +/* Fast sincosf implementation.  Worst-case ULP is 0.56072, maximum relative
> +   error is 0.5303p-23.  A single-step signed range reduction is used for

I don't think this funny mixture of decimal and hex float notation, not a 
valid constant in C, is a good idea.  (Saying 0.5303*0x1p-23 or similar 
would be fine, if that's what you mean.)

> +#if WANT_ERRNO
> +      /* Needed to set errno for +-Inf, the add is a hack to work
> +	 around a gcc register allocation issue: just passing y
> +	 affects code generation in the fast path.  */
> +      __math_invalidf (y + y);

Is there a GCC bug filed for this?  A reference to a specific bug number 
is much better than "a gcc register allocation issue" for anyone trying to 
tell in future if the issue is still relevant or not.

> +/* PI * 2^-64.  */
> +static const double pi64 = 0x1.921FB54442D18p-62;

That's actually PI * 2^-63, so the comment and variable name are 
misleading.

> +   Use round/lround if inlined, otherwise convert to int.  To avoid inaccuracies
> +   introduced by truncating negative values, compute the quadrant * 2^24.  */

I think this last statement is actually describing what the 
[!TOINT_INTRINSICS] code does, rather than part of the interface of the 
function.  So it should be inside that part of the #if inside the 
function; the comment above the function should be about what the 
function's interface is, not avoid such details of the implementation.

However, the comment relates to *another* interface - that is, what the 
semantics of the hpi_inv member of sincos_t are.  So I think the 
definition of the sincos_t struct should have comments explaining the 
semantics of its members, and the one on hpi_inv would explain the 
dependence on TOINT_INTRINSICS.

> +/* Reduce the range of XI to a multiple of PI/4 using fast integer arithmetic.
> +   XI is a reinterpreted float and must be >= 2.0f (the sign bit is ignored).
> +   Return the modulo between -PI/4 and PI/4 and store the quadrant in NP.

Aren't you actually reducing to an interval of length PI/2 (not PI/4) 
(it's not really clear to me what "to a multiple of PI/4" is meant to 
mean), with the result thus in the range -PI/4 and PI/4?

(Whether __inv_pio4 is really a table of 4/PI or 2/PI is a matter of 
semantics for how you say the bits of that number are stored in that 
table.  But ultimately you end up with a signed 62-bit fractional part of 
x*2/PI and so need to multiply by PI/2/2^62, which is why the value of 
your pi64 constant above is appropriate for this multiplication even 
though the name and comment are wrong.)

-- 
Joseph S. Myers
joseph@codesourcery.com