From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=BR44=JB=troutmask.apl.washington.edu=sgk@sourceware.org>
Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21])
	by sourceware.org (Postfix) with ESMTPS id 9C9773858D3C;
	Tue, 23 Jan 2024 17:21:46 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 9C9773858D3C
Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=troutmask.apl.washington.edu
Authentication-Results: sourceware.org; spf=none smtp.mailfrom=troutmask.apl.washington.edu
ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 9C9773858D3C
Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=128.95.76.21
ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1706030508; cv=none;
	b=AhTTiDRmQn75ythuRQS1qLvlgWEYrNddFyNLmcRJjImzY5okrWTFdnRWapDh+ZF25n09CPBWBb6aptJ932XfCplbTAwsyZ3sz5ReI261URsEV512EyaVcy0rCuNBFgpHloMGEVIBQAdXsfdtsoqBwNvZdK0soKpjlldj2c/XbyY=
ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key;
	t=1706030508; c=relaxed/simple;
	bh=SjyBOhkqNw4kblJf6NMbVvZwPGDTHlnZCAqOkREcj0c=;
	h=DKIM-Signature:Date:From:To:Subject:Message-ID:MIME-Version; b=M5kwqI/IcZGe5Z81VL/Qeuxhc2gtmZvZuJUcdPfx+PGNG3/Dtu03eJlTMlw5B+VizLWZ7fR/SbX0loqVLErNF6yXuXeN05jDg9taCfky5jPcnOHEbfNDVTvCrHZaEnJaIgSb8hhGbGJcEuBCCNoTgOt+W8xp+JTk3cFmaZF0Apc=
ARC-Authentication-Results: i=1; server2.sourceware.org
Received: from troutmask.apl.washington.edu (localhost [127.0.0.1])
	by troutmask.apl.washington.edu (8.17.1/8.17.1) with ESMTP id 40NHLjf1052617;
	Tue, 23 Jan 2024 09:21:45 -0800 (PST)
	(envelope-from sgk@troutmask.apl.washington.edu)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple;
	d=troutmask.apl.washington.edu; s=myselector-troutmask;
	t=1706030505; bh=SjyBOhkqNw4kblJf6NMbVvZwPGDTHlnZCAqOkREcj0c=;
	h=Date:From:To:Cc:Subject:Reply-To:References:In-Reply-To;
	b=Rvobq+BhMvrdUlE9qDutNy8m19uf31pgoH794yuzSubNQuKZOnBe3Oa3vTel+Cv6/
	 y5L/wSxflMCcHeBrf2SZ/m6snngu2Jp/my8pVkwWnnnCQjA+JDvnNNtpcZXqZA4rS3
	 fgRbDhVJC65ACjxWrqLOLWenz9E4ehppnhJqZYY2HOlQxST2G3HjLQDclQZw5a2R1m
	 lGvEU/mv0JM8yN+EWSIoTvGCmnWODvC4gJ+ADUKYMSMdiC1y/5pP9boMYyaLQu5Iae
	 ETkyPt0ciOFMguUin2Cbl1gMUn31sSltgK7z+Rl3NFVgerAn+4l51VqDONYnyRRYJy
	 d/iAlqT583kxg==
Received: (from sgk@localhost)
	by troutmask.apl.washington.edu (8.17.1/8.17.1/Submit) id 40NHLjrv052616;
	Tue, 23 Jan 2024 09:21:45 -0800 (PST)
	(envelope-from sgk)
Date: Tue, 23 Jan 2024 09:21:45 -0800
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: FX Coudert <fxcoudert@gmail.com>
Cc: fortran <fortran@gcc.gnu.org>, gcc-patches@gcc.gnu.org
Subject: Re: [Fortran] half-cycle trig functions and atan[d] fixes
Message-ID: <Za_1qarpdCP3W73y@troutmask.apl.washington.edu>
Reply-To: sgk@troutmask.apl.washington.edu
References: <Zawoo1Ofk56wNa7m@troutmask.apl.washington.edu>
 <AF08DC72-6BBE-4B1E-B182-A2D5EB572EC2@gmail.com>
 <Za_P26vvQtzYBGKW@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <Za_P26vvQtzYBGKW@troutmask.apl.washington.edu>
X-Spam-Status: No, score=-1.0 required=5.0 tests=BAYES_00,DKIM_INVALID,DKIM_SIGNED,KAM_DMARC_STATUS,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <fortran.gcc.gnu.org>

On Tue, Jan 23, 2024 at 06:40:27AM -0800, Steve Kargl wrote:
> On Tue, Jan 23, 2024 at 10:08:43AM +0100, FX Coudert wrote:
> > Hi Steve,
> > 
> > Thanks for the patch. I’ll take time to do a proper review, but
> > after a first read I had the following questions:
> > 
> > - "an OS's libm may/will contain cospi(), etc.”: do those functions
> > conform to any standard? Are there plans to implement them outside
> >FreeBSD at this point?
> 
> Yes.  cospi, sinpi, and tanpi are in IEEE754-2008.  These are
> also in ISO/IEC TS 18661-4 along with acospi, asinpi, and atanpi,
> which I believe where merged into C23 (see n3096.pdf).  I've

I haven't checked.

> checked if atan2pi is in C23, but it is in F2023.
> 
> AFAIK, FreeBSD's libm is the only OS that contains cospi, sinpi,
> and tanpi.  The arc functions haven't been written/committed,
> because something like cospi(x) = cos(x) / pi with pi split
> into hi-lo parts is good enough.

Whoops. acospi(x) = acos(x) / pi.  The fallback implements this
as acospi(x) = acos(x) * invpihi + acos(x) * invpilo with invpihi
the leading digits(x)/2 bits of 1/pi and invpilo the trailing
digits(x) of 1/pi.

For most libm's, acos(x) with |x| <= 1 have good numerical accuracy
and handle subnormal, |x| > 1, +-inf, and NaN by setting appropriate
exceptions.  With FreeBSD and exhaustive testing in the interval, I see

% tlibm acos -f -x 0x1p-9 -X 1 -PED
Interval tested for acosf: [0.00195312,1]
       ulp <= 0.5:  96.432%  72803871 |  96.432%  72803871
0.5 <  ulp <  0.6:  3.110%   2348017 |  99.542%  75151888
0.6 <  ulp <  0.7:  0.419%    316134 |  99.961%  75468022
0.7 <  ulp <  0.8:  0.039%     29450 | 100.000%  75497472
Max ulp: 0.796035 at 4.99814630e-01

Exhaustive testing of the fallback routine for acospi(x) gives

   program foo

   implicit none

   integer n
   real(4) f4, x, xm
   real(8) f8, u, v

   u = -1
   x = scale(1., -9)
   do
      f4 = acospi(x)
      f8 = acospi(real(x,8))
      n = exponent(f8)
      v = abs(f8 - f4)
      v = scale(v, digits(f4) - n)
      if (v > u) then
         u = v
         xm = x
      end if
      x = nearest(x, 2.)
      if (x > 1) exit
   end do

   print *, u, acospi(xm), acospi(real(xm,8))

   end program foo

% gfcx -o z -O a.f90 && ./z
   1.9551682807505131       0.337791681      0.33779173955822828 

so a max ulp of 1.955.  Hopefully, OS libm's will catch up and
provide an implementation that gets the extra 1+ bit of precision.

Admittedly, the fallback routines for cospi(), sinpi(), and
tanpi() could be better, but much slower.  For now, I do for
example,

float
cospif (float x)
{
   return cosf (x * pihi_f + x * pilo_f);
}

A better routine would be,

/* cospi(x) = cos(pi*x) = cos(pi*n+pi*r) = +-cos(pi*r) with 0 <= r < 1i
   and n integer. */
float
cospif (float x)
{
   int s;
   float ax, n, r;
   ax = fabsf(ax);
   if (ax > 0x1p23) {
      return 1;
   } else {
      /* FreeBSD uses bit twiddling.  See
         https://cgit.freebsd.org/src/tree/lib/msun/src/s_cospif.c  */
      n = floor(ax);  /* integer part */
      r = ax - n;     /* remainder */
      s = floor(n/2) * 2 - n == 0 : 1 : -1;  /* even n? */ 
      return s * cosf (r * pi);
   }
}

> > - On systems where libquadmath is used for _Float128, does
> > libquadmath contain the implementation of the q() variants for
> > those functions?
> 
> AFAIK, libquadmath does not have the half-cycle functions.  I
> wrote the function trig_fallback2.F90 to deal with REAL(16) (and
> maybe REAL17).
> 
> > - If I get this right, to take one example, the Fortran front-end
> > will emit a call to gfortran_acospi_r4(), libgfortran provides this
> > as a wrapper calling acospif(), which is called either from libm
> > or from libgfortran.
> 
> Yes, that is correct.  I tried to install __builtin_acospif in
> trans-intrinsic to generate a direct call to acospif, but that
> led to many hours/days of frustration.  I gave up.
> 
> > This is different from other math library functions, like ACOS()
> > where the acosf() call is generated directly from the front-end
> > (and then the implementation comes either from libm or from
> > libgfortran). Why not follow our usual way of doing things?
> 
> I gave up on that approach when I got some real obscure error
> about some math function which I did not touch. 
> 

I tried adding

DEFINE_MATH_BUILTIN_C (COSPI,  "cospi",   0)
or
DEFINE_MATH_BUILTIN (COSPI,  "cospi",   0)
or
OTHER_BUILTIN (COSPI, "cospi",  1, false)  (and variations).

to mathbuiltins.def but this led to some obscure error message
about some other unrelated intrinsic subprogram.

-- 
Steve