Re: Re: [PATCH 0/3] ARM with only 32-bit floats do not have fast 64-bit FMA

* Re: Re: [PATCH 0/3] ARM with only 32-bit floats do not have fast 64-bit FMA
@ 2020-09-07 14:09 Eric Bresie
  2020-09-07 17:16 ` Keith Packard
  0 siblings, 1 reply; 4+ messages in thread
From: Eric Bresie @ 2020-09-07 14:09 UTC (permalink / raw)
  To: newlib

Not directly related (and I’m not really an expert on these things, nor am I able to change the code in any way), but I was looking at the code mentioned and saw a line like:

    if (x == 0.0 || y == 0.0)
        return (x * y + z);

If either x or y is zero, would it be better to just return z and avoid an extra multiplication here?
Eric Bresie
Ebresie@gmail.com

> On September 2, 2020 at 12:59:43 PM CDT, Sebastian Huber <sebastian.huber@embedded-brains.de> wrote:
> On 02/09/2020 19:12, Joseph Myers wrote:
>
> > On Wed, 2 Sep 2020, Sebastian Huber wrote:
> >
> > > https://sourceware.org/git/?p=glibc.git;a=blob;f=math/s_fma.c;h=4d73af4f65d511594b2395d032a135721c578484;hb=HEAD
> > No glibc configurations use that; they all use either a hardware
> > instruction, an implementation based on sticky rounding as described by
> > Boldo and Melquiond, or, in the absence of hardware exceptions and
> > rounding modes, a soft-fp implementation.
>
> Sorry for pointing to this dead code in glibc.
>
> Maybe we can use the FreeBSD implementation:
>
> https://github.com/freebsd/freebsd/blob/master/lib/msun/src/s_fma.c
>
> It is probably also used by Bionic:
>
> https://android.googlesource.com/platform/bionic/+/refs/heads/master/libm/upstream-freebsd/lib/msun/src/s_fma.c
>

* Re: Re: [PATCH 0/3] ARM with only 32-bit floats do not have fast 64-bit FMA
  2020-09-07 14:09 Re: [PATCH 0/3] ARM with only 32-bit floats do not have fast 64-bit FMA Eric Bresie
@ 2020-09-07 17:16 ` Keith Packard
  2020-09-07 20:16   ` Brian Inglis
  0 siblings, 1 reply; 4+ messages in thread
From: Keith Packard @ 2020-09-07 17:16 UTC (permalink / raw)
  To: Eric Bresie, newlib

Eric Bresie via Newlib <newlib@sourceware.org> writes:

> Not directly related (and I’m not really an expert on these things, nor am I able to change the code in any way), but I was looking at the code mentioned and saw a line like:
>
>     if (x == 0.0 || y == 0.0)
>         return (x * y + z);
>
> If either x or y is zero, would it be better to just return z and avoid
> an extra multiplication here?

You want to compute the correct result and get the right exceptions in
all of the delightful IEEE 754 corner cases (e.g. 0 × ∞). It's easier to
just execute the two operations than to try to synthesize the right
result (for 0 × ∞ + qNaN, IEEE 754 leaves it implementation-defined
whether an FMA raises the invalid exception). The key here is that if x
or y is zero, then x * y is exact, so you won't lose any intermediate
precision by performing the operation this way.
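
A minimal sketch of the 0 × ∞ corner case, assuming IEC 60559 (IEEE 754)
double arithmetic. The helper names zero_branch_full and
zero_branch_shortcut are hypothetical and exist only for this
illustration; they are not taken from newlib, glibc, or FreeBSD. It only
compares result values; the invalid exception raised by 0 × ∞ could be
observed separately with fetestexcept(FE_INVALID) from <fenv.h>.

```c
#include <math.h>   /* INFINITY, isnan (macros; no libm call is needed) */

/* The branch as quoted in the thread: multiply and add even though
 * one operand is zero. The volatile copies are only there to stop the
 * compiler from folding the arithmetic at build time. */
static double zero_branch_full(double x, double y, double z)
{
    volatile double vx = x, vy = y, vz = z;
    return vx * vy + vz;
}

/* Hypothetical shortcut from the question: skip the multiplication
 * and hand back z unchanged. */
static double zero_branch_shortcut(double x, double y, double z)
{
    (void)x;
    (void)y;
    return z;
}
```

With x = 0.0, y = INFINITY and z = 1.0, zero_branch_full() yields a NaN
because 0 × ∞ is an invalid operation, while zero_branch_shortcut() would
return 1.0. For an ordinary finite y (say 3.5) both return z, since the
product is exactly zero.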

-- 
-keith

* Re: [PATCH 0/3] ARM with only 32-bit floats do not have fast 64-bit FMA
  2020-09-07 17:16 ` Keith Packard
@ 2020-09-07 20:16   ` Brian Inglis
  2020-09-07 22:23     ` Keith Packard
  0 siblings, 1 reply; 4+ messages in thread
From: Brian Inglis @ 2020-09-07 20:16 UTC (permalink / raw)
  To: newlib

On 2020-09-07 11:16, Keith Packard via Newlib wrote:
> Eric Bresie via Newlib <newlib@sourceware.org> writes:
> 
>> Not directly related (and I’m not really an expert on these things, nor am I able to change the code in any way), but I was looking at the code mentioned and saw a line like:
>>
>>     if (x == 0.0 || y == 0.0)
>>         return (x * y + z);
>>
>> If either x or y is zero, would it be better to just return z and avoid
>> an extra multiplication here?
> 
> You want to compute the correct result and get the right exceptions in
> all of the delightful IEEE 754 corner cases (e.g. 0 × ∞). It's easier to
> just execute the two operations than to try to synthesize the right
> result (for 0 × ∞ + qNaN, IEEE 754 leaves it implementation-defined
> whether an FMA raises the invalid exception). The key here is that if x
> or y is zero, then x * y is exact, so you won't lose any intermediate
> precision by performing the operation this way.

Can't the "super-smart" compiler use that information to work around your
careful approach by conditionally skipping the FMA and returning just z,
or even unconditionally returning z, as C makes no guarantees?
And couldn't the "super-smart" instruction scheduler do something similar
at the hardware level?

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.
[Data in IEC units and prefixes, physical quantities in SI.]

* Re: [PATCH 0/3] ARM with only 32-bit floats do not have fast 64-bit FMA
  2020-09-07 20:16   ` Brian Inglis
@ 2020-09-07 22:23     ` Keith Packard
  0 siblings, 0 replies; 4+ messages in thread
From: Keith Packard @ 2020-09-07 22:23 UTC (permalink / raw)
  To: Brian Inglis, newlib

Brian Inglis <Brian.Inglis@SystematicSw.ab.ca> writes:

> Can't the "super-smart" compiler use that information to work around your
> careful approach by conditionally skipping the FMA and returning just z,
> or even unconditionally returning z, as C makes no guarantees?
> And couldn't the "super-smart" instruction scheduler do something similar
> at the hardware level?

I don't think that would conform to the C specification, which says
that arithmetic follows IEC 60559, which defines the various exceptions
and results. Now, if you enable -ffast-math, all bets are off...
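
One concrete case where "just return z" changes the answer is the sign
of zero. The sketch below assumes IEC 60559 doubles and the default
round-to-nearest mode; mul_then_add and sign_differs_from_z are
hypothetical names used only for this illustration. With options such as
-ffast-math a compiler is allowed to ignore the distinction.

```c
#include <math.h>   /* signbit (macro; no libm call is needed) */

/* x * y + z evaluated at run time; the volatile copies keep the
 * compiler from folding the expression at build time. */
static double mul_then_add(double x, double y, double z)
{
    volatile double vx = x, vy = y, vz = z;
    return vx * vy + vz;
}

/* Returns 1 when x * y + z and plain z differ in their sign bit,
 * and 0 when the sign bits agree. */
static int sign_differs_from_z(double x, double y, double z)
{
    int sum_negative = signbit(mul_then_add(x, y, z)) != 0;
    int z_negative = signbit(z) != 0;
    return sum_negative != z_negative;
}
```

For x = 0.0, y = 1.0 and z = -0.0 the product is +0, and (+0) + (-0)
rounds to +0 in round-to-nearest, so the full expression gives +0.0
while the shortcut would give -0.0. The two compare equal with ==, but
the difference shows up through signbit() or in a later 1.0 / result.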

-- 
-keith
