From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id E41F33858D39; Thu, 21 Sep 2023 14:29:10 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org E41F33858D39
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1695306550;
	bh=/8s/2SZcK6jbd9pT8w+6ViD5ScVdSD7X7zRM12OqaXY=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=MDVrMQ8atUcv15sNKAhmF3z/0dCaqd0Cllf5fvWpOvwgRqaUiyyFR0wtcwCMDfUj1
	 uf3Cy5bqEfJTDqzOJZwNYCv7CbJBWWbKFOwUJir1gG8gZv0yThtsPjykrEgVG3Y9HK
	 6AKWtYcfknyHCsObnWAp9nF5SZ6RKnZ9zD6gx60Y=
From: "malat at debian dot org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/110622] x87: Miscompilation at O2 level (O1 is working)
Date: Thu, 21 Sep 2023 14:29:10 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 13.1.0
X-Bugzilla-Keywords: 
X-Bugzilla-Severity: normal
X-Bugzilla-Who: malat at debian dot org
X-Bugzilla-Status: RESOLVED
X-Bugzilla-Resolution: DUPLICATE
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-110622-4-cmNLXLIkhs@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-110622-4@http.gcc.gnu.org/bugzilla/>
References: <bug-110622-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110622
--- Comment #16 from Mathieu Malaterre <malat at debian dot org> ---
(In reply to Xi Ruoyao from comment #15)
> (In reply to Mathieu Malaterre from comment #14)
> > (In reply to Andrew Pinski from comment #13)
> > > (In reply to Mathieu Malaterre from comment #12)
> > > > I am seeing a difference in result (log1p computation) in the range:
> > > >
> > > > 4318952042648305665 - 0x1.0000000000001p-64
> > > > 4368493837572636672 - 0x1.002p-53
> > > >
> > > > the other values seems to match expectation of log1p computation.
> > >
> > > But you used excess-precision=fast
> > >=20
> > > *** This bug has been marked as a duplicate of bug 323 ***
> >
> > AFAIK bug #323 does not mention my trick:
> >
> >   asm volatile("" : "+r"(y.raw[0]) : : "memory");
> >
> > That simple line totally changed the optimizer code generation.
>
> Because in x87 the excessive precision only exists in x87 stack-like
> registers.  The "memory" clobber forces a store and reload for all
> non-register variables, thus the value is truncated into a normal double
> value and the excessive precision is lost.
>
> There are infinite ways to work around an issue, but it does not mean PR 323
> must mention all of them.

Oh, I see. Basically my trick is a convoluted `-ffloat-store`.
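
For what it's worth, the store-and-reload effect described in comment #15 can be sketched without the `y.raw[0]` detail from my real code. The helper names `store_and_reload` and `asm_barrier` below are made up for illustration, and the `"+m"` constraint is my assumption of a simpler equivalent, not the exact line quoted above. Roughly:

```c
/* Hypothetical helper (not from the bug report): round a value to a plain
   double by forcing it through memory. A volatile object must really be
   written and read back, so extra x87 register precision cannot survive. */
static double store_and_reload(double x) {
  volatile double slot = x;
  return slot;
}

/* Hypothetical helper: same idea with an empty GCC/Clang asm statement.
   The "+m" constraint says the asm may read and write x in memory, so x
   has to sit in a 64-bit double slot before the statement and be reloaded
   after it. On other compilers this sketch just returns x unchanged. */
static double asm_barrier(double x) {
#if defined(__GNUC__)
  __asm__ volatile("" : "+m"(x) : : "memory");
#endif
  return x;
}
```

Neither helper changes a value that is already an exact double; they only matter when the compiler would otherwise keep a wider intermediate in a register.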

I finally sat down and convinced myself with a simple test program:

```
// gcc -m32 -fexcess-precision=fast -O2 t.c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

[[gnu::noipa]] void test(uint64_t v, double x, double y) {
  const double y2 = x + 1.0;
  if (y != y2)
    printf("error %" PRIu64 " %.17g %a\n", v, x, x);
  else
    printf("ok %" PRIu64 " %.17g %a\n", v, x, x);
}

int main(void) {

  uint64_t kSamplesPerRange = 4000, start = 0, stop = 9218868437227405311;
  uint64_t step = (stop / kSamplesPerRange);
  for (uint64_t value_bits = start; value_bits <= stop; value_bits += step) {
    double value;
    memcpy(&value, &value_bits, sizeof value);
    double x = value;
    double y = x + 1.0;

    test(value_bits, x, y);
  }
}
```
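
The mismatch the program reports can also be imitated on targets that do not use x87 at all, by letting `long double` stand in for the wider registers. This is only a sketch: `sum_differs_in_wider_format` and `WIDER_LONG_DOUBLE` are hypothetical names, and the comparison is only meaningful where `long double` has more mantissa bits than `double`, which is not the case on every platform.

```c
#include <float.h>
#include <stdbool.h>

/* 1 if long double carries more mantissa bits than double on this target. */
#define WIDER_LONG_DOUBLE (LDBL_MANT_DIG > DBL_MANT_DIG)

/* Hypothetical helper: returns true when x + 1.0 evaluated in long double
   differs from the same sum rounded to double. */
static bool sum_differs_in_wider_format(double x) {
  /* Sum kept in the wider format, like a value left in an x87 register. */
  long double wide = (long double)x + 1.0L;
  /* Sum forced into a 64-bit double, like a value stored to memory. */
  volatile double narrow = x + 1.0;
  return wide != (long double)narrow;
}
```

Where `WIDER_LONG_DOUBLE` is 1, calling it with `0x1p-60` returns true, because 1 + 2^-60 does not fit in a double's 53-bit mantissa and the stored sum rounds to 1.0. Calling it with `1.0` returns false, since both sums are exactly 2.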

Please accept my apologies for the noise.