[Bug c/61399] New: LDBL_MAX is incorrect with IBM long double format / overflow issues near large values

* [Bug c/61399] New: LDBL_MAX is incorrect with IBM long double format / overflow issues near large values
@ 2014-06-02 23:37 vincent-gcc at vinc17 dot net
  2014-06-03  7:04 ` [Bug c/61399] " jakub at gcc dot gnu.org
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: vincent-gcc at vinc17 dot net @ 2014-06-02 23:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61399

            Bug ID: 61399
           Summary: LDBL_MAX is incorrect with IBM long double format /
                    overflow issues near large values
           Product: gcc
           Version: 4.7.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vincent-gcc at vinc17 dot net

On PowerPC, which uses the IBM long double format for the type long double, one
gets:

LDBL_MANT_DIG = 106 (this is the precision of the FP model),
LDBL_MAX_EXP = 1024 (this is the maximum exponent of the FP model).

Even though the IBM long double format, a.k.a. double-double format, is not a
floating-point format, it is some superset of a floating-point format as
specified by the C standard (see C11, §5.2.4.2.2 from p1 to p3), where p = 106
and emax = 1024.

By definition, for radix b = 2, LDBL_MAX is the value (1 - 2^(-p)) * 2^emax
(see §5.2.4.2.2p12), which is the largest value representable in the
floating-point model.
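
As a sanity check of this formula on formats where the model does hold, the
following minimal sketch evaluates (1 - 2^(-p)) * 2^emax with the double and
float parameters. The helper name model_max is hypothetical (it is not a GCC
or C library function), and the sketch assumes radix 2 and p <= DBL_MANT_DIG.

```c
#include <float.h>

/* Hypothetical helper (not a GCC or libc function): largest value of the
   C floating-point model for radix 2, i.e. (1 - 2^(-p)) * 2^emax.
   Assumes 1 <= p <= DBL_MANT_DIG and emax <= DBL_MAX_EXP, so that every
   step below is exact in double arithmetic. */
double model_max (int p, int emax)
{
  double eps = 1.0;
  double m;
  int i;

  for (i = 0; i < p; i++)
    eps *= 0.5;                 /* eps = 2^(-p) */
  m = 1.0 - eps;                /* p one bits after the binary point */
  for (i = 0; i < emax; i++)
    m *= 2.0;                   /* scale by 2^emax, one exact step at a time */
  return m;
}
```

Under these assumptions, model_max (DBL_MANT_DIG, DBL_MAX_EXP) compares equal
to DBL_MAX, and model_max (FLT_MANT_DIG, FLT_MAX_EXP) compares equal to
FLT_MAX. The report here is that the analogous identity does not hold for
long double with p = 106 and emax = 1024.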

However, with GCC 4.7.2 20121109 (Red Hat 4.7.2-8), I get:

LDBL_MAX = 0x1.fffffffffffff7ffffffffffff8p+1023

instead of 0x1.ffffffffffffffffffffffffff8p+1023.

The following program shows that this is not a display bug:

#include <stdio.h>
#include <float.h>

int main (void)
{
  long double dmax = DBL_MAX;
  printf ("%La\n", LDBL_MAX);
  printf ("%La\n", LDBL_MAX - dmax);
  printf ("%La\n", dmax + dmax * DBL_EPSILON / 4);
  printf ("%La\n", dmax + dmax * DBL_EPSILON / 2);
  return 0;
}

It outputs:

0x1.fffffffffffff7ffffffffffff8p+1023
0x1.ffffffffffffep+969
0x1.fffffffffffff7ffffffffffffcp+1023
inf

also showing that the arithmetic is buggy. I suppose that the high double is
the value rounded to double precision, but this rule is incorrect near the
largest values, due to the overflow.
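
The overflow of the high part can be reproduced with plain double arithmetic,
without any double-double implementation. In the minimal sketch below,
round_to_double and high_part_overflows are hypothetical helpers: the first
returns hi + lo rounded to double, which is what the high double would be
under the rule "high double = value rounded to double precision". The sketch
assumes IEEE 754 binary64 doubles, round-to-nearest, and no excess precision
(FLT_EVAL_METHOD == 0).

```c
#include <float.h>

/* Hypothetical helper: the sum hi + lo rounded to double precision.
   The volatile store keeps the compiler from carrying extra precision. */
double round_to_double (double hi, double lo)
{
  volatile double sum = hi + lo;
  return sum;
}

/* Returns 1 if rounding hi + lo to double overflows to +infinity,
   0 otherwise (assuming hi and lo are finite). */
int high_part_overflows (double hi, double lo)
{
  return round_to_double (hi, lo) > DBL_MAX;
}
```

Under these assumptions, high_part_overflows (DBL_MAX, DBL_MAX * DBL_EPSILON / 4)
is 0, because the low part is just below half an ulp of DBL_MAX and the sum
rounds back to DBL_MAX, while
high_part_overflows (DBL_MAX, DBL_MAX * DBL_EPSILON / 2) is 1, because the sum
rounds up to 2^1024. This matches the last two lines of the output above.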

One may choose to keep the behavior, i.e. consider that the high double is the
value rounded to double precision, but this means that the floating-point model
would need to be changed; otherwise some values are not representable, as shown
above.

* [Bug c/61399] LDBL_MAX is incorrect with IBM long double format / overflow issues near large values
From: vincent-gcc at vinc17 dot net @ 2014-06-03  0:07 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61399

--- Comment #1 from Vincent Lefèvre <vincent-gcc at vinc17 dot net> ---
(In reply to Vincent Lefèvre from comment #0)
> One may choose to keep the behavior, i.e. consider that the high double is
> the value rounded to double precision, but this means that the
> floating-point model would need to be changed; otherwise some values are not
> representable, as shown above.

By "the floating-point model would need to be changed", I mean, for instance,
choosing LDBL_MAX_EXP = 1023. I think that this would be correct. A possible
drawback is that one would have LDBL_MAX_EXP < DBL_MAX_EXP, but I don't think
that this is a problem (note that one already has LDBL_MIN_EXP > DBL_MIN_EXP).
This would just mean that there are non-normalized values outside the normal
range.

* [Bug c++/51253] [C++11][DR 1030] Evaluation order (sequenced-before relation) among initializer-clauses in braced-init-list
From: jason at gcc dot gnu.org @ 2014-06-03  0:18 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51253

Jason Merrill <jason at gcc dot gnu.org> changed:

           What    |Removed                       |Added
------------------------------------------------------------------------------
             Status|NEW                           |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |jason at gcc dot gnu.org
   Target Milestone|---                           |4.10.0

--- Comment #12 from Jason Merrill <jason at gcc dot gnu.org> ---
Fixed on trunk so far.


* [Bug c/61399] LDBL_MAX is incorrect with IBM long double format / overflow issues near large values
  2014-06-02 23:37 [Bug c/61399] New: LDBL_MAX is incorrect with IBM long double format / overflow issues near large values vincent-gcc at vinc17 dot net
@ 2014-06-03  7:04 ` jakub at gcc dot gnu.org
  2014-06-03  7:45 ` vincent-gcc at vinc17 dot net
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: jakub at gcc dot gnu.org @ 2014-06-03  7:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61399

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I think lowering LDBL_MAX_EXP would be far worse than this.
The double-double format is unusable for any real numerics in so many ways that
this is just one small part of the problem; the variable precision, from 106 to
thousands of bits depending on the exact value, is far worse.


* [Bug c/61399] LDBL_MAX is incorrect with IBM long double format / overflow issues near large values
  2014-06-02 23:37 [Bug c/61399] New: LDBL_MAX is incorrect with IBM long double format / overflow issues near large values vincent-gcc at vinc17 dot net
  2014-06-03  7:04 ` [Bug c/61399] " jakub at gcc dot gnu.org
@ 2014-06-03  7:45 ` vincent-gcc at vinc17 dot net
  2021-08-18 15:04 ` vincent-gcc at vinc17 dot net
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: vincent-gcc at vinc17 dot net @ 2014-06-03  7:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61399

--- Comment #3 from Vincent Lefèvre <vincent-gcc at vinc17 dot net> ---
The variable precision is unavoidable with this format (this is even a feature,
despite the drawbacks). But the fact that the variable precision is problematic
by itself isn't a reason not to try to solve other issues.

* [Bug c++/61401] New: Wrong treatment of empty template-argument packs during deduction
From: alexey.kudinkin at gmail dot com @ 2014-06-03  7:54 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61401

            Bug ID: 61401
           Summary: Wrong treatment of empty template-argument packs
                    during deduction
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: alexey.kudinkin at gmail dot com

template<typename, typename>
struct X;

template<typename T>
struct X<T, T(T)>
{
  typedef int type;
};

template<typename T, typename ...Args>
struct Y : X<Args..., T, T(T)>
{
};

Y<int>::type x;

When compiling this snippet, g++ (4.9.0) complains about a wrong number of
template arguments, although it should not.


* [Bug c/61399] LDBL_MAX is incorrect with IBM long double format / overflow issues near large values
  2014-06-02 23:37 [Bug c/61399] New: LDBL_MAX is incorrect with IBM long double format / overflow issues near large values vincent-gcc at vinc17 dot net
  2014-06-03  7:04 ` [Bug c/61399] " jakub at gcc dot gnu.org
  2014-06-03  7:45 ` vincent-gcc at vinc17 dot net
@ 2021-08-18 15:04 ` vincent-gcc at vinc17 dot net
  2021-08-18 15:13 ` jakub at gcc dot gnu.org
  2021-08-18 15:40 ` vincent-gcc at vinc17 dot net
  4 siblings, 0 replies; 6+ messages in thread
From: vincent-gcc at vinc17 dot net @ 2021-08-18 15:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61399

--- Comment #11 from Vincent Lefèvre <vincent-gcc at vinc17 dot net> ---
In addition to the maximum exponent issue, for LDBL_MAX following the defect
report, instead of

  0x1.fffffffffffff7ffffffffffff8p+1023

I would expect

  0x1.fffffffffffff7ffffffffffffcp+1023 = DBL_MAX + DBL_MAX * DBL_EPSILON / 4

as it is larger (it has one more trailing 1) and representable.

* [Bug c/61399] LDBL_MAX is incorrect with IBM long double format / overflow issues near large values
  2014-06-02 23:37 [Bug c/61399] New: LDBL_MAX is incorrect with IBM long double format / overflow issues near large values vincent-gcc at vinc17 dot net
                   ` (2 preceding siblings ...)
  2021-08-18 15:04 ` vincent-gcc at vinc17 dot net
@ 2021-08-18 15:13 ` jakub at gcc dot gnu.org
  2021-08-18 15:40 ` vincent-gcc at vinc17 dot net
  4 siblings, 0 replies; 6+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-08-18 15:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61399

--- Comment #12 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Vincent Lefèvre from comment #11)
> In addition to the maximum exponent issue, for LDBL_MAX following the defect
> report, instead of
> 
>   0x1.fffffffffffff7ffffffffffff8p+1023
> 
> I would expect
> 
>   0x1.fffffffffffff7ffffffffffffcp+1023 = DBL_MAX + DBL_MAX * DBL_EPSILON / 4
> 
> as it is larger (it has one more trailing 1) and representable.

That isn't representable in the GCC internal representation, which pretends the
type has fixed 106 bit precision (like double has 53 bit precision),
0x1.fffffffffffff7ffffffffffffcp+1023 needs 107 bit precision (and generally
the type has variable precision).
The only way around that would be to actually represent it in GCC internal
representation as sum of two doubles and rewrite all operations on this mode to
treat it specially.  That is a lot of work and given that powerpc64le is
switching from this floating point format to IEEE quad long double format, I'm
certain nobody is willing to spend that much time (months) on it.
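
The 106-bit versus 107-bit distinction can be checked by counting how many
significand bits a (high, low) pair of doubles spans. The sketch below is only
an illustration: the function names are hypothetical and it is unrelated to
GCC's actual internal representation. It assumes IEEE 754 binary64 doubles, a
finite nonzero hi, and hi and lo of the same sign with non-overlapping bits.

```c
#include <float.h>   /* DBL_MAX and DBL_EPSILON, for callers of this sketch */

/* Exponent of the leading bit of a finite nonzero double x:
   the e for which 2^e <= |x| < 2^(e+1). */
static int leading_bit_exp (double x)
{
  int e = 0;

  if (x < 0)
    x = -x;
  while (x >= 2.0) { x *= 0.5; e++; }   /* bring x down into [1,2) */
  while (x < 1.0)  { x *= 2.0; e--; }   /* or up into [1,2) */
  return e;
}

/* Exponent of the lowest set bit of a finite nonzero double x:
   the e for which |x| / 2^e is an odd integer. */
static int lowest_bit_exp (double x)
{
  int e = leading_bit_exp (x);
  int i;

  if (x < 0)
    x = -x;
  /* Rescale x into [1,2); every step is an exact power-of-two scaling. */
  for (i = e; i > 0; i--) x *= 0.5;
  for (i = e; i < 0; i++) x *= 2.0;
  /* Shift left until no fractional bits remain (x stays below 2^53). */
  while (x != (double) (long long) x) { x *= 2.0; e--; }
  return e;
}

/* Number of significand bits needed to hold hi + lo exactly. */
int dd_precision_bits (double hi, double lo)
{
  double tail = (lo != 0.0) ? lo : hi;

  return leading_bit_exp (hi) - lowest_bit_exp (tail) + 1;
}
```

Under these assumptions, the pair (DBL_MAX, 0x1.ffffffffffffep+969), i.e. the
LDBL_MAX reported at the start of this thread, spans 106 bits, whereas
(DBL_MAX, DBL_MAX * DBL_EPSILON / 4) spans 107 bits, and a pair such as
(1.0, 0x1p-1000) spans 1001 bits, which illustrates the variable precision
mentioned earlier in the thread.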

* [Bug c/61399] LDBL_MAX is incorrect with IBM long double format / overflow issues near large values
  2014-06-02 23:37 [Bug c/61399] New: LDBL_MAX is incorrect with IBM long double format / overflow issues near large values vincent-gcc at vinc17 dot net
                   ` (3 preceding siblings ...)
  2021-08-18 15:13 ` jakub at gcc dot gnu.org
@ 2021-08-18 15:40 ` vincent-gcc at vinc17 dot net
  4 siblings, 0 replies; 6+ messages in thread
From: vincent-gcc at vinc17 dot net @ 2021-08-18 15:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61399

--- Comment #13 from Vincent Lefèvre <vincent-gcc at vinc17 dot net> ---
(In reply to Jakub Jelinek from comment #12)
> That isn't representable in the GCC internal representation, which pretends
> the type has fixed 106 bit precision
[...]

So, if I understand correctly, this is due to a limitation of GCC internals.

> The only way around that would be to actually represent it in GCC internal
> representation as sum of two doubles and rewrite all operations on this mode
> to treat it specially.  That is a lot of work and given that powerpc64le is
> switching from this floating point format to IEEE quad long double format,
> I'm certain nobody is willing to spend that much time (months) on it.

Well, if this is difficult to implement, perhaps leave LDBL_MAX as is, and
consider, for instance, that an operation whose result would be
0x1.fffffffffffff7ffffffffffffcp+1023 overflows (overflow is AFAIK undefined
behavior for non-IEEE-754 formats, so this would be conforming). But this
should be documented.
