[Bug libstdc++/110045] New: std::normal_distribution<float> (and likely others) give wrong min() and max() values

public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed

* [Bug libstdc++/110045] New: std::normal_distribution<float> (and likely others) give wrong min() and max() values
@ 2023-05-31  0:24 gccbugs at elkpod dot com
  2023-05-31  0:53 ` [Bug libstdc++/110045] " pinskia at gcc dot gnu.org
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: gccbugs at elkpod dot com @ 2023-05-31  0:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045

            Bug ID: 110045
           Summary: std::normal_distribution<float> (and likely others)
                    give wrong min() and max() values
           Product: gcc
           Version: 12.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gccbugs at elkpod dot com
  Target Milestone: ---

From my (non-expert) reading of the C++ spec (I'm using
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/n4950.pdf for the
current 2023-05-10 draft), the min() and max() methods for
std::normal_distribution<float> are returning incorrect values.

"28.5.3.6 Random number distribution requirements" says that "A class D meets
the requirements of a random number distribution if the expressions shown in
Table 97 are valid and have the indicated semantics", and that in Table 97 "x"
is a "(possibly const) value[s] of D", and that "x.min()" "Returns glb", and
that "x.max()" "Returns lub", and that "glb and lub are values of T
respectively corresponding to the greatest lower bound and the least upper
bound on the values potentially returned by d's operator(), as determined by
the current values of d's parameters".

By my reading, this means that if I have "std::normal_distribution<float> norm
{0, 1};", then "norm.max()" should return the least upper bound on the values
potentially returned by norm's operator(). It does not seem to do that. 

For example, the highest value I was able to get norm() to generate was
0x1.ff13ccp+2 (== 7.985583). norm.max() reports 0x1.fffffep+127 (==
340282346638528859811704183484516925440.000000). While the values returned are
technically valid bounds, they do not seem to be the least upper or greatest
lower. I could find no Generator outputs which produced a value of
0x1.fffffep+127 from std::normal_distribution<float>.

It is possible that my interpretation of the spec is wrong, and that somehow
the min/max values should encompass all possible instantiations of
std::normal_distribution, or some other loophole may exist. I could not find a
better forum to find the answer to that; sorry.

I have looked at the implementation of std::normal_distribution, and written
the following code to generate what I believe to be the actual extreme values
that are "potentially returned by norm's operator()".

Reproduction code:
#include <cstdio>
#include <cstdint>
#include <random>

int32_t offsets[8] = { -1, +0, +0, -1, +0, +1, +1, +0 };
class SimpleGen {
    using result_type = uint32_t;
 public:
    result_type val, ctr = 0;
    static constexpr result_type min() { return 0; }
    static constexpr result_type max() { return 0xffffff; }
    result_type operator()()           { val = 0x800000 + offsets[(ctr++)%8];
                                         printf("\tG 0x%06x\n", val);
                                         return val; }
};

int main(void) {
    SimpleGen gen;
    std::normal_distribution<float> norm {0, 1};

    for (int i = 0; i < 8; i++) {
        float r = norm(gen);
        printf("%d %f %a\n", i, r, r);
    }
    printf("\n%f %a\n%f %a\n", norm.min(), norm.min(), norm.max(), norm.max());
}

Build output:
$ g++-12.2 -Wall -Wextra -o normdist2 normdist2.c -fno-strict-aliasing -fwrapv
-fno-aggressive-loop-optimizations -fsanitize=undefined
$ echo $?
0

Output:
        G 0x7fffff
        G 0x800000
0 0.000000 0x0p+0
1 -7.985583 -0x1.ff13ccp+2
        G 0x800000
        G 0x7fffff
2 -7.985583 -0x1.ff13ccp+2
3 0.000000 0x0p+0
        G 0x800000
        G 0x800001
4 7.985583 0x1.ff13ccp+2
5 0.000000 0x0p+0
        G 0x800001
        G 0x800000
6 0.000000 0x0p+0
7 7.985583 0x1.ff13ccp+2

-340282346638528859811704183484516925440.000000 -0x1.fffffep+127
340282346638528859811704183484516925440.000000 0x1.fffffep+127

Expected output:
        G 0x7fffff
        G 0x800000
0 0.000000 0x0p+0
1 -7.985583 -0x1.ff13ccp+2
        G 0x800000
        G 0x7fffff
2 -7.985583 -0x1.ff13ccp+2
3 0.000000 0x0p+0
        G 0x800000
        G 0x800001
4 7.985583 0x1.ff13ccp+2
5 0.000000 0x0p+0
        G 0x800001
        G 0x800000
6 0.000000 0x0p+0
7 7.985583 0x1.ff13ccp+2

-7.985583 -0x1.ff13ccp+2
7.985583 0x1.ff13ccp+2

$ g++-12.2 -v
Using built-in specs.
COLLECT_GCC=g++-12.2
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-slackware-linux/12.2.0/lto-wrapper
Target: x86_64-slackware-linux
Configured with: ../gcc-12.2.0/configure --prefix=/usr/local
--program-suffix=-12.2 -enable-languages=c,c++,lto --enable-lto
--disable-multilib --with-gnu-ld --enable-threads --verbose
--target=x86_64-slackware-linux --build=x86_64-slackware-linux
--host=x86_64-slackware-linux --enable-tls --with-fpmath=avx
--enable-__cxa_atexit --enable-gnu-indirect-function --enable-bootstrap
--enable-libssp
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 12.2.0 (GCC)

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug libstdc++/110045] std::normal_distribution<float> (and likely others) give wrong min() and max() values
  2023-05-31  0:24 [Bug libstdc++/110045] New: std::normal_distribution<float> (and likely others) give wrong min() and max() values gccbugs at elkpod dot com
@ 2023-05-31  0:53 ` pinskia at gcc dot gnu.org
  2023-05-31  0:58 ` gccbugs at elkpod dot com
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-31  0:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
The heavy weight goes to potentially. The way I understand it is not the max of
what operator() has produced currently but will potentially return in the
future.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug libstdc++/110045] std::normal_distribution<float> (and likely others) give wrong min() and max() values
  2023-05-31  0:24 [Bug libstdc++/110045] New: std::normal_distribution<float> (and likely others) give wrong min() and max() values gccbugs at elkpod dot com
  2023-05-31  0:53 ` [Bug libstdc++/110045] " pinskia at gcc dot gnu.org
@ 2023-05-31  0:58 ` gccbugs at elkpod dot com
  2023-05-31  1:12 ` pinskia at gcc dot gnu.org
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: gccbugs at elkpod dot com @ 2023-05-31  0:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045

--- Comment #2 from Frank J. T. Wojcik <gccbugs at elkpod dot com> ---
(In reply to Andrew Pinski from comment #1)
> The heavy weight goes to potentially. The way I understand it is not the max
> of what operator() has produced currently but will potentially return in the
> future.

And I would agree with that interpretation. My inquiry is based on the fact
that I can find *no* Generator outputs which produce the currently-given max()
value.

The goal of the demo program was to show how I arrived at my "7.985583" value
for what seems to be the actual maximum value, and that I didn't make up some
arbitrary value.

If it helps, imagine the min/max printf() to appear before any generation has
been done, or even without any generation. I would expect the same result,
modulo order of printouts.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug libstdc++/110045] std::normal_distribution<float> (and likely others) give wrong min() and max() values
  2023-05-31  0:24 [Bug libstdc++/110045] New: std::normal_distribution<float> (and likely others) give wrong min() and max() values gccbugs at elkpod dot com
  2023-05-31  0:53 ` [Bug libstdc++/110045] " pinskia at gcc dot gnu.org
  2023-05-31  0:58 ` gccbugs at elkpod dot com
@ 2023-05-31  1:12 ` pinskia at gcc dot gnu.org
  2023-05-31  1:19 ` gccbugs at elkpod dot com
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-31  1:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
With 24bit precision, maybe it is ~8 standard deviations away from the mean.
But the generator argument can change for each call though so that does not
mean the next call to operator() could produce one with more bits ...

Also the standard says: "as determined by the current values of d's parameters"

The parameters is only mean and standard deviations and not the generator.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug libstdc++/110045] std::normal_distribution<float> (and likely others) give wrong min() and max() values
  2023-05-31  0:24 [Bug libstdc++/110045] New: std::normal_distribution<float> (and likely others) give wrong min() and max() values gccbugs at elkpod dot com
                   ` (2 preceding siblings ...)
  2023-05-31  1:12 ` pinskia at gcc dot gnu.org
@ 2023-05-31  1:19 ` gccbugs at elkpod dot com
  2023-05-31  1:41 ` pinskia at gcc dot gnu.org
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: gccbugs at elkpod dot com @ 2023-05-31  1:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045

--- Comment #4 from Frank J. T. Wojcik <gccbugs at elkpod dot com> ---
(In reply to Andrew Pinski from comment #3)
> With 24bit precision, maybe it is ~8 standard deviations away from the mean.
> But the generator argument can change for each call though so that does not
> mean the next call to operator() could produce one with more bits ...
> 
> Also the standard says: "as determined by the current values of d's
> parameters"
> 
> The parameters is only mean and standard deviations and not the generator.

I would agree with all of this also, I think. :)

But can you or someone demonstrate *any* generator which produces (e.g.) the
current value of max() for std::normal_distribution<float> {0, 1}? I can find
no generator implementation which does that, and by my reading of the
implementation there cannot be one.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug libstdc++/110045] std::normal_distribution<float> (and likely others) give wrong min() and max() values
  2023-05-31  0:24 [Bug libstdc++/110045] New: std::normal_distribution<float> (and likely others) give wrong min() and max() values gccbugs at elkpod dot com
                   ` (3 preceding siblings ...)
  2023-05-31  1:19 ` gccbugs at elkpod dot com
@ 2023-05-31  1:41 ` pinskia at gcc dot gnu.org
  2023-05-31  2:02 ` gccbugs at elkpod dot com
  2023-05-31  2:14 ` gccbugs at elkpod dot com
  6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-31  1:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
2 things, I think your result_type in SimpleGen needs to be public.


Second is LLVM's libc++ outputs:
-inf -inf
inf inf

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug libstdc++/110045] std::normal_distribution<float> (and likely others) give wrong min() and max() values
  2023-05-31  0:24 [Bug libstdc++/110045] New: std::normal_distribution<float> (and likely others) give wrong min() and max() values gccbugs at elkpod dot com
                   ` (4 preceding siblings ...)
  2023-05-31  1:41 ` pinskia at gcc dot gnu.org
@ 2023-05-31  2:02 ` gccbugs at elkpod dot com
  2023-05-31  2:14 ` gccbugs at elkpod dot com
  6 siblings, 0 replies; 8+ messages in thread
From: gccbugs at elkpod dot com @ 2023-05-31  2:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045

--- Comment #6 from Frank J. T. Wojcik <gccbugs at elkpod dot com> ---
(In reply to Andrew Pinski from comment #5)
> 2 things, I think your result_type in SimpleGen needs to be public.

Sure, I'll change that.

> Second is LLVM's libc++ outputs:
> -inf -inf
> inf inf

I only have clang 6.0 on my system and its outputs are identical to g++'s. And
I do not have access to MSVC, but I believe it also produces similar values.
These data points are why I bring up the possibility that there is some detail
in the specification that I'm missing.

Still, the behavior I'm looking for is useful, and it does seem to be what the
spec is requiring, so I would like a fix if appropriate.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug libstdc++/110045] std::normal_distribution<float> (and likely others) give wrong min() and max() values
  2023-05-31  0:24 [Bug libstdc++/110045] New: std::normal_distribution<float> (and likely others) give wrong min() and max() values gccbugs at elkpod dot com
                   ` (5 preceding siblings ...)
  2023-05-31  2:02 ` gccbugs at elkpod dot com
@ 2023-05-31  2:14 ` gccbugs at elkpod dot com
  6 siblings, 0 replies; 8+ messages in thread
From: gccbugs at elkpod dot com @ 2023-05-31  2:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045

--- Comment #7 from Frank J. T. Wojcik <gccbugs at elkpod dot com> ---
After playing with this a bit more, I was able to find a Generator which
produces a slightly wider bound, but still nowhere near 0x1.fffffep+127.
This also includes the requested change to SimpleGen.

#include <cstdio>
#include <cstdint>
#include <random>

#define MAXOFFSET 3

// Return 25-bit values centered around 0x1000000 (the halfway-point of all
// possible 25-bit outputs). Widening the Generator does not alter the
// program's output.
class SimpleGen {
 public:
    using result_type = uint32_t;
    result_type val, offset, ctr = 0;
    static constexpr result_type min() { return 0; }
    static constexpr result_type max() { return 0x1ffffff; }
    result_type operator()()           { offset = (ctr & 1) ?
                                            ((ctr / 2) / (2 * MAXOFFSET)) :
                                            ((ctr / 2) % (2 * MAXOFFSET));
                                         val = 0x1000000 + offset - MAXOFFSET;
                                         printf("\tG 0x%07x\n", val);
                                         ++ctr; return val; }
};

int main(void) {
    SimpleGen gen;
    std::normal_distribution<float> norm {0, 1};

    printf("min: %+f %a\nmax: %+f %a\n\n", norm.min(), norm.min(),
        norm.max(), norm.max());

    for (int i = 0; i < ((2 * MAXOFFSET) * (2 * MAXOFFSET - 1) * 2) ; i++) {
        float r = norm(gen);
        printf("%d %f %a\n", i, r, r);
    }
}

Build output:
$ g++-12.2 -Wall -Wextra -o normdist2 normdist2.cpp -fno-strict-aliasing
-fwrapv -fno-aggressive-loop-optimizations -fsanitize=undefined
$ echo $?
0

Actual outputs (excerpts only!):
min: -340282346638528859811704183484516925440.000000 -0x1.fffffep+127
max: +340282346638528859811704183484516925440.000000 0x1.fffffep+127

        G 0x1000000
        G 0x0ffffff
30 -8.157336 -0x1.0508e6p+3
31 0.000000 0x0p+0

        G 0x1000001
        G 0x0ffffff
32 -8.157336 -0x1.0508e6p+3
33 0.000000 0x0p+0

        G 0x0ffffff
        G 0x1000000
40 0.000000 0x0p+0
41 -8.157336 -0x1.0508e6p+3

        G 0x1000002
        G 0x1000000
42 0.000000 0x0p+0
43 7.985583 0x1.ff13ccp+2

        G 0x0ffffff
        G 0x1000001
48 0.000000 0x0p+0
49 -8.157336 -0x1.0508e6p+3

        G 0x1000002
        G 0x1000001
50 0.000000 0x0p+0
51 7.985583 0x1.ff13ccp+2

        G 0x1000000
        G 0x1000002
58 7.985583 0x1.ff13ccp+2
59 0.000000 0x0p+0

Expected outputs (excerpt only!):
min: -8.157336 -0x1.0508e6p+3
max: +7.985583 0x1.ff13ccp+2

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-05-31  2:14 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-31  0:24 [Bug libstdc++/110045] New: std::normal_distribution<float> (and likely others) give wrong min() and max() values gccbugs at elkpod dot com
2023-05-31  0:53 ` [Bug libstdc++/110045] " pinskia at gcc dot gnu.org
2023-05-31  0:58 ` gccbugs at elkpod dot com
2023-05-31  1:12 ` pinskia at gcc dot gnu.org
2023-05-31  1:19 ` gccbugs at elkpod dot com
2023-05-31  1:41 ` pinskia at gcc dot gnu.org
2023-05-31  2:02 ` gccbugs at elkpod dot com
2023-05-31  2:14 ` gccbugs at elkpod dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).