public inbox for gcc-bugs@sourceware.org
* [Bug d/109221] New: std.math.floor, core.math.ldexp, std.math.poly poor inlining
@ 2023-03-21 1:09 witold.baryluk+gcc at gmail dot com
2023-03-21 1:15 ` [Bug d/109221] " witold.baryluk+gcc at gmail dot com
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: witold.baryluk+gcc at gmail dot com @ 2023-03-21 1:09 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109221
Bug ID: 109221
Summary: std.math.floor, core.math.ldexp, std.math.poly poor
inlining
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: d
Assignee: ibuclaw at gdcproject dot org
Reporter: witold.baryluk+gcc at gmail dot com
Target Milestone: ---
Example:
static float sRGB_case4(float x) {
// import std.math : exp;
return 1.055f * expImpl(x) - 0.055f; // expImpl not inlined by default
// (inlined when using pragma(inline, true), but that fails to inline in DMD)
}
// pragma(inline, true)
// This is borrowed from phobos/exponential.d to help gcc inline it fully.
// Only T == float case is here (as some traits are private to phobos).
// Also isNaN and range checks are removed, as sRGB performs own checks.
static private T expImpl(T)(T x) @safe pure nothrow @nogc
{
//import std.math : floatTraits, RealFormat;
//import std.math.traits : isNaN;
//import std.math.rounding : floor;
//import std.math.algebraic : poly;
//import std.math.constants : LOG2E;
import std.math;
import core.math;
static immutable T[6] P = [
5.0000001201E-1,
1.6666665459E-1,
4.1665795894E-2,
8.3334519073E-3,
1.3981999507E-3,
1.9875691500E-4,
];
enum T C1 = 0.693359375;
enum T C2 = -2.12194440e-4;
// Overflow and Underflow limits.
enum T OF = 88.72283905206835;
enum T UF = -103.278929903431851103; // ln(2^-149)
// Special cases.
//if (isNaN(x))
// return x;
//if (x > OF)
// return real.infinity;
//if (x < UF)
// return 0.0;
// Express: e^^x = e^^g * 2^^n
// = e^^g * e^^(n * LN2)
// = e^^(g + n * LN2)
T xx = floor((cast(T) LOG2E) * x + cast(T) 0.5); // NOT INLINED!
const int n = cast(int) xx;
x -= xx * C1;
x -= xx * C2;
xx = x * x;
x = poly(x, P) * xx + x + 1.0f; // poly is generated optimally, but not inlined
// Scale by power of 2.
x = core.math.ldexp(x, n); // NOT INLINED
return x;
}
gdc (Compiler-Explorer-Build-gcc-454a4d5041f53cd1f7d902f6c0017b7ce95b36df-binutils-2.38) 13.0.1 20230318 (experimental)
gdc -O3 -march=znver2 -frelease -fbounds-check=off
pure nothrow @nogc @safe float std.math.algebraic.poly!(float, float, 6).poly(float, ref const(float[6])):
        vmovss xmm1, DWORD PTR [rdi+20]
        vfmadd213ss xmm1, xmm0, DWORD PTR [rdi+16]
        vfmadd213ss xmm1, xmm0, DWORD PTR [rdi+12]
        vfmadd213ss xmm1, xmm0, DWORD PTR [rdi+8]
        vfmadd213ss xmm1, xmm0, DWORD PTR [rdi+4]
        vfmadd213ss xmm0, xmm1, DWORD PTR [rdi]
        ret
pure nothrow @nogc @safe float example.expImpl!(float).expImpl(float):
        push rbx
        vmovaps xmm1, xmm0
        sub rsp, 16
        vmovss xmm0, DWORD PTR .LC0[rip]
        vfmadd213ss xmm0, xmm1, DWORD PTR .LC1[rip]
        vmovss DWORD PTR [rsp+8], xmm1
        call pure nothrow @nogc @trusted float std.math.rounding.floor(float)
        vmovss xmm1, DWORD PTR [rsp+8]
        mov edi, OFFSET FLAT:immutable(float[6]) example.expImpl!(float).expImpl(float).P
        vfnmadd231ss xmm1, xmm0, DWORD PTR .LC2[rip]
        vmovss DWORD PTR [rsp+12], xmm0
        vfnmadd231ss xmm1, xmm0, DWORD PTR .LC3[rip]
        vmulss xmm3, xmm1, xmm1
        vmovaps xmm0, xmm1
        vmovss DWORD PTR [rsp+8], xmm1
        vmovd ebx, xmm3
        call pure nothrow @nogc @safe float std.math.algebraic.poly!(float, float, 6).poly(float, ref const(float[6]))
        vmovss xmm1, DWORD PTR [rsp+8]
        vmovd xmm4, ebx
        vmovss xmm2, DWORD PTR [rsp+12]
        vfmadd132ss xmm0, xmm1, xmm4
        vaddss xmm0, xmm0, DWORD PTR .LC4[rip]
        add rsp, 16
        pop rbx
        vcvttss2si edi, xmm2
        jmp ldexpf
float example.sRGB_case4(float):
        sub rsp, 8
        call pure nothrow @nogc @safe float example.expImpl!(float).expImpl(float)
        vmovss xmm1, DWORD PTR .LC6[rip]
        vfmadd132ss xmm0, xmm1, DWORD PTR .LC5[rip]
        add rsp, 8
        ret
https://godbolt.org/z/YMoMPdjn5
Additionally, std.math.exp itself is never inlined by GCC. This matters because
some of its early checks (the isNaN, OF and UF checks) could be eliminated once
it is inlined into a caller that already performs them.
* [Bug d/109221] std.math.floor, core.math.ldexp, std.math.poly poor inlining
From: witold.baryluk+gcc at gmail dot com @ 2023-03-21 1:15 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109221
--- Comment #1 from Witold Baryluk <witold.baryluk+gcc at gmail dot com> ---
PS. LDC 1.23.0 through 1.32.0 produce optimal code. LDC 1.22.0 is a bit worse
(due to its use of x87 codegen), and 1.21 and older fail to inline `ldexp` but
still inline `poly` and `floor` perfectly.
* [Bug d/109221] std.math.floor, core.math.ldexp, std.math.poly poor inlining
From: witold.baryluk+gcc at gmail dot com @ 2023-03-21 1:18 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109221
--- Comment #2 from Witold Baryluk <witold.baryluk+gcc at gmail dot com> ---
Interestingly enough, GDC 10.2 does inline the `poly` instantiation with all
the constants.
* [Bug d/109221] std.math.floor, core.math.ldexp, std.math.poly poor inlining
From: pinskia at gcc dot gnu.org @ 2023-03-21 1:26 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109221
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Even for C/C++, GCC does not inline:
float f(float a, int b)
{
return __builtin_ldexpf(a,b);
}
So ...
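As background, ldexpf(a, b) computes a * 2^b, but per the C standard it also has to cope with results that overflow or become subnormal. The sketch below shows only the easy case, to illustrate what an inline expansion could look like when those cases are known not to occur. The helper name scale_pow2_normal is hypothetical, and it assumes IEEE-754 binary32 floats; it is not what GCC emits.

```c
#include <stdint.h>
#include <string.h>

/* Multiply a by 2^b by building 2^b directly from its binary32 bit pattern.
   Assumption: -126 <= b <= 127 and the product neither overflows nor
   becomes subnormal. ldexpf has to handle those cases as well. */
static float scale_pow2_normal(float a, int b)
{
    uint32_t bits = (uint32_t)(b + 127) << 23; /* biased exponent, zero mantissa */
    float pow2;
    memcpy(&pow2, &bits, sizeof pow2);         /* well-defined type pun */
    return a * pow2;
}
```

For example, scale_pow2_normal(1.5f, 4) multiplies 1.5 by 16. Outside the stated range the result differs from ldexpf, which is one reason a general-purpose inline expansion is less trivial than it looks.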
* [Bug d/109221] std.math.floor, core.math.ldexp, std.math.poly poor inlining
From: pinskia at gcc dot gnu.org @ 2023-03-21 1:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109221
--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
With -ffast-math -mfpmath=387,sse added (or -mavx512f instead of -mfpmath=387,
as there is an AVX512F instruction for this too), ldexp gets inlined.
* [Bug d/109221] std.math.floor, core.math.ldexp, std.math.poly poor inlining
From: crazylht at gmail dot com @ 2023-03-21 7:19 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109221
Hongtao.liu <crazylht at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |crazylht at gmail dot com
--- Comment #5 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Andrew Pinski from comment #4)
> With -ffast-math -mfpmath=387,sse (or -mavx512f instead of -mfpmath=387 as
> there is a avx512f instruction too) added, ldexp gets inlined.
Note that vscalefss takes a float32 operand for the exponent, whereas ldexpf
takes an int32. Converting an int32 to float32 may lose precision, which is
why -Ofast is needed here.
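The precision concern can be seen with a small C check: a float32 has a 24-bit significand, so not every int32 value converts exactly. The helper name int32_survives_float is hypothetical and is here only to illustrate the conversion, not the compiler's internal logic.

```c
#include <stdint.h>

/* Returns 1 if n is exactly representable as a float32, 0 otherwise. */
static int int32_survives_float(int32_t n)
{
    float f = (float)n;
    /* Compare in double, which represents every int32 and every float32
       exactly; this avoids casting an out-of-range float back to int32. */
    return (double)f == (double)n;
}
```

Values up to 2^24 in magnitude convert exactly; 2^24 + 1 = 16777217 is the first positive integer that does not.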