public inbox for gcc-bugs@sourceware.org
* [Bug fortran/104535] New: don't use fmod?
From: fx at gnu dot org @ 2022-02-14 20:16 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104535
Bug ID: 104535
Summary: don't use fmod?
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: fortran
Assignee: unassigned at gcc dot gnu.org
Reporter: fx at gnu dot org
Target Milestone: ---
I was reminded by comments on the report I made about poor fmod performance on
x86 that I should have commented on the original observation.
I'd looked at one of the Polyhedron benchmarks which suffers badly from a
simple random number routine that calls DMOD. That gets compiled to fmod,
which is only inlined, albeit poorly on x86, with the relevant component(s) of
-ffast-math. It seems to me that MOD should compile to the arithmetical
expression in the standard, which doesn't have the complication of having to
treat errors. (When I defined DMOD as a statement function in that routine,
I got performance much closer to ifort. I should have kept the profiles I
compared, but could regenerate them.)
Is there a good reason not to do that (and maybe similarly with other
intrinsics I haven't checked)? I could probably have a go at implementing it
if appropriate, though I don't know my way around now.
* [Bug fortran/104535] don't use fmod?
From: rguenth at gcc dot gnu.org @ 2022-02-15 8:04 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104535
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Severity|normal |enhancement
Version|unknown |12.0
* [Bug fortran/104535] don't use fmod?
From: kargl at gcc dot gnu.org @ 2022-02-17 19:02 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104535
kargl at gcc dot gnu.org changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |kargl at gcc dot gnu.org
--- Comment #1 from kargl at gcc dot gnu.org ---
This should be closed with WONTFIX. gcc/gfortran already provides at
least three options to get the requested behavior; namely, -ffast-math,
-Ofast, and -funsafe-math-optimizations. In addition, gfortran maps mod()
to __builtin_fmod() for REAL arguments. It is trivial to show that gfortran
compiled Fortran code matches gcc compiled C code.
% cat u.f90
function mymod(x,y)
real mymod
real, intent(in), value :: x, y
mymod = mod(x,y)
end function mymod
% cat u.c
#include <math.h>
float
mymod(float x, float y)
{
return (fmodf(x,y));
}
% gfortran -O3 -S u.f90
% cat u.s
...
.cfi_startproc
jmp fmodf
.cfi_endproc
...
% gcc -O3 -S u.c
% cat u.s
...
.cfi_startproc
jmp fmodf
.cfi_endproc
...
% gfortran -O3 -S -ffast-math u.f90
% cat u.s
...
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movss %xmm0, -4(%rbp)
movss %xmm1, -8(%rbp)
flds -8(%rbp)
flds -4(%rbp)
.L2:
fprem
fnstsw %ax
testb $4, %ah
jne .L2
fstp %st(1)
fstps -4(%rbp)
movss -4(%rbp), %xmm0
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
...
% gcc -O3 -S -ffast-math u.c
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movss %xmm0, -4(%rbp)
movss %xmm1, -8(%rbp)
flds -8(%rbp)
flds -4(%rbp)
.L2:
fprem
fnstsw %ax
testb $4, %ah
jne .L2
fstp %st(1)
fstps -4(%rbp)
movss -4(%rbp), %xmm0
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
Now, let's look at trans-intrinsic.cc. One finds the following comment:
/* Remainder function MOD(A, P) = A - INT(A / P) * P
                      MODULO(A, P) = A - FLOOR (A / P) * P

   The obvious algorithms above are numerically instable for large
   arguments, hence these intrinsics are instead implemented via calls
   to the fmod family of functions.  It is the responsibility of the
   user to ensure that the second argument is non-zero.  */
In infinite precision arithmetic the above equations could use
a straightforward naive in-lined implementation. Unfortunately,
computers tend to use finite precision arithmetic, so numerical
algorithms must be appropriately considered (See Goldberg).